JSON-LD for FAIRification

William C. Oltjen, Xuanji Yu, Laura S. Bruckman, Roger H. French

2023-08-09

Background About This Project as a Whole

FAIR data is data that is Findable, Accessible, Interoperable, and Reusable. Introduced through the publication of Wilkinson et al [1], the FAIR principles guide correct stewardship of data sets. Essentially, FAIR data is data that can be easily shared and explained between different groups. There has been an extensive push throughout the scientific community to make metadata “FAIR” recently, as publishers, science funders, and government agencies have begun to establish requirements for the proper management of metadata. As such, we have been implementing FAIR principles into the ingestion of our data through the use of a standardized Javascript Object Notation for Linked Data (JSON-LD) template to store our metadata. We present the FAIRmaterials R and Python packages as a starting point for FAIRifying metadata in a number of different fields in which we are actively pursuing research topics. The fact that this is a starting point must be emphasized. These templates are by no means a final product, and we are more than open for collaboration to add new keys and frameworks that were missed in the first full release of this package. This is meant to be the start of a community effort pushing for the use of FAIR guidelines in scientific research.

Why a JSON-LD

We chose a JSON-LD format to store our metadata because it is a recommendation by the World Wide Web Consortium (W3C). Led by Tim Berners-Lee who invented the World Wide Web, the W3C is a governing board that proposes standards for the Web such as HTML, XML, CSS, and of course, JSON-LD. A major push by the W3C is the idea of a “Semantic Web.” In 2001, Tim Berners-Lee proposed the idea of a Semantic Web in a sensationalized article published in Scientific American [2]. The Semantic Web was to be different from the World Wide Web in that it was to be readable both by humans and machines. The promises of such a Web were vast. Computers would be able to be agents of themselves, searching the web in an automated way for whatever information they wanted. In order to enable this, data would need to be “linked” together in a massive web. A JSON-LD is a special format of a JSON where these kinds of linkages are made. The basic objects that exist in a regular json document are shown in the figure below.
Types of Objects in a JSON

A JSON-LD uses these basic objects and adds a context to them that links ideas together.

Foundation in an Ontology

The outline that we use to create our JSON-LD files comes from a structure that we create through the formation of an ontology. An ontology is a formal dictionary of terms for a given industry or field that shows how the terms are related through densely interconnected web. Part of the point of doing this is to standardize terms for data by defining how variables should be defined across a given industry. For a photovoltaic system, for example, latitude is to be spelled exactly latitude (not lat, latd, etc), and it is to be measured in degrees. This way, there is no ambiguity. An ontology not only defines terms, but it defines a structure for the metadata as well. An ontology is the blueprint for linking metadata terms together through the creation of a knowledge graph. When an ontology is filled in with real data, it becomes a knowledge graph. For our purposes right now though, the ontology purely exists to provide structure for our metadata. We have designed our ontologies with structures informed by the Basic Formal Ontology (BFO) in order to adhere to standards.

Web Ontology Language (OWL)

The W3C Web Ontology Language (OWL) is a Semantic Web language designed to represent rich and complex knowledge about things, groups of things, and relations between things. OWL is a computational logic-based language such that knowledge expressed in OWL can be exploited by computer programs, e.g., to verify the consistency of that knowledge or to make implicit knowledge explicit. OWL documents, known as ontologies, can be published in the World Wide Web and may refer to or be referred from other OWL ontologies. [3] We save our ontologies as .owl files.

Nomenclature

We proposed the following naming schema to define file names.

To illustrate

References

[1] M. D. Wilkinson, M. Dumontier, I. J. Aalbersberg, G. Appleton, M. Axton, and A. Baak, “The FAIR Guiding Principles for scientific data management and stewardship,” Scientific Data, vol. 3, no. 1, pp.1–9, Mar. 2016

[2] cT. BERNERS-LEE, J. HENDLER, and O. LASSILA, “THE SEMANTIC WEB,” Scientific American, vol. 284, no. 5, pp. 34–43, 2001.

[3] https://www.w3.org/OWL/