FAIR guiding principles

From EERAdata Wiki
Revision as of 07:51, 2 November 2020 by Valerias (talk | contribs)

These principles are outlined in Box 2 of Wilkinson et al. [1]:

To be Findable:

  • F1. (meta)data are assigned a globally unique and persistent identifier
  • F2. data are described with rich metadata (defined by R1 below)
  • F3. metadata clearly and explicitly include the identifier of the data it describes
  • F4. (meta)data are registered or indexed in a searchable resource

To be Accessible:

  • A1. (meta)data are retrievable by their identifier using a standardized communications protocol
  • A1.1 the protocol is open, free, and universally implementable
  • A1.2 the protocol allows for an authentication and authorization procedure, where necessary
  • A2. metadata are accessible, even when the data are no longer available
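
A1 and A2 can be illustrated with a minimal sketch. It assumes a hypothetical in-memory registry: the names Record and Registry, their methods, and the identifier "doi:10.1234/example" are invented for illustration and do not come from any FAIR tooling. The point is only that lookup goes through the identifier (A1) and that withdrawing the data leaves the metadata in place (A2).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Record:
    """One registered resource: metadata plus, optionally, the data itself."""
    identifier: str
    metadata: dict
    data: Optional[bytes] = None


@dataclass
class Registry:
    """Hypothetical in-memory registry keyed by persistent identifier."""
    records: dict = field(default_factory=dict)

    def register(self, record: Record) -> None:
        self.records[record.identifier] = record

    def withdraw_data(self, identifier: str) -> None:
        # A2: drop the data but keep the metadata, flagged as unavailable.
        record = self.records[identifier]
        record.data = None
        record.metadata["available"] = False

    def get_metadata(self, identifier: str) -> dict:
        # A1: retrieval is by identifier only.
        return self.records[identifier].metadata

    def get_data(self, identifier: str) -> Optional[bytes]:
        return self.records[identifier].data


# Made-up identifier and content, for illustration only.
registry = Registry()
registry.register(
    Record("doi:10.1234/example", {"title": "Example dataset", "available": True}, b"1,2,3")
)
registry.withdraw_data("doi:10.1234/example")

print(registry.get_data("doi:10.1234/example"))      # data are gone
print(registry.get_metadata("doi:10.1234/example"))  # metadata remain
```

A real repository would expose the same two lookups over an open protocol such as HTTP rather than as method calls.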

To be Interoperable:

  • I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation
  • I2. (meta)data use vocabularies that follow FAIR principles
  • I3. (meta)data include qualified references to other (meta)data

To be Reusable:

  • R1. (meta)data are richly described with a plurality of accurate and relevant attributes
  • R1.1. (meta)data are released with a clear and accessible data usage license
  • R1.2. (meta)data are associated with detailed provenance
  • R1.3. (meta)data meet domain-relevant community standards
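
The three sub-principles of R1 lend themselves to a simple completeness check. The sketch below is one possible reading, not a prescribed test: the field names "license", "provenance" and "conforms_to" are assumptions, and a real community standard (R1.3) would define its own.

```python
# Hypothetical mapping from metadata field to the sub-principle it supports.
REQUIRED_FOR_REUSE = {
    "license": "R1.1",
    "provenance": "R1.2",
    "conforms_to": "R1.3",
}


def missing_reuse_fields(metadata: dict) -> list:
    """Return the sub-principles whose field is absent or empty, in R1.x order."""
    return [
        principle
        for field_name, principle in REQUIRED_FOR_REUSE.items()
        if not metadata.get(field_name)
    ]


# Made-up record: licensed, but with empty provenance and no declared standard.
example = {
    "title": "Example dataset",
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "provenance": "",
}
print(missing_reuse_fields(example))
```

A check like this only tests that the fields are present. Whether the license is clear or the provenance detailed enough still needs human or domain-specific judgement.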

To address inconsistent interpretations of these principles, Jacobsen et al. [2] spell out what they imply for possible implementations, starting from the objectives behind the FAIR principles:

Findability: Digital resources should be easy to find for both humans and computers. Extensive machine-actionable metadata are essential for automatic discovery of relevant datasets and services, and are therefore an essential component of the FAIRification process ...

Accessibility: Protocols for retrieving digital resources should be made explicit, for both humans and machines, including well-defined mechanisms to obtain authorization for access to protected data.

Interoperability: When two or more digital resources are related to the same topic or entity, it should be possible for machines to merge the information into a richer, unified view of that entity. Similarly, when a digital entity is capable of being processed by an online service, a machine should be capable of automatically detecting this compliance and facilitating the interaction between the data and that tool. This requires that the meaning (semantics) of each participating resource – be they data and/or services – is clear.

Reusability: Digital resources are sufficiently well described for both humans and computers, such that a machine is capable of deciding: if a digital resource should be reused (i.e., is it relevant to the task at-hand?); if a digital resource can be reused, and under what conditions (i.e., do I fulfill the conditions of reuse?); and who to credit if it is reused.

Jacobsen et al. [2] also give recommendations for implementing each of the 15 principles. A few quotes follow:

For #F1:

A common example of a useful identifier is the Digital Object Identifier (DOI) which is guaranteed by the DOI specification to be globally unique and persistent. DOIs provide an additional service, under principle A1, of being able to direct calls to the source data to the location of that data, even if the identified data moves. This ensures that identifiers are stable and valid beyond the project that generated them. In some circumstances, again with DOIs being an example, third-party persistent identifiers may also provide support for principle A2 (that metadata exists beyond the lifespan of the data) since these identifiers may still be responsive to Web calls, and be capable of providing metadata, even if the source resource is no longer active.
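
The redirection service described in the quote can be sketched with the Python standard library. The helper names doi_url and metadata_request are illustrative. The Accept media type is an assumption: DOI content negotiation is offered by registration agencies such as Crossref and DataCite, and which formats a given DOI returns depends on the agency. The request is only built here, not sent.

```python
from urllib.parse import quote
from urllib.request import Request

DOI_RESOLVER = "https://doi.org/"


def doi_url(doi: str) -> str:
    """Turn a bare DOI into its resolver URL."""
    return DOI_RESOLVER + quote(doi, safe="/")


def metadata_request(doi: str) -> Request:
    """Build (but do not send) a request that asks the resolver for metadata
    instead of the landing page, using HTTP content negotiation."""
    return Request(
        doi_url(doi),
        headers={"Accept": "application/vnd.citationstyles.csl+json"},
    )


# DOI of Wilkinson et al. [1], taken from the notes of this page.
req = metadata_request("10.1038/sdata.2016.18")
print(req.full_url)
print(req.get_header("Accept"))
```

Passing the request to urllib.request.urlopen would follow the resolver's redirect to wherever the resource currently lives. This redirect is what keeps the identifier valid when the data move.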

For #F2:

Whereas principle F1 enables unambiguous identification of resources of interest, principle F2 speaks to the ability to discover a resource of interest through, for example, search or filtering.... It is a challenge for each domain-specific community to define their own metadata descriptors necessary for optimizing findability. The minimal “richness” of the metadata should be defined so that it serves its intended purpose and should also be guided by the requirements of the other FAIR principles.... Examples of metadata schemata can be found in FAIRsharing and include for instance the Data Documentation Initiative (DDI), the HCLS Dataset Descriptors, and many domain-specific “minimal information” models that have been invented.
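
The link between metadata richness and discovery through search or filtering can be shown with a toy example. The records, field names and the find function below are all invented for illustration. They do not follow DDI, the HCLS descriptors or any other schema named in the quote.

```python
def find(records: list, keyword: str = "", **filters) -> list:
    """Return identifiers of records that contain the keyword in any field
    and match every filter exactly."""
    keyword = keyword.lower()
    hits = []
    for record in records:
        text = " ".join(str(value) for value in record.values()).lower()
        if keyword in text and all(record.get(k) == v for k, v in filters.items()):
            hits.append(record["identifier"])
    return hits


# Made-up records with decreasing metadata richness.
records = [
    {"identifier": "ds-1", "title": "Wind turbine blade fatigue tests", "domain": "wind", "year": 2019},
    {"identifier": "ds-2", "title": "Turbine measurements", "domain": "hydro", "year": 2019},
    {"identifier": "ds-3", "title": "Untitled"},  # sparse metadata
]

print(find(records, "turbine"))
print(find(records, "turbine", domain="wind"))
print(find(records, year=2019))
```

The sparsely described record ds-3 appears in none of the three searches. Here a resource can be discovered only through the descriptors its metadata actually contain.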

For #F3:

It is a challenge to each community to choose a machine-actionable metadata model that explicitly links a resource and its metadata. An example of a technology that provides this link is FAIR Data Point, which is based on the Data Catalogue model (DCAT) that provides not only unique identifiers for potentially multiple layers of metadata, but also provides a single, predictable, and searchable path through these layers of descriptors, down to the data object itself.
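
The layered path from catalogue to data object can be sketched with plain nested dictionaries shaped like JSON-LD. The property names (dcat:dataset, dcat:distribution, dcat:downloadURL, dct:title) are DCAT and Dublin Core terms. The example.org identifiers and the data_paths function are assumptions made for illustration. This is not the FAIR Data Point API, whose actual layers and endpoints are not reproduced here.

```python
# Hypothetical catalogue with one dataset and one distribution.
catalog = {
    "@id": "https://example.org/catalog",
    "@type": "dcat:Catalog",
    "dcat:dataset": [
        {
            "@id": "https://example.org/dataset/1",
            "@type": "dcat:Dataset",
            "dct:title": "Example dataset",
            "dcat:distribution": [
                {
                    "@id": "https://example.org/dataset/1/csv",
                    "@type": "dcat:Distribution",
                    "dcat:downloadURL": "https://example.org/files/1.csv",
                }
            ],
        }
    ],
}


def data_paths(catalog: dict):
    """Yield (catalog id, dataset id, distribution id, download URL) for every
    data object reachable from the catalogue."""
    for dataset in catalog.get("dcat:dataset", []):
        for dist in dataset.get("dcat:distribution", []):
            yield (
                catalog["@id"],
                dataset["@id"],
                dist["@id"],
                dist.get("dcat:downloadURL"),
            )


paths = list(data_paths(catalog))
for path in paths:
    print(" -> ".join(path))
```

Every layer carries its own identifier, and each metadata layer names the layer beneath it. This is the explicit link between metadata and data that F3 asks for.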


Notes

  1. Wilkinson et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci. Data 3:160018 (2016) doi: 10.1038/sdata.2016.18.
  2. Jacobsen et al. FAIR Principles: Interpretations and Implementation Considerations. Data Intelligence 2:10-29 (2020) doi: 10.1162/dint_r_00024.