FAIR workflow and services

Learn about the FAIRification workflow and use our services to make your data FAIR.

FAIR is not a matter of “to be or not to be”. It comprises a considerable number of criteria, each of which can be fulfilled to some degree. Accordingly, in an assessment of more than 80 energy databases, the EERAdata project found substantial variance in FAIR status (Schwanitz et al. 2022). To improve the FAIR status of your energy data, you can follow the workflow presented in this section step by step (Pre-FAIRification, FAIRification, Post-FAIRification). The workflow is inspired by Jacobsen et al. (2020) and Vasiljevic & Graybeal (2021) and has been adapted to the needs of the EERAdata Use Cases and the wider low-carbon energy research community.

 

You may start this workflow at the beginning, with Pre-FAIRification, or jump in directly at the step that concerns you most.

This first step requires access to the data, the necessary resources (time and money) and a realistic FAIRification plan. It also requires a general understanding of the dataset and familiarity with the FAIR data principles.

Objectives for FAIRification could be to increase the efficiency of using data from multiple sources, to meet specific requirements of publishers or funders, or to provide high-quality data services for other users. It is highly recommended to do this in a team with both domain and data modelling expertise, and to allocate corresponding resources. Moreover, focusing on one part of your dataset at a time makes FAIRification more efficient.

The FAIR data principles comprise a set of criteria which, in practice, are usually fulfilled to a greater or lesser extent. Our recent study revealed that many of the tested databases completely fail to comply with specific FAIR criteria (see figure below from Schwanitz et al. 2022):

Select one or more criteria and define your FAIRification objective for starting this workflow.

  1. F1: (Meta)data are assigned globally unique and persistent identifiers
  2. F2: Data are described with rich metadata
  3. F3: Metadata clearly and explicitly include the identifier of the data they describe
  4. F4: (Meta)data are registered or indexed in a searchable resource
  5. A1: (Meta)data are retrievable by their identifier using a standardised communication protocol
  6. A1.1: The protocol is open, free and universally implementable
  7. A1.2: The protocol allows for an authentication and authorisation procedure where necessary
  8. A2: Metadata should be accessible even when the data is no longer available
  9. I1: (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation
  10. I2: (Meta)data use vocabularies that follow the FAIR data principles
  11. I3: (Meta)data include qualified references to other (meta)data
  12. R1: (Meta)data are richly described with a plurality of accurate and relevant attributes
  13. R1.1: (Meta)data are released with a clear and accessible data usage license
  14. R1.2: (Meta)data are associated with detailed provenance
  15. R1.3: (Meta)data meet domain-relevant community standards

Further comments on FAIR data principles can be found on the EERAdata wiki.


FAQ:

  • What is metadata? Generally speaking, metadata is data that provides information about other data. Metadata is therefore at the core of making data FAIR. Look up definitions and types of metadata.

The second step consists of assessing your dataset with respect to the FAIR criteria given in step 1. In this way, the FAIR gaps can be identified and subsequent FAIRification steps prioritized (see, for instance, Sempreviva et al. 2017, Wilkinson et al. 2018). Both manual and machine assessment tools are available for this, some of which have been tested on a large number of energy-related databases within the EERAdata project (Schwanitz et al. 2022).

Manual assessment tools: DANS Self-Assessment Tool, ARDC FAIR Self-Assessment Tool

Machine assessment tools: FAIRsharing Evaluation Services, F-UJI Automated FAIR Data Assessment Tool

See examples for analyzing (meta)data: Go to Use Cases
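The gap identification and prioritization described above can be sketched in a few lines of Python. This is only an illustration, not the method used by any of the listed tools: the 0 to 2 scoring scale, the `CriterionResult` type, the function names and the example scores are all assumptions made for this sketch.

```python
from dataclasses import dataclass

# Hypothetical scoring scale: 0 = not met, 1 = partially met, 2 = fully met.
MAX_SCORE = 2


@dataclass
class CriterionResult:
    criterion: str  # FAIR criterion label, e.g. "F1" or "R1.1"
    score: int      # assessed level on the hypothetical 0..2 scale


def fair_gaps(results):
    """Return the criteria that are not fully met, least fulfilled first."""
    gaps = [r for r in results if r.score < MAX_SCORE]
    return sorted(gaps, key=lambda r: (r.score, r.criterion))


def principle_summary(results):
    """Return the fulfilment ratio (0.0 to 1.0) per principle letter."""
    totals = {}
    for r in results:
        letter = r.criterion[0]  # "F", "A", "I" or "R"
        achieved, possible = totals.get(letter, (0, 0))
        totals[letter] = (achieved + r.score, possible + MAX_SCORE)
    return {letter: a / p for letter, (a, p) in totals.items()}


# Made-up assessment of a single dataset, for illustration only.
assessment = [
    CriterionResult("F1", 0),
    CriterionResult("F2", 1),
    CriterionResult("A1", 2),
    CriterionResult("I1", 1),
    CriterionResult("R1.1", 0),
]

for gap in fair_gaps(assessment):
    print(gap.criterion, gap.score)
print(principle_summary(assessment))
```

Sorting by score puts the criteria that are not met at all at the top of the list, which is one simple way to decide where to start.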



The third step is to assign a set of metadata to your data that describes its content. To improve findability and interoperability, you are advised to use existing metadata standards. The EERAdata Community Platform fosters the use of the Dublin Core Metadata Terms.

Using our metadata creator, you are able to: 

1. Assign standardised metadata to your data by selecting a suitable standard (Dublin Core in this example)

2. Automatically create a machine-readable metadata file and download it, so that you can add it to your data resource (in JSON format in this example).

→ Access the Metadata Creator
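To give an idea of what such a machine-readable metadata file can look like, here is a minimal Python sketch that builds a Dublin Core record and serialises it as JSON. It does not reproduce the output format of the Metadata Creator: the choice of required terms, the function name `to_json` and all values are assumptions for illustration.

```python
import json

# Terms taken from the Dublin Core Metadata Terms; the values are placeholders.
record = {
    "title": "Example wind measurement dataset",
    "creator": "Jane Doe",
    "description": "Placeholder description of the dataset content.",
    "date": "2022-01-01",
    "format": "text/csv",
    "identifier": "https://example.org/dataset/123",
    "language": "en",
    "subject": ["wind energy", "measurements"],
}

# Hypothetical minimum set of terms for this sketch, not a platform requirement.
REQUIRED_TERMS = ("title", "creator", "identifier")


def to_json(metadata):
    """Check the hypothetical required terms and return a JSON string."""
    missing = [term for term in REQUIRED_TERMS if term not in metadata]
    if missing:
        raise ValueError(f"missing metadata terms: {missing}")
    return json.dumps(metadata, indent=2, ensure_ascii=False)


print(to_json(record))
```

The resulting string can be saved next to the data files, for example as `metadata.json` (a file name chosen here only as an example).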


FAQ:

  • What are metadata standards and why do we need them? A metadata standard is a specification intended to establish a common understanding of the meaning (semantics) of data from different sources, so that owners and users can use and interpret the data correctly. Whether or not your metadata already adhere to a domain-specific standard, you may want to map your metadata to a common standard. Consider our list of energy-relevant metadata standards.
  • Which metadata standard does the EERAdata Community Platform use? Starting with administrative and legal metadata, we adopt the specifications that are developed and maintained by the Dublin Core Metadata Initiative.

The fourth step is to make your data findable through a semantic search. This involves two aspects: first, your rich metadata has to adhere to a common metadata standard; second, it must be available in a machine-readable format such as JSON-LD or RDF. The EERAdata Community Platform offers both services through its Metadata Mapper.

With our Metadata Mapper you are able to:

1. Map your metadata to the standard used on the EERAdata Community Platform. If your (non-standardised) metadata terms are already given in a standard file format, you may upload this file and map your terms to the standard terms provided on the platform.

2. Automatically create a standardised metadata file and download it in a machine-readable format, to make it open.

→ Access the Metadata Mapper 
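The two aspects of this step, mapping local terms to a standard and emitting a machine-readable file, can be sketched as follows. This is not the implementation of the Metadata Mapper: the mapping table `TERM_MAP`, the function `to_jsonld` and the example values are hypothetical. Only the Dublin Core namespace and term names are taken from the standard.

```python
import json

# Namespace of the Dublin Core Metadata Terms.
DCTERMS = "http://purl.org/dc/terms/"

# Hypothetical mapping from local (non-standardised) field names
# to Dublin Core terms.
TERM_MAP = {
    "name": "title",
    "author": "creator",
    "summary": "description",
    "published": "issued",
}


def to_jsonld(local_metadata, dataset_iri):
    """Map local fields to Dublin Core and return (JSON-LD dict, unmapped keys)."""
    doc = {"@context": {"dcterms": DCTERMS}, "@id": dataset_iri}
    unmapped = []
    for key, value in local_metadata.items():
        term = TERM_MAP.get(key)
        if term is None:
            unmapped.append(key)  # left for manual mapping
            continue
        doc[f"dcterms:{term}"] = value
    return doc, unmapped


# Placeholder input, for illustration only.
local = {
    "name": "Example wind measurement dataset",
    "author": "Jane Doe",
    "published": "2022-01-01",
    "sensor_height_m": 80,
}

doc, unmapped = to_jsonld(local, "https://example.org/dataset/123")
print(json.dumps(doc, indent=2))
print("not mapped:", unmapped)
```

Returning the unmapped keys separately, rather than dropping them silently, makes it visible which local terms still need a counterpart in the standard.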


FAQ:

What makes my metadata linkable?

The fifth step is to clarify under which conditions and to what extent the data can be re-used. This is done by specifying a suitable licence in your metadata file. In the absence of a licence, the author retains full copyright, and the conditions under which the data can be used remain unclear.

Standard licences provide pre-defined sets of conditions for both providers and users. Which licences are most common for a given artefact depends on its type: data, code, documentation, or other generic digital “creative work” (reports and figures). Special copyright rules may apply to databases, which, under European law, are protected in their own right, irrespective of the status of the data they contain.

The most widely used data licences are the suite of Creative Commons (CC) copyright licences, which clearly describe how data can and cannot be reused. The CC licences are irrevocable: once you receive material under a CC licence, you will always have the right to use it under those licence terms, even if the licensor later changes their mind and stops distributing under the CC licence terms. Of course, you may choose to respect the licensor’s wishes and stop using the work.

A scientific dataset, which other researchers may build upon or which is published together with a scientific article, is usually published under the CC-BY licence.

More on data licences can be found on HOW TO FAIR and openmod.
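Specifying a licence in a metadata file can be as simple as adding one field. The sketch below assumes the metadata is held as a JSON-LD style dictionary with a `dcterms` prefix. The helper name `add_licence` and the example values are hypothetical, while `dcterms:license` and `dcterms:rightsHolder` are existing Dublin Core terms.

```python
# Canonical URL of the Creative Commons Attribution 4.0 International licence.
CC_BY_4_0 = "https://creativecommons.org/licenses/by/4.0/"


def add_licence(metadata, licence_url=CC_BY_4_0, rights_holder=None):
    """Return a copy of the metadata with a licence reference added."""
    updated = dict(metadata)  # leave the caller's dictionary untouched
    # Referencing the licence by URL keeps it machine-readable.
    updated["dcterms:license"] = {"@id": licence_url}
    if rights_holder is not None:
        updated["dcterms:rightsHolder"] = rights_holder
    return updated


# Placeholder metadata, for illustration only.
metadata = {"dcterms:title": "Example wind measurement dataset"}
licensed = add_licence(metadata, rights_holder="Example Institute")
print(licensed)
```

Pointing to the licence by its URL rather than by a free-text name such as "CC-BY" avoids ambiguity about the licence version.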

*** UPCOMING EVENT ***

3-5 October 2022, EERAdata Workshop 5

Sustainable models for FAIR and open low carbon energy research data: Business models, licensing and certification

 



The sixth step is to publish the metadata (and the data, if desired). An increasingly popular way to deploy and host FAIR data is to join a suitable open data initiative (e.g., EOSC), or to upload your (meta)data to an open repository like Zenodo (public) or Dataverse (private).

More to come…



The seventh and last step comprises a reconsideration of the FAIR data principles to validate or further improve the FAIR status of the (meta)data. If necessary, the FAIRification workflow can be restarted.

Manual assessment tools: DANS Self-Assessment Tool, ARDC FAIR Self-Assessment Tool

Machine assessment tools: FAIRsharing Evaluation Services, F-UJI Automated FAIR Data Assessment Tool

Extended tools list…

