UC3

From EERAdata Wiki
Revision as of 11:53, 4 June 2020 by Guest (talk | contribs) (Add RDA metadata link)

Material solutions for low carbon energy

General description of use case

The European Union has prioritized materials as a Key Enabling Technology (KET) for the transition to a knowledge-based, low-carbon, resource-efficient economy and has proposed a materials roadmap to address the technology agenda of the SET-Plan. With the imperative to change the energy technology mix in response to the challenges of decarbonization and security of energy supply, the need for new materials and processing routes is pressing. New efficient and cost-competitive energy technologies are urgently needed. In this respect, materials research and control over materials resources are becoming increasingly important in the current global competition for industrial leadership in low-carbon technologies. Two EERA Joint Programs (Nuclear Materials and AMPEA) are directly involved in materials research for energy applications, and several other Joint Programs are interested in new materials to improve the efficiency of energy technologies. To speed up the discovery of new materials for energy technologies, Innovation Challenge No. 6 of the Mission Innovation Initiative is devoted to this topic. This Innovation Challenge aims at accelerating the innovation process for high-performance, low-cost clean energy materials and automating the processes needed to integrate these materials into new technologies. The challenge is to combine advanced theoretical and applied physical chemistry and materials science data, as well as data on the life cycle of materials and material compounds, with next-generation computing infrastructures, artificial intelligence, and robotics tools. The goal is to create a fully integrated approach.

Materials research is strongly multidisciplinary, and both converging technologies and cooperation should be exploited to speed up application-oriented research activities. This is not always the case because materials are employed in extremely different technological fields (in the energy sector and beyond). Indeed, each technological field has often developed its own terminology, experimental set-ups, research procedures, and, consequently, its own standards in data management. Therefore, it can be argued that the state of the art for data on materials for energy is the following: openness is low to medium, re-usability is low to medium, and barriers are high. Actions put in place so far are rare (low to medium). The availability of an open data infrastructure covering as many research fields as possible could increase opportunities to develop new research programs, defragment the materials-for-energy communities, avoid the duplication of research activities, speed up the discovery of materials, and increase the understanding of how energy systems could better benefit from existing and new materials. Many databases are already available, but a large amount of data is currently produced in laboratories and is not organized to be shared. Moreover, databases do not follow common formats, which prevents inter-operability and re-usability. In view of the intended automation, it is important that access to databases is machine-actionable. Finally, due to the strong connection with industries, questions of open data access need to be explored and new business models need to be developed.

Database of interest

List of databases related to WP6

Networks

  • EMIRI (The Energy Materials Industrial Research Initiative), Simon Perraud
  • EUMAT (European Technology Platform for Advanced Materials), Amaya Igartua
  • EMMC ASBL (European Materials Modeling Council), Nadja Adamovic & Gerhard Goldbeck
  • MARVEL, Nicola Marzari or Giovanni Pizzi (GoFAIR)
  • Materials Genome (USA), Stefano Curtarolo (Duke)
  • NOMAD (GoFAIR)
  • EERA JPs
    • AMPEA
    • Nuclear Materials
    • Photovoltaics
    • Energy Storage
    • Fuel Cells and Hydrogen
    • Wind

METADATA

Information about metadata:

Innovation in metadata design, implementation & best practice (Tim Austin):

The properties of the DataCite Metadata Schema (Tim Austin):

In summary, we could say:

  • Findable: unique names, human-readable descriptions
  • Accessible: URL, accessible via API
  • Interoperable: typed, extensible schema → ontologies
  • Reusable: hierarchical schema → data-analytics
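
As a rough illustration of the four points above, a metadata record could be modelled along the following lines. This is only a minimal Python sketch: the class `DatasetRecord`, the helper `missing_fair_fields`, all field names, and the example values are hypothetical and are not taken from any of the databases discussed on this page.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    # Findable: a unique name plus a human-readable description.
    identifier: str
    description: str
    # Accessible: a URL from which the data can be retrieved (e.g. via an API).
    access_url: str
    # Interoperable: typed fields; 'schema' names the vocabulary or ontology used.
    schema: str
    # Reusable: hierarchical (nested) properties that analytics tools can traverse.
    properties: dict = field(default_factory=dict)


def missing_fair_fields(record: DatasetRecord) -> list:
    """Return the names of the basic string fields that are empty in the record."""
    required = ("identifier", "description", "access_url", "schema")
    return [name for name in required if not getattr(record, name).strip()]


record = DatasetRecord(
    identifier="example-dataset-001",  # hypothetical identifier
    description="Tensile test results for a hypothetical steel sample",
    access_url="https://example.org/api/datasets/example-dataset-001",
    schema="example-materials-schema",  # hypothetical schema name
    properties={"test": {"temperature": {"value": 23, "unit": "degC"}}},
)
print(missing_fair_fields(record))
```

A check of this kind only confirms that the basic fields are filled in; whether a record is actually FAIR still depends on the criteria discussed in the assessments further down this page.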

From Ghiringhelli talk, an ontology is a formal representation of the knowledge of a community for a purpose, where:

  • formal: machine readable
  • representation: concepts, properties, relations, functions, constraints, axioms are explicitly defined
  • of the knowledge: domain specific
  • of a community: consensual
  • for a purpose: (competency) question driven

List of selected databases

During the first workshop (see notes from Day 2), the following databases were selected to analyze and improve their compliance with FAIR and Open data principles:

Name of database Short description Reasoning of choice Current state of FAIR/O principles Target of FAIR/O to achieve within EERAdata
Database 1 Write a short description, e.g., "Database 1 is about XXX, containing XXX data, covering the period xxx." Summarize shortly the main reasons, why this DB was chosen. Link to the discussion page of WS1UC3. What is the current FAIR/O state for this database. Summarize here. In case more space is needed, link to a section of the discussion page of WS1UC3. What FAIR/O target was decided?
NOMAD Data coming from molecular modeling, with a particular focus on ab-initio calculations. It collects data from other American and European DBs. It belongs to an HPC European Centre of Excellence. There are ongoing projects to improve metadata. The FAIR/O state is advanced and based on their own methodology and models.
MATDB The European Commission JRC ODIN Portal hosts a number of scientific databases, one of which is MatDB, which is a database application designed to store mechanical test data coming from tests performed in accordance with mechanical testing standards. The metadata are organised into categories relevant to materials testing, so that there are main entities for source (i.e. provenance), material (i.e. production, heat treatment, microstructure, etc), specimen, test condition, documents and test result. The access management model supports both open access and restricted access. With a view to FAIR compliance, the database supports data citation (using the DataCite framework) and interoperability standards for test data (hosted at CEN). MatDB supports various features aligned to the FAIR data principles, including data citation e.g. Ruiz, A (2018): Nanoindentation (single cycle) test data for Gr. 91 material at 23 °C and maximum indenter force of 1.00332 mN, version 1.0, European Commission JRC (for findability and access), interoperability standards for mechanical test data and quality assurance workflows for reuse.
Urban Mine Platform This Urban Mine Platform provides all readily available data on products put on the market, stocks, composition and waste flows for electrical and electronic equipment (EEE), vehicles and batteries for all EU 28 Member States plus Switzerland and Norway. The knowledge base is complemented with an extensive library of more than 800 source documents and databases. Platform provides the ability to view the metadata, methodologies, calculation steps and data constraints and limitations. The Platform provides full access to the required data and information. A rich set of metadata. Includes a centralised database containing all readily available data. It is an open dataset that almost meets FAIR criteria. Findable: F.1.; F.2.; F.3. – YES. F.4. – N/A. Accessible: A.1.; A.1.1.; A.1.2. – YES; A.2. – N/A. Interoperable: I.1.; I.2.; I.3. – YES; Reusable: R.1.; R.1.1.; R.1.2.; R.1.3 – YES R1.1. improving the related area in order to obtain a clear and accessible licence to use the data. F.4 and A.2. - the lack of a clear answer suggests a need to improve information in this area. For consideration F1: (Meta) data are assigned globally unique and persistent identifiers – there is just URL, could be DOI?
Photovoltaic Geographical Information System (PVGIS) JRC Ispra: Photovoltaic Geographical Information System (PVGIS) providing free and open access to:
  • PV potential for different technologies and configurations of grid connected and stand alone systems.
  • Solar radiation and temperature, as monthly averages or daily profiles.
  • Full time series of hourly values of both solar radiation and PV performance.
  • Typical Meteorological Year data for nine climatic variables.
  • Maps, by country or region, of solar resource and PV potential ready to print.
  • PVMAPS software includes all the estimation models used in PVGIS.
Crystallography Open Database Open-access database of crystal structures of organic, inorganic and metal–organic compounds and minerals. It is a widely used and rich database for structural information on materials. Findable: F1-YES; F2-YES; F3-YES; F4-YES. Accessible: A1-YES; A1.1-YES; A1.2-N/A; A2-YES. Interoperable: I1-N/A; I2-N/A; I3-N/A; I4-N/A. Reusable: R1-YES; R1.1-N/A; R1.2-YES; R1.3-YES
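
The FAIR/O assessment strings in the table follow a regular pattern, so they can be tallied automatically. The sketch below is a minimal example in Python that assumes the "F1-YES" notation used in the Crystallography Open Database row; the function name `tally` and the regular expression are illustrative choices, and the "F.1.; F.2. – YES" notation used for the Urban Mine Platform is not handled.

```python
import re
from collections import Counter

# Matches entries such as "F1-YES" or "A1.2-N/A" (notation assumed from the table above).
ENTRY = re.compile(r"\b([FAIR])(\d(?:\.\d)?)-(YES|NO|N/A)")


def tally(assessment: str) -> dict:
    """Count YES / NO / N/A answers per FAIR letter."""
    counts = {letter: Counter() for letter in "FAIR"}
    for letter, _number, answer in ENTRY.findall(assessment):
        counts[letter][answer] += 1
    return counts


# Assessment string copied from the Crystallography Open Database row.
cod = (
    "Findable: F1-YES; F2-YES; F3-YES; F4-YES. "
    "Accessible: A1-YES; A1.1-YES; A1.2-N/A; A2-YES. "
    "Interoperable: I1-N/A; I2-N/A; I3-N/A; I4-N/A. "
    "Reusable: R1-YES; R1.1-N/A; R1.2-YES; R1.3-YES"
)
for letter, answers in tally(cod).items():
    print(letter, dict(answers))
```

Such a tally makes it easier to compare databases at a glance, but it does not replace reading the individual criteria.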

Metadata assessments

The databases above were assessed with respect to their current metadata practices. The table below summarizes the current state and the issues identified during WS 1:

Name of database Type of metadata provided Extent of metadata provided Level of implementation of FAIR/O principles Frameworks for metadata used Technical implementation of metadata
Database 1 Which types of metadata are covered? Administrative, descriptive, structural, provenance of data, etc.? Summarize: Is it rich or basic metadata provided for each of the types? Check the Wilkinson criteria for metadata and summarize here. In case more space is needed, link to a section of WS1UC3. What framework is used, e.g., controlled vocabulary, taxonomy, thesaurus, ontology? How are metadata implemented? As xml, plain text, RDF, etc.
MATDB Noting that the primary data object is a combination of the test result + material (i.e. production, composition, heat treatment, microstructure, etc.) + specimen + test condition + documents entities, the source (i.e. bibliographic and provenance) entity corresponds to the metadata. Approximately twenty metadata fields exist, noting that not all are mandatory. The metadata for approximately 25% of the content of the database is broadly compliant with the FAIR data principles (details at WS1UC3 - Notes from afternoon session). This is because approximately 25% of the content is enabled for citation and hence complies with the DataCite metadata schema, which itself is aligned with the Dublin Core. Thesaurus and procedural standards XML, PDF, MS Excel
NOMAD Metadata includes: administrative (location, access privileges, who, when, where) and provenance of data (how). Each metadata item, besides its name, can have up to six additional attributes: Type, Description, Data Type, Shape, Units, Derived (optional). Relations between metadata are visualized by graph diagrams. More info at NOMAD Meta Info page. Extensive metadata F1;F2;F3;F4 100%. A1;A1.1 100%, A1.2 0%, A2 100%. I1;I2 100%, I3 N/A. R1;R1.1;R1.2 100%, R1.3 50%. Taxonomy, Ontology JSON file
Urban Mine Platform Metadata includes: administrative (e.g. contact info), descriptive (e.g. use limitation, data quality statement), provenance (e.g. resource), structural and others like file identifier, descriptive keywords etc. Extensive metadata is provided in form of basic metadata and full metadata Findable: F.1.; F.2.; F.3. – YES. F.4. – N/A. Accessible: A.1.; A.1.1.; A.1.2. – YES; A.2. – N/A. Interoperable: I.1.; I.2.; I.3. – YES; Reusable: R.1.; R.1.1.; R.1.2.; R.1.3 – YES Taxonomy XML file
PVGIS
Crystallography Open DB Single 7-digit identifier assigned to the data. Full history of data additions and changes is provided (with dates, committers, modified files, etc.). Metadata include: data upload date, revision number, URL of the data file in the repository, source publication author names and contacts (address, tel, email, etc.), source publication title, journal, issue, DOI, etc., method of structure determination, performed automatic conversions. Extensive metadata. Findable: F1-YES; F2-YES; F3-YES; F4-YES. Accessible: A1-YES; A1.1-YES; A1.2-N/A; A2-YES. Interoperable: I1-N/A; I2-N/A; I3-N/A; I4-N/A. Reusable: R1-YES; R1.1-N/A; R1.2-YES; R1.3-YES Taxonomy RDF file
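
To make the NOMAD row more concrete, a metadata definition with a name and the six attributes listed there (Type, Description, Data Type, Shape, Units, Derived) could be represented and serialized to JSON roughly as follows. This is a hedged Python sketch, not the actual NOMAD Meta Info format: the class `MetaInfoEntry`, its field names, and the example values are assumptions made for illustration only.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class MetaInfoEntry:
    name: str
    kind: str                       # corresponds to "Type" in the table
    description: str
    data_type: str
    shape: list                     # empty list for a scalar quantity
    units: Optional[str] = None
    derived: Optional[bool] = None  # optional attribute, may be left unset


entry = MetaInfoEntry(
    name="example_total_energy",    # hypothetical metadata name
    kind="quantity",                # hypothetical type label
    description="Total energy of the simulated system (illustrative).",
    data_type="float",
    shape=[],
    units="J",
)

# Serialize the definition so that it can be stored or exchanged as a JSON file.
print(json.dumps(asdict(entry), indent=2))
```

Keeping such definitions in a typed, serializable form is one way to make them machine-actionable; the real NOMAD schema should be consulted for the authoritative attribute names.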

GOFAIR implementation projects

Conferences

Virtual Conference on A FAIR Data Infrastructure For Materials Genomics 3 - 5 June, 2020

  • The focus of the conference is to describe the new horizons that can be reached by a FAIR data infrastructure for materials genomics. This conference is organized by the association FAIR-DI e.V.
  • Notable participating groups we should link with: NOMAD Center of Excellence, BIG-DATA ANALYTICS FOR MATERIALS SCIENCE
  • Existing metadata or other standards we should have in mind and link with: FAIR-DI e.V.; OPTIMADE Consortium.
  • Videos of the plenary and invited talks are available at the Conference Program page
  • Relevant Talks:
    • New Horizons for Materials Research - Role of FAIR Data, by Claudia Draxl.
    • The US Material Genome Initiative and the Materials Data Infrastructure, by James Warren.
    • Community-driven Metadata and Ontologies for Materials Science and their Key Role in Artificial-intelligence Tools, by Luca Ghiringhelli.
    • Reproducibility of Materials Simulations and Accessibility to Data, by Giulia Galli.
    • Others will follow
  • Relevant Posters at the Poster Session page
    • N. 18 Ontologies in Computational Materials Science
    • N. 25 NOMAD Repository and Archive
    • N. 62 Web-based artificial intelligence tools for materials science: the NOMAD analytics toolkit
    • N. 38 Materials Data Platform System toward implementation of FAIR data principle
    • Others will follow

Literature

  • Data-Driven Materials Science: Status, Challenges, and Perspectives. Lauri Himanen, Amber Geurts, Adam Stuart Foster, and Patrick Rinke. Adv Sci 2019, 6, 1900808.
  • Towards efficient data exchange and sharing for big-data driven materials science: metadata and data formats. Luca M. Ghiringhelli, Christian Carbogno, Sergey Levchenko, Fawzi Mohamed, Georg Huhs, Martin Lüders, Micael Oliveira & Matthias Scheffler. npj Computational Materials volume 3, Article number: 46 (2017).
  • FAIR from GOFAIR website: GOFAIR
  • FAIRification process from GOFAIR website: GOFAIR
  • Wang, S. (2016) Green practices are gendered: Exploring gender inequality caused by sustainable consumption policies in Taiwan; Energy Research & Social Science, 18, 88-95. [[1]]. Abstract: In the context of climate change, governments and international organizations often promote a “sustainable lifestyle.” However, this approach has been criticized for underestimating the complexity of everyday life and therefore being inapplicable to households and consumers. In addition, procedures for promoting sustainable consumption seldom incorporate domestic workers’ opinions and often increase women’s housework loads. This article employs a practice-based approach to examine the “Energy-Saving, Carbon Reduction” movement, a series of sustainable consumption policies that have been advocated by the Taiwanese government since 2008. The goal of the movement is to encourage an eco-friendly lifestyle. On the basis of empirical data collected through ethnographic interviews, this article argues that existing policies unexpectedly increase women’s burdens and exacerbate gender inequality.
  • Ly, L. T. et al. (2015) Compliance monitoring in business processes: Functionalities, application, and tool-support, Information Systems 54, 209-234, [2]. Abstract: In recent years, monitoring the compliance of business processes with relevant regulations, constraints, and rules during runtime has evolved as major concern in literature and practice. Monitoring not only refers to continuously observing possible compliance violations, but also includes the ability to provide fine-grained feedback and to predict possible compliance violations in the future. The body of literature on business process compliance is large and approaches specifically addressing process monitoring are hard to identify. Moreover, proper means for the systematic comparison of these approaches are missing. Hence, it is unclear which approaches are suitable for particular scenarios. The goal of this paper is to define a framework for Compliance Monitoring Functionalities (CMF) that enables the systematic comparison of existing and new approaches for monitoring compliance rules over business processes during runtime. To define the scope of the framework, at first, related areas are identified and discussed. The CMFs are harvested based on a systematic literature review and five selected case studies. The appropriateness of the selection of CMFs is demonstrated in two ways: (a) a systematic comparison with pattern-based compliance approaches and (b) a classification of existing compliance monitoring approaches using the CMFs. Moreover, the application of the CMFs is showcased using three existing tools that are applied to two realistic data sets. Overall, the CMF framework provides powerful means to position existing and future compliance monitoring approaches.