WS1UC3

From EERAdata Wiki
Revision as of 15:38, 3 June 2020 by Massimoc (talk | contribs) (Discussion on the choice of databases:)

This page provides space for notes from workshop discussions of use case 3 during Day 2 of WS1.

Interactive elements

The discussions can also benefit from considering the EERAdata storyboard and/or slido comments. See WS1#Interactive_elements

Lessons learned during WS1 Day1

  1. What is the take-home message for you?
  2. What do you suggest doing tomorrow?
  3. Did you see any best practices today from communities outside your use case?

Notes from the morning session

The morning session is dedicated to database discussions, with the aim of selecting 3-5 databases from the list below for further examination. The first table of the WIKI page is to be filled out by noon.

Participants

  • Tim Austin (JRC), Ihssen Holger (EERA), Theodoros Dimepoulos (AIT), Karl Berger (AIT), Manfred Paier (AIT), Constantin V. Beck (HVL), Mehran Ziaabadi (HVL), Karolina Jadeiko-Skubes (GIG), Francesco Buonocore (ENEA), Massimo Celino (ENEA)
  • The AIT group works mainly in the field of PV, both from the point of view of materials (low toxicity, usability, ...) and from the point of view of systems and modules (engineering, efficiency, quality, ...).
  • JRC manages a large DB on structural materials. They pay a lot of attention to the FAIR principles:
    • access: not all the information in the DB is open. This helped private companies to provide and reuse data.
    • interoperability: standardization procedures in collaboration with CEN.
    • reusability: this is related to quality issues. Reviewers or committees are often involved in defining the quality level.
  • Holger is working on a new project in which data plays a fundamental role. It concerns a Materials Acceleration Platform (MAP). A MAP has four components: modeling, robotics, experimental characterization, and artificial intelligence. Some platforms are already working in Europe and Canada, for example in the field of CO2 reduction. Other MAPs address thermoelectricity, concrete, and batteries (BIGMAP project in Germany).


Draft list of databases

Name of database | Description | Reasoning of choice
NOMAD | Data coming from molecular modeling. | It belongs to an HPC European Centre of Excellence.
MATWEB | MatWeb is a searchable database of engineering materials. The database includes thermoplastic and thermoset polymers, metals and alloys, ceramics, wood, fibers, stone, lubricants, inorganic salts, and other engineering materials. The database includes more than 100,000 material data sheets. | Highly recognizable and popular database directly related to materials. Most data sheets of materials are provided by manufacturers / industry suppliers.
Urban Mine Platform | The ProSUM project has developed the open-access Urban Mine Platform (UMP). This dedicated web portal is populated by a centralised database containing all readily available data on market inputs, stocks in use and hibernated, compositions and waste flows of electrical and electronic equipment (EEE), vehicles and batteries (BATT) for all EU 28 Member States plus Switzerland and Norway. The knowledge base is complemented with an extensive library of more than 800 source documents and databases. The platform provides the ability to view the metadata, methodologies, calculation steps, and data constraints and limitations. | Full access to the required data and information. Easy search. A rich set of metadata.
MatDB | The European Commission JRC ODIN Portal hosts a number of scientific databases, one of which is MatDB, a database application designed to store mechanical test data coming from tests performed in accordance with mechanical testing standards. The metadata are organised into categories relevant to materials testing, so that there are main entities for source (i.e. provenance), material (i.e. production, heat treatment, microstructure, etc.), specimen, test condition, documents and test result. The access management model supports both open access and restricted access. With a view to FAIR compliance, the database supports data citation (using the DataCite framework) and interoperability standards for test data (hosted at CEN). | MatDB supports various features aligned to the FAIR data principles, including data citation, e.g. Ruiz, A (2018): Nanoindentation (single cycle) test data for Gr. 91 material at 23 °C and maximum indenter force of 1.00332 mN, version 1.0, European Commission JRC (for findability and access), interoperability standards for mechanical test data, and quality assurance workflows.
FactSage Browser | Thermochemical data. |
NIST X-ray Photoelectron Spectroscopy (XPS) Database | |
AFLOWLIB | AflowLIB (Automatic Flow LIB) is a software framework for high-throughput calculation of crystal structure properties of alloys, intermetallics and inorganic compounds. | It is a rich database.
OQMD | The Open Quantum Materials Database (OQMD) is a high-throughput database currently consisting of nearly 300,000 density functional theory (DFT) total energy calculations of compounds from the Inorganic Crystal Structure Database (ICSD) and decorations of commonly occurring crystal structures. | It is a rich database.
Database of refractive indices | Complex refractive index of inorganic and organic materials. |
Materials Project | The Materials Project provides open web-based access to computed information on known and predicted materials as well as powerful analysis tools to inspire and design novel materials. |
Organic Materials Database | The organic materials database is an open access electronic structure database for 3-dimensional organic crystals, developed and hosted at the Nordic Institute for Theoretical Physics – Nordita. It provides tools for search queries based on data-mining and machine learning techniques. |
Crystallography Open Database | Open-access database of crystal structures of organic, inorganic and metal–organic compounds and minerals. | It is a widely used and rich database for structural information on materials.
Photovoltaic Geographical Information System (PVGIS) | JRC Ispra: PVGIS provides free and open access to: PV potential for different technologies and configurations of grid-connected and stand-alone systems; solar radiation and temperature, as monthly averages or daily profiles; full time series of hourly values of both solar radiation and PV performance; Typical Meteorological Year data for nine climatic variables; maps, by country or region, of solar resource and PV potential, ready to print. The PVMAPS software includes all the estimation models used in PVGIS. |
EU COST Action Pearl PV (CA16235) | Photovoltaic Geographical Information System (PVGIS) | Established to publish and upload data of monitored installed PV systems and to quantitatively evaluate the long-term performance and reliability of these PV systems in Europe and elsewhere.

Discussion on the choice of databases:

  • Valeria suggests having a look at Critical Raw Materials (CRM).
  • NOMAD is a very interesting DB, but it covers only computational materials science at the atomic scale and is mainly focused on ab initio calculations. However, it has interesting ongoing FAIRification activities, and some very popular computational DBs are already included in NOMAD.
  • We should select DBs covering different length scales (from microscopic to devices) and different applications (PV, wind, others).
  • DBs should be selected so that they are interoperable with one another; for example, a DB on crystallography could be used together with another that provides band structures and related properties.
  • MATWEB requires registration and offers a premium service. It is a database by engineers for engineers. Its metadata are not rich.
  • URBAN MINE PLATFORM has rich metadata; however, download is not that easy. No registration is needed. Metadata information can be found here
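
To illustrate the interoperability point above, the following is a minimal sketch of how records from a crystallography DB could be linked to records from a band-structure DB. Everything in it is hypothetical: the field names, the shared key `structure_id`, the record contents and the helper `link_records` are assumptions made for illustration and do not reflect the schema or API of any database in the list.

```python
# Hypothetical records. Field names, identifiers and values are placeholders,
# not data taken from any real database.
crystal_structures = [
    {"structure_id": "S-001", "formula": "compound-A", "space_group": "sg-1"},
    {"structure_id": "S-002", "formula": "compound-B", "space_group": "sg-2"},
    {"structure_id": "S-003", "formula": "compound-C", "space_group": "sg-3"},
]

band_structures = [
    {"structure_id": "S-001", "band_gap_eV": 1.0},
    {"structure_id": "S-002", "band_gap_eV": 2.0},
]


def link_records(structures, properties, key="structure_id"):
    """Attach property records to structure records via a shared identifier.

    Every structure is kept; the "linked" flag tells whether a matching
    property record was found in the second database.
    """
    by_key = {rec[key]: rec for rec in properties}
    linked = []
    for struct in structures:
        merged = dict(struct)
        match = by_key.get(struct[key])
        merged["linked"] = match is not None
        if match is not None:
            # Copy all property fields except the key itself.
            merged.update({k: v for k, v in match.items() if k != key})
        linked.append(merged)
    return linked


linked = link_records(crystal_structures, band_structures)
unlinked_ids = [rec["structure_id"] for rec in linked if not rec["linked"]]
```

The sketch only works because both record sets carry the same identifier. In practice, agreeing on such a shared, persistent identifier is the hard part of making two DBs interoperable.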

Notes from the afternoon session

The afternoon session is dedicated to discussing metadata for the selected databases. The aim of the afternoon session is to fill out table 2 of the wiki page for the use case and to decide what to report back from the use case to the plenary session the next day (again, see the WIKI template for a suggested structure). Thus, at the end of the day, the WIKI page for the use case is complete.


Details of FAIR criteria assessments

  • MatDB
    • F1. (meta)data are assigned a globally unique and eternally persistent identifier - yes, either a DataCite DOI or a (resolvable) database technical key;
    • F2. data are described with rich metadata - yes because the MatDB source entity extends to fields about the organisation, projects, data creators, etc.;
    • F3. (meta)data are registered or indexed in a searchable resource - yes, initially at DataCite, with records propagating to other metadata repositories such as the EU Open Data Portal, OpenAIRE and the JRC Data Catalogue;
    • F4. metadata specify the data identifier - yes in the circumstance that data are enabled for citation;
    • A1 (meta)data are retrievable by their identifier using a standardized communications protocol - ;
    • A1.1 the protocol is open, free, and universally implementable - ;
    • A1.2 the protocol allows for an authentication and authorization procedure, where necessary - ;
    • A2 metadata are accessible, even when the data are no longer available - ;
    • I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation - ;
    • I2. (meta)data use vocabularies that follow FAIR principles - ;
    • I3. (meta)data include qualified references to other (meta)data - ;
    • R1. (meta)data have a plurality of accurate and relevant attributes - yes in the circumstance that data are enabled for citation because the metadata are compliant with the DataCite v.4 metadata schema;
    • R1.1. (meta)data are released with a clear and accessible data usage license - yes in the circumstance that data are enabled for citation because the DataCite v.4 metadata schema includes a 'rights' field;
    • R1.2. (meta)data are associated with their provenance - yes because the MatDB source entity includes fields about the organisation, projects, data creators, etc.;
    • R1.3. (meta)data meet domain-relevant community standards - yes in the circumstance that data are enabled for citation because the DataCite v.4 metadata schema is aligned to Dublin Core.
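
The MatDB assessment above can also be kept in machine-readable form, which makes it easy to see which criteria are still open. The following is a minimal sketch, not a prescribed format: the status labels ("yes", "conditional") and the helper names are assumptions, with "conditional" standing for "yes in the circumstance that data are enabled for citation" and `None` for criteria left blank in the notes.

```python
# MatDB assessment transcribed from the list above.
# "conditional" = yes only when the data are enabled for citation.
# None = not yet assessed in the notes.
matdb_assessment = {
    "F1": "yes",
    "F2": "yes",
    "F3": "yes",
    "F4": "conditional",
    "A1": None,
    "A1.1": None,
    "A1.2": None,
    "A2": None,
    "I1": None,
    "I2": None,
    "I3": None,
    "R1": "conditional",
    "R1.1": "conditional",
    "R1.2": "yes",
    "R1.3": "conditional",
}


def summarise(assessment):
    """Return {principle: (assessed, total)} for the principles F, A, I and R."""
    summary = {}
    for criterion, status in assessment.items():
        principle = criterion[0]  # first letter identifies the principle
        assessed, total = summary.get(principle, (0, 0))
        summary[principle] = (assessed + int(status is not None), total + 1)
    return summary


def open_criteria(assessment):
    """List the criteria that still lack an assessment."""
    return [criterion for criterion, status in assessment.items() if status is None]


summary = summarise(matdb_assessment)
still_open = open_criteria(matdb_assessment)
```

With the entries as recorded above, the findability and reusability criteria are all assessed, while the accessibility and interoperability criteria remain open.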

What to report back to the plenary on Day 3?

  • Databases selected (names and short reasoning)
  • Main insights from discussions
  • Suggested next steps
  • Other issues