WS1UC3

From EERAdata Wiki

This page provides space for notes from workshop discussions of use case 3 during Day 2 of WS1.

Interactive elements

The discussions can also benefit from considering the EERAdata storyboard and/or slido comments. See WS1#Interactive_elements

Lessons learned during WS1 Day1

  1. What is the take-home message for you?
    1. There is no unique method for defining metadata.
  2. What do you suggest doing tomorrow?
    1. Review different types of DBs to find a common thread among them.
    2. Facilitate an open discussion about our own experience in the field.
  3. Have you seen any best practices today from other communities outside of your use case?

Notes from the morning session

The morning session is dedicated to database discussions, with the aim of selecting 3-5 databases from the list below for further examination. The first table of the WIKI page is to be filled out by noon.

Participants

  • Tim Austin (JRC)
  • Ihssen Holger (EERA)
  • Theodoros Dimepoulos (AIT)
  • Karl Berger (AIT)
  • Manfred Paier (AIT)
  • Constantin V. Beck (HVL)
  • Mehran Ziaabadi (HVL)
  • Karolina Jąderko-Skubis (GIG)
  • Francesco Buonocore (ENEA)
  • Massimo Celino (ENEA)

Participants' expertise

  • The AIT group works mainly in the field of PV, both on materials (low toxicity, usability, ...) and on systems and modules (engineering, efficiency, quality, ...). AIT is leading the development of the EERAdata platform and is therefore joining the session to discuss the needs of the use case.
  • JRC manages a large DB on structural materials. They pay a lot of attention to the FAIR principles:
    • Access: not all the information in the DB is open. This has helped private companies to provide and reuse data.
    • Interoperability: standardization procedures in collaboration with CEN.
    • Reusability: this is related to quality issues. Reviewers or committees are often involved in defining the quality level.
  • Holger: he is working on a new project in which data plays a fundamental role. It concerns Materials Acceleration Platforms (MAPs). A MAP has four components: modeling, robotics, experimental characterization and artificial intelligence. There are already some working platforms in Europe and Canada, for example in the field of CO2 reduction. Other MAPs address thermoelectricity, concrete and batteries (BIGMAP project in Germany).
  • GIG has expertise in deep water protection and, more generally, in environmental engineering.
  • HVL contributes expertise from the COMET project (Constantin) and in materials solutions for wind (Mehran).
  • ENEA has expertise in Computational Materials Science

Draft list of databases

| Name of database | Description | Reason for choice |
|---|---|---|
| NOMAD | Data coming from molecular modeling. | It belongs to an HPC European Centre of Excellence. |
| MATWEB | MatWeb is a searchable database of engineering materials. The database includes thermoplastic and thermoset polymers, metals and alloys, ceramics, wood, fibers, stone, lubricants, inorganic salts, and other engineering materials. It contains more than 100,000 material data sheets. | Highly recognizable and popular database directly related to materials. Most material data sheets are provided by manufacturers / industry suppliers. |
| Urban Mine Platform | The ProSUM project has developed the open-access Urban Mine Platform (UMP). This dedicated web portal is populated by a centralised database containing all readily available data on market inputs, stocks in use and hibernated, compositions and waste flows of electrical and electronic equipment (EEE), vehicles and batteries (BATT) for all EU 28 Member States plus Switzerland and Norway. The knowledge base is complemented with an extensive library of more than 800 source documents and databases. The platform provides the ability to view the metadata, methodologies, calculation steps and data constraints and limitations. | Full access to the required data and information. Easy search. A rich set of metadata. |
| MatDB | The European Commission JRC ODIN Portal hosts a number of scientific databases, one of which is MatDB, a database application designed to store mechanical test data coming from tests performed in accordance with mechanical testing standards. The metadata are organised into categories relevant to materials testing, so that there are main entities for source (i.e. provenance), material (i.e. production, heat treatment, microstructure, etc.), specimen, test condition, documents and test result. The access management model supports both open access and restricted access. With a view to FAIR compliance, the database supports data citation (using the DataCite framework) and interoperability standards for test data (hosted at CEN). | MatDB supports various features aligned to the FAIR data principles, including data citation (for findability and access), e.g. Ruiz, A (2018): Nanoindentation (single cycle) test data for Gr. 91 material at 23 °C and maximum indenter force of 1.00332 mN, version 1.0, European Commission JRC; interoperability standards for mechanical test data; and quality assurance workflows. |
| FactSage Browser | Thermochemical data. | |
| NIST X-ray Photoelectron Spectroscopy (XPS) Database | | |
| AFLOWLIB | AflowLIB (Automatic Flow LIB) is a software framework for high-throughput calculation of crystal structure properties of alloys, intermetallics and inorganic compounds. | It is a rich database. |
| OQMD | The Open Quantum Materials Database (OQMD) is a high-throughput database currently consisting of nearly 300,000 density functional theory (DFT) total energy calculations of compounds from the Inorganic Crystal Structure Database (ICSD) and decorations of commonly occurring crystal structures. | It is a rich database. |
| Database of refractive indices | Complex refractive index of inorganic and organic materials. | |
| Materials Project | The Materials Project provides open web-based access to computed information on known and predicted materials as well as powerful analysis tools to inspire and design novel materials. | |
| Organic Materials Database | The organic materials database is an open access electronic structure database for 3-dimensional organic crystals, developed and hosted at the Nordic Institute for Theoretical Physics – Nordita. It provides tools for search queries based on data-mining and machine learning techniques. | |
| Crystallography Open Database | Open-access database of crystal structures of organic, inorganic and metal–organic compounds and minerals. | It is a widely used and rich database for structural information on materials. |
| Photovoltaic Geographical Information System (PVGIS) | JRC Ispra: PVGIS provides free and open access to: PV potential for different technologies and configurations of grid-connected and stand-alone systems; solar radiation and temperature, as monthly averages or daily profiles; full time series of hourly values of both solar radiation and PV performance; Typical Meteorological Year data for nine climatic variables; maps, by country or region, of solar resource and PV potential, ready to print; PVMAPS software, which includes all the estimation models used in PVGIS. | |
| EU COST Action Pearl PV (CA16235) | Photovoltaic Geographical Information System (PVGIS) Established to publish and upload data of monitored installed PV systems and to quantitatively evaluate the long-term performance and reliability of these PV systems in Europe and elsewhere. | |
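
The MatDB entry above names six main metadata entities (source, material, specimen, test condition, documents, test result). As a rough illustration only, the sketch below groups a record by those six categories and reports which ones are still empty. The class name, the function name and the field keys are hypothetical and do not reflect the actual MatDB schema; the sample values are taken from the citation example in the table.

```python
from dataclasses import dataclass, field, fields


@dataclass
class MechanicalTestRecord:
    """Hypothetical container mirroring the six entity categories named for MatDB."""
    source: dict = field(default_factory=dict)          # provenance
    material: dict = field(default_factory=dict)        # production, heat treatment, ...
    specimen: dict = field(default_factory=dict)
    test_condition: dict = field(default_factory=dict)
    documents: dict = field(default_factory=dict)
    test_result: dict = field(default_factory=dict)


def empty_categories(record):
    """Return the names of the categories that carry no metadata yet."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Keys below are illustrative assumptions, not MatDB field names.
record = MechanicalTestRecord(
    source={"organisation": "European Commission JRC"},
    material={"name": "Gr. 91"},
    test_condition={"temperature_C": 23},
)

print(empty_categories(record))
```

A check of this kind could help when comparing how completely different databases populate comparable metadata categories.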

Discussion on the choice of databases:

  • We should select DBs covering different length scales (from microscopic to devices) and different applications (PV, WIND, others...).
  • DBs should be selected so as to allow interoperability among them; for example, a DB on crystallography could be used together with another that provides band structures and related properties.
  • NOMAD is a very interesting DB, but it covers only Computational Materials Science at the atomic scale and is mainly focussed on ab-initio calculations. However, there are interesting ongoing FAIRification activities, and some very popular computational DBs are already included in NOMAD.
  • MATWEB requires registration and has a premium service. It is made by engineers for engineers. Its metadata are not rich.
  • URBAN MINE PLATFORM has rich metadata; however, download is not that easy. No registration is needed. Metadata information can be found here
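
The interoperability point above (combining a crystallography DB with one that provides band structures) can be sketched as a join on a shared key. This is a minimal sketch under stated assumptions: the two record lists, their field names and the choice of the chemical formula as the linking key are all hypothetical and are not taken from any of the listed databases. In practice a formula alone is usually not a unique key, which is part of why interoperability is hard.

```python
# Hypothetical records: neither list reflects the schema of a real database.
structures = [
    {"formula": "Si", "space_group": "Fd-3m"},
    {"formula": "GaAs", "space_group": "F-43m"},
    {"formula": "NaCl", "space_group": "Fm-3m"},
]
band_gaps = [
    {"formula": "Si", "band_gap_eV": 1.12},
    {"formula": "GaAs", "band_gap_eV": 1.42},
]


def link_records(left, right, key):
    """Merge records from two lists that share the same value for `key`.

    Records of `left` without a counterpart in `right` are dropped.
    """
    index = {rec[key]: rec for rec in right}
    return [{**rec, **index[rec[key]]} for rec in left if rec[key] in index]


linked = link_records(structures, band_gaps, "formula")
for rec in linked:
    print(rec)
```

Records without a match in the second list (NaCl here) are silently dropped, which shows how a missing shared identifier limits what can be combined across DBs.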

Notes from the afternoon session

The afternoon session is dedicated to discussing metadata for the selected databases. The aim of the afternoon session is to fill out table 2 of the wiki page for the use case and to decide what to report back from the use case to the plenary session the next day (again, see the WIKI template for a suggested structure). Thus, at the end of the day, the WIKI page for the use case is complete.

Details of FAIR criteria assessments

  • MatDB (thanks Tim!!)
    • F1. (meta)data are assigned a globally unique and eternally persistent identifier - yes, either a DataCite DOI or a (resolvable) database technical key;
    • F2. data are described with rich metadata - yes because the MatDB source entity extends to fields about the organisation, projects, data creators, etc.;
    • F3. (meta)data are registered or indexed in a searchable resource - yes, initially at DataCite, with records propagating to other metadata repositories such as the EU Open Data Portal, OpenAIRE and the JRC Data Catalogue;
    • F4. metadata specify the data identifier - yes in the circumstance that data are enabled for citation;
    • A1 (meta)data are retrievable by their identifier using a standardized communications protocol - ;
    • A1.1 the protocol is open, free, and universally implementable - ;
    • A1.2 the protocol allows for an authentication and authorization procedure, where necessary - ;
    • A2 metadata are accessible, even when the data are no longer available - ;
    • I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation - ;
    • I2. (meta)data use vocabularies that follow FAIR principles - ;
    • I3. (meta)data include qualified references to other (meta)data - ;
    • R1. meta(data) have a plurality of accurate and relevant attributes - yes in the circumstance that data are enabled for citation because the metadata are compliant with the DataCite v.4 metadata schema;
    • R1.1. (meta)data are released with a clear and accessible data usage license - yes in the circumstance that data are enabled for citation because the DataCite v.4 metadata schema includes a 'rights' field;
    • R1.2. (meta)data are associated with their provenance - yes because the MatDB source entity includes fields about the organisation, projects, data creators, etc.;
    • R1.3. (meta)data meet domain-relevant community standards - yes in the circumstance that data are enabled for citation because the DataCite v.4 metadata schema is aligned to Dublin Core.
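
To keep track of which criteria are still open, the MatDB assessment above can be written down as a small data structure. The sketch below is only one possible encoding: the labels "yes", "conditional" (met only when data are enabled for citation) and `None` (no assessment recorded yet) are assumptions introduced here, as are the names `matdb_assessment` and `summarize`.

```python
# Outcomes transcribed from the notes above.
# "yes"         = criterion met
# "conditional" = met only when data are enabled for citation
# None          = no assessment recorded yet
matdb_assessment = {
    "F1": "yes", "F2": "yes", "F3": "yes", "F4": "conditional",
    "A1": None, "A1.1": None, "A1.2": None, "A2": None,
    "I1": None, "I2": None, "I3": None,
    "R1": "conditional", "R1.1": "conditional", "R1.2": "yes", "R1.3": "conditional",
}


def summarize(assessment):
    """Group criteria by FAIR principle (first letter) and list the open ones."""
    summary = {}
    for criterion, outcome in assessment.items():
        bucket = summary.setdefault(criterion[0], {"assessed": 0, "open": []})
        if outcome is None:
            bucket["open"].append(criterion)
        else:
            bucket["assessed"] += 1
    return summary


for principle, info in summarize(matdb_assessment).items():
    print(principle, info)
```

The same structure could be filled in for each selected database, so that open criteria are easy to compare across them.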

What to report back to the plenary on Day 3?

  • Databases selected (names and short reasoning)
  • Main insights from discussions:
    • Metadata are tightly linked with the expertise in the field.
    • There is no unique definition of metadata: is it only bibliographic metadata (Tim), or is everything metadata (NOMAD)?
    • There seems to be a lack of standards.
    • Tim pointed out that interoperability is a real issue.
    • Holger: The amount of available data is increasing every day. This implies costs both for managing the data and for analyzing them. What can be done about this? Quality? Political guidance? Standards?
  • Suggested next steps
    • To have a deeper look at the metadata of the chosen DBs
    • To figure out if there are links with other use cases
    • To learn about the activities of the EERA JPs, in order to understand whether materials are considered and which metadata are of interest to them
    • To analyze WIND metadata with HVL and discuss it with Annamaria Sempreviva (EERA JP Wind)
  • Other issues