WS1

From EERAdata Wiki
Revision as of 08:25, 4 June 2020 by August (talk | contribs) (Including new links)

The first workshop of EERAdata is organized as an online webinar and hackathon from June 2-4, 2020.

Objective

  • Learn about FAIR and open data practices in the energy and other research communities.
  • Choose databases to be investigated in use cases. Discuss and summarize the reasons for selection.
  • Analyze the state of FAIR/O principles for the selected databases (with the help of the questionnaire and FAIR/O assessment tools).
  • Evaluate the state of FAIR/O principles for the metadata of selected databases.
  • Suggest general principles for the design of metadata (use-case- and domain-specific).
  • Discuss top-level metadata adhering to FAIR/O principles (use-case- and domain-specific).
  • Learn about the FAIRification of Data Management Plans (DMP).

Workshop concept of EERAdata

Read aheads

Obligatory:

Suggested read on the history of metadata: Metadata - Shaping Knowledge from Antiquity to the Semantic Web by Richard Gartner, https://www.springer.com/gp/book/9783319408910

Join and collaborate interactively!

The workshop is a hackathon! EERAdata can only evolve as a joint effort! Join the discussions and work together online:

EERAdata story board

Objective: collect views from participants regarding energy data, energy metadata, FAIR/O problems etc. to support discussions, identification of gaps & needs (WP2) and elements to design the platform (WP3).

Questions:

  • Q#1: What is your user story regarding FAIR/O energy data?
  • Q#2: What is your user story regarding metadata?

Template to answer: As a <stakeholder>, I want <goal> so that <reason>.

How to post a story?

  1. Click on the link: https://padlet.com/janaschwanitz/aciywzt9fs1h70m2
  2. Click on plus at the right bottom to add a story.
  3. Choose a title depending on the question you want to answer: “FAIR/O data” for Q#1 or “metadata” for Q#2. In the example above, the title “metadata” was chosen.
  4. Type your sentence to let us know where the gaps and hopes are, following the template: As a <stakeholder>, I want <goal> so that <reason>.

Example: As a data-driven energy researcher, I want utopia - i.e. one-stop access to all relevant metadata -, so that I can browse available data, check where they are from and whether I can trust them when reusing.
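Stories that follow the template can also be handled by a script, for example when collecting them from the story board for later analysis. The following is a minimal sketch, not part of the EERAdata tooling: the names `UserStory` and `parse_story` are hypothetical, and accepting "a"/"an" and an optional comma before "so that" is an assumption made here to tolerate small variations.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Template: "As a <stakeholder>, I want <goal> so that <reason>."
# Assumption: "a"/"an" and an optional comma before "so that" are both accepted.
STORY_PATTERN = re.compile(
    r"^As an? (?P<stakeholder>.+?), I want (?P<goal>.+?),? so that (?P<reason>.+?)\.?$",
    re.IGNORECASE | re.DOTALL,
)


@dataclass
class UserStory:
    stakeholder: str
    goal: str
    reason: str

    def render(self) -> str:
        """Write the story back in the template form."""
        return f"As a {self.stakeholder}, I want {self.goal} so that {self.reason}."


def parse_story(text: str) -> Optional[UserStory]:
    """Split a story into its three parts; return None if it does not fit the template."""
    match = STORY_PATTERN.match(text.strip())
    if match is None:
        return None
    return UserStory(**{key: value.strip() for key, value in match.groupdict().items()})


example = (
    "As a data-driven energy researcher, I want utopia - i.e. one-stop access "
    "to all relevant metadata -, so that I can browse available data, check "
    "where they are from and whether I can trust them when reusing."
)
story = parse_story(example)
print(story.stakeholder)
print(story.goal)
print(story.reason)
```

Text that does not follow the template, such as a free-form comment, makes `parse_story` return `None`, so such posts can be set aside for manual reading.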

How to comment on a story?

  • You can rate stories by clicking on a number of stars.
  • You can comment on stories as well as on other comments.
  • Commenting and adding stories is anonymous as long as you do not log in to your personal profile at padlet.com.

EERAdata wiki - How to?

The easiest way is to simply start writing and editing. You can install an easy editor by following the steps described in the picture to the right.

How to install an easy editor

Consult the User's Guide for information on using the wiki software.

EERAdata on github

Link here and share your thoughts and issues! Note, work in progress.

EERAdata @ Research Gate

Link here and share your thoughts and issues! Note, work in progress.

Just have fun!

Metadata memory game

Read before playing: This is a little memory game in which players need to identify triples, that is, three tiles that belong to one data set. All of them relate to low carbon energy. Some tiles are pictures, others show metadata descriptions (in various formats), and some are even sound files. The point of the game is to have fun while learning about metadata. See if you can find all triples. Note that if you have opened three tiles and they do not match, all of them close automatically, just as in a real-life memory game. Link
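The matching rule of the game can be sketched in a few lines. This is only an illustration of the rule described above, not the code behind the linked game; the `Tile` class, the function names, and the data set labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    dataset: str  # hypothetical label of the data set the tile belongs to
    kind: str     # "picture", "metadata" or "sound"


def is_triple(opened):
    """True if exactly three tiles are open and all belong to the same data set."""
    return len(opened) == 3 and len({tile.dataset for tile in opened}) == 1


def play_turn(opened, found):
    """A matching triple is recorded as found; otherwise all three tiles close again."""
    if is_triple(opened):
        found.add(opened[0].dataset)
        return True
    return False


found = set()
match = [Tile("wind-farm", "picture"), Tile("wind-farm", "metadata"), Tile("wind-farm", "sound")]
mismatch = [Tile("wind-farm", "picture"), Tile("solar-park", "metadata"), Tile("solar-park", "sound")]
print(play_turn(match, found), play_turn(mismatch, found), found)
```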

Agenda and notes Day 1

Builds on read aheads. Online talks and discussions. Space for interaction with participants after each presentation. Moderated discussion of collected comments.

Time slot Topic
10.00-10.20 Welcome and introduction with workshop goals and procedures: “EERAdata - Towards Utopia for low carbon energy research”, Valeria Jana Schwanitz, HVL & PI EERAdata. Link: [1]. Main points:
  • In the energy system, the data revolution offers prospects, but we are far away from harvesting them. The time researchers spend on data governance (finding, cleaning, revising formats) and bureaucracy by far exceeds the time spent on the exciting stuff (thinking, creating new insights, collaborating, and discussing with others).
  • There is a lack of joint standards and common metadata formats to support researchers in finding and reusing heterogeneous data. Machine-actionability is also an issue.
  • The vision of developing a one-stop entry point for energy research is clear before our eyes: being able to search for data, access rich metadata, choose and select datasets, crunch data online in real time, and link to "My-researcher-space", a platform that offers a personalized workspace for data analysis, paper writing, and research collaboration.
10.20-12.15 Online lectures:

“The EOSC Nordic: machine-actionable FAIR maturity evaluations & the FAIRification of data repositories”, Andreas Jaunsen, Nordforsk & EOSC-Nordic. Link: [2]. Main points:

  • Goal and vision of EOSC: Enable researchers to access data across domains and disciplines as easily as possible and support them in locating the relevant data. All European data should be available to researchers, but this does not mean that all data is stored centrally. Instead, databases should be interconnected.
  • FAIR maturity evaluations: Close to 100 repositories have been tested using an automated tool. Most of them score only 0.10 out of 1.
  • Links from the presentation & chat comments: <to do>

“OpenAIRE: Open Access Infrastructure for Research in Europe”, Ilaria Fava, OpenAIRE. Link: [3]. Main points:


Short break of 15 min.

“Community-driven metadata and ontologies for Materials Science and their key role in artificial-intelligence tools”, Luca Ghiringhelli, FHI Berlin. Link: [4]. Main points:


“Metadata practices from IRP Wind”, Anna Maria Sempreviva, DTU. Link: [5]. Main points:

12.15-13.00 Lunch break. Play the EERAdata game “Utopia and metadata”, now or at any other time.
13.00-14.00 Online lectures:

“Humanities and data: for a community-driven path towards FAIRness”, Elena Giglia, UNITO. Reuses the presentation held at the Open Science Conference 2020 in Berlin; the presentation is stored at Zenodo. Link: [6]. Main points:


“RISIS - An e-Infrastructure for the STI-Policy research community”, Thomas Scherngell, AIT. Link: [7]. Main points:

14.00-14.30 Break. Play the game “Utopia and metadata”, now or at any other time.
14.30-16.00 Discussion to compile a to-do list for work in use cases on the second day. Serves as a guiding and aligning process. Led by WP2, August Wierling/Valeria, HVL. What is the take-to-day-2 message for your use cases?
  • Use case 1: The main issue for us is re-usability. We need to assess the databases that were chosen previously. Learn from best practices.
  • Use case 2: Our main issue is privacy/sensitivity of data. Security should come first. There is a tradeoff between a universal metadata language and a domain-specific language.
  • Use case 3: Linguistics is a problem. As soon as we change the application of the material, we also change related metadata. Find similarities of already existing metadata.
  • Use case 4: We face different languages and terminologies. The range of interpretation of the same terms is broad. We should address low-hanging fruit but also aim at cracking hard nuts to improve FAIR/O principles.
  • General: EERAdata is probably more about asking the right questions to the energy research community than providing the right answers. There are already a lot of answers out there; we need to link them to our data issues. We envision being able to suggest low carbon energy metadata standards for and with the research community. Colleagues working on the EERAdata platform will join in on all use case discussions on day 2.
  • Motto: "Ontology is a formal representation of the matured knowledge of a community on a specific purpose".

Agenda Day 2

News of the day before

Not energy, but an inspiring connection. Perhaps unexpected, or perhaps not: DNA helps to piece together the Qumran scrolls. DNA taken from animal skins that were used to write on ...


FAIR data and FAIR metadata - discussions in use cases, parallel sessions

Work in use cases on databases and metadata, led by use case leaders. Suggested outline:

  • 10-12 Discuss and update the preliminary state of FAIR/O for the use case. Use the prepared draft of databases to check compliance with FAIR principles (tools: WP3 questionnaire and others). Compare the assessment results for each database. Observe and discuss agreements and differences across the evaluation tools. Generate the overall picture of FAIR/O compliance for the use case to pin down the state of the art. Let’s see if we come to the same result as in our initial assessment for the application (traffic lights). Continuously take notes to report on results later. Select a responsible person. Objective: select 3-5 databases per use case. Discuss which databases to select: one to cover use-case-specific challenges, one with cross-use-case relevance, and one low-hanging fruit for which it would be relatively easy to improve the current FAIR/O status.
  • 13-15 Joint brainstorming to discuss the FAIR/O state of the metadata for the selected use cases. Evaluate: What is the current description of metadata? How extensive is it? Is only administrative information provided, or a richer context description? What frameworks for metadata are used: taxonomy? thesaurus? ontology? How is the metadata information technically implemented: plain text file? XML? RDF? ... Identify use-case-specific issues with metadata: What are the gaps? What is perceived as a hard nut to crack? Pay special attention to the metadata of the databases and fill out the table provided by WP2. Continuously take notes to report results the next day! Select a responsible person!
  • from 15 Joint recording of lessons learned. Create and/or update the WIKI for the use case with literature, gaps, best practices, FAIR/O discussion, metadata discussion, suggested next steps, .... Get your head around what to report the next day! Plan for 20 min. See the links to WIKI page templates below (Notes from Day 2).
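To make the question of technical implementation more concrete, the sketch below writes one and the same small metadata record as a plain text file, as XML, and as JSON-LD (an RDF serialization). It is an illustration under assumptions, not a recommended EERAdata format: the record reuses a DOI from the list of databases checked by machine, while the title and keyword values are placeholders chosen here. The property names come from the Dublin Core terms vocabulary.

```python
import json
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core terms vocabulary.
DCTERMS = "http://purl.org/dc/terms/"

record = {
    "identifier": "https://doi.org/10.25832/conventional_power_plants/2018-12-20",
    "title": "Conventional power plants",  # assumed title, for illustration only
    "subject": "low carbon energy",        # hypothetical keyword
}


def as_plain_text(rec):
    """Human-readable 'key: value' lines; no shared vocabulary, hard for machines to interpret."""
    return "\n".join(f"{key}: {value}" for key, value in rec.items())


def as_xml(rec):
    """XML with each field expressed as a namespaced Dublin Core element."""
    ET.register_namespace("dcterms", DCTERMS)
    root = ET.Element("metadata")
    for key, value in rec.items():
        ET.SubElement(root, f"{{{DCTERMS}}}{key}").text = value
    return ET.tostring(root, encoding="unicode")


def as_json_ld(rec):
    """JSON-LD: the @context maps the 'dcterms:' prefix to the vocabulary namespace."""
    doc = {"@context": {"dcterms": DCTERMS}, "@id": rec["identifier"]}
    for key, value in rec.items():
        doc[f"dcterms:{key}"] = value
    return json.dumps(doc, indent=2)


print(as_plain_text(record))
print(as_xml(record))
print(as_json_ld(record))
```

The three outputs carry the same information; what differs is how much a machine can infer from them. Only the XML and JSON-LD variants tie each field to a shared vocabulary.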

Issues identified across use cases

Use case 1: Buildings efficiency
Use case 2: Power transmission & distribution networks
Use case 3: Material solutions for low carbon energy
Use case 4: Low carbon energy and energy efficiency policies
Gaps & challenges per use case in a nutshell
  • qualitative nature of data limiting interoperability
  • data availability issues for time-series
  • multiplicity and scattered nature of data sources (households, industries, utility companies, municipalities)
  • Lack of standardization for metadata taxonomy and common vocabulary
  • Ambiguity on licensing issues for various types of energy data
  • Lack of unique identifier for energy data in most databases
  • microscopic use cases resembling the existing one about PV
  • link between microscopic and macroscopic materials (e.g., turbine blades)
  • metadata for applications of materials
  • linking to other fields
  • heterogeneous data make standardization difficult
  • policies are a topic linked to all use cases in EERAdata
  • metadata for images (e.g., maps) underdeveloped
  • complexity of complete provenance information
  • language and terminology are an issue (e.g., records instead of data)
  • stark discrepancy between FAIR assessment results by humans and machines

Use case 1: Continuously updated summary page UC1.

  • Detailed notes from WS1: WS1UC1
  • Detailed notes from WS2: WS2UC1
  • Detailed notes from WS3: WS3UC1
  • Detailed notes from WS4: WS4UC1
  • Detailed notes from WS5: WS5UC1
  • Detailed notes from WS6: WS6UC1

Use case 2: Continuously updated summary page UC2.

  • Detailed notes from WS1: WS1UC2
  • Detailed notes from WS2: WS2UC2
  • Detailed notes from WS3: WS3UC2
  • Detailed notes from WS4: WS4UC2
  • Detailed notes from WS5: WS5UC2
  • Detailed notes from WS6: WS6UC2

Use case 3: Continuously updated summary page UC3.

  • Detailed notes from WS1: WS1UC3
  • Detailed notes from WS2: WS2UC3
  • Detailed notes from WS3: WS3UC3
  • Detailed notes from WS4: WS4UC3
  • Detailed notes from WS5: WS5UC3
  • Detailed notes from WS6: WS6UC3

Use case 4: Continuously updated summary page UC4.

  • Detailed notes from WS1: WS1UC4
  • Detailed notes from WS2: WS2UC4
  • Detailed notes from WS3: WS3UC4
  • Detailed notes from WS4: WS4UC4
  • Detailed notes from WS5: WS5UC4
  • Detailed notes from WS6: WS6UC4

Note: This schedule is a suggestion. Adjust and organize breaks as needed.

Special session - FAIR data evaluation by machine

15.30 - 16.00 Andreas Jaunsen, Nordforsk. Presentation of the results from an automated FAIR evaluation for selected repositories suggested by the use cases.

The proposed list of databases to check:

Results. Note: The FAIR Maturity Evaluation Service tests 22 indicators. 0 stands for "failing the test", 1 stands for "passing the test".

Most of the proposed entries are a mix of web pages (of repositories) rather than single datasets. Thus, a random dataset was selected from a few of them.

Database link | Result link | Result across 22 indicators | Aggregate result across FAIR categories (F, A, I, R, overall)
https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=OJ:JOC_1999_093_R_0001_01 | https://fair.etais.ee/evaluation/4842 | 1001100001010110000000 | 37.50%, 40.00%, 28.57%, 0.00%, 31.82%
https://data.jrc.ec.europa.eu/dataset/93d07f10-7757-485f-bb8e-3160536b97f8 | https://fair.etais.ee/evaluation/4843 | 1001110011110110011100 | 50.00%, 80.00%, 71.43%, 0.00%, 59.09%
https://onlinelibrary.wiley.com/doi/abs/10.1002/aenm.201902830 | https://fair.etais.ee/evaluation/4844 | 1000000001010000000000 | 12.50%, 40.00%, 0.00%, 0.00%, 13.64%
https://doi.org/10.25832/conventional_power_plants/2018-12-20 | https://fair.etais.ee/evaluation/4845 | 1101110011110110011100 | 62.50%, 80.00%, 71.43%, 0.00%, 63.64%
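The aggregate percentages can be recomputed from the string of 22 pass/fail results. The sketch below assumes that the indicators are ordered as 8 Findable, 5 Accessible, 7 Interoperable, and 2 Reusable tests. This grouping is inferred from the numbers in the table, not taken from the documentation of the evaluation service. It reproduces the percentages reported for the last dataset in the table. The function name `fair_scores` is hypothetical.

```python
# Assumed order and size of the indicator groups (inferred from the table above,
# not confirmed by the evaluation service): 8 F, 5 A, 7 I, 2 R = 22 indicators.
CATEGORY_SIZES = (("F", 8), ("A", 5), ("I", 7), ("R", 2))


def fair_scores(bits):
    """Return the share of passed tests per FAIR category and overall, in percent."""
    if len(bits) != 22 or set(bits) - {"0", "1"}:
        raise ValueError("expected a string of 22 characters, each '0' or '1'")
    scores = {}
    start = 0
    for name, size in CATEGORY_SIZES:
        chunk = bits[start:start + size]
        scores[name] = round(100 * chunk.count("1") / size, 2)
        start += size
    scores["overall"] = round(100 * bits.count("1") / len(bits), 2)
    return scores


# Result string of https://fair.etais.ee/evaluation/4845
print(fair_scores("1101110011110110011100"))
# {'F': 62.5, 'A': 80.0, 'I': 71.43, 'R': 0.0, 'overall': 63.64}
```

Recomputing the scores this way is a quick consistency check when results are copied by hand into a table.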

See also: https://www.rd-alliance.org/groups/fair-data-maturity-model-wg and https://fairsharing.github.io/FAIR-Evaluator-FrontEnd/#!/

Agenda and notes from Day 3

Reporting experience from use case applications. Prepare the next workshop on workflows and metadata. Introduction and discussion of data management plans.

Time slot Topic
10-12.00 Discussion led by WP2. Report from use case experiences (by use case leaders). Wrap up.
12-12.30 Presentation "Some pitfalls in data base licenses", Carsten Hoyer-Klick, DLR - German Aerospace Center. Main points:
  • Databases can be protected by general copyright law, by the database generation right, or by both
  • The database generation right only requires a substantial investment
  • If no license is applied to a database, general copyright and the database generation right apply, which are very restrictive
  • Databases should be published with suitable permissive licenses (e.g. CC-BY)
  • https://open-power-system-data.org/legal
12.30-13.30 Lunch break
13.30-15.00 Data Management Plans

Presentation “Introduction to DMP and best practices” by Trond Kvamme, NSD. Link: [8] Main points:


Presentation on Machine-actionable DMPs by Tomasz Miksa, TU Vienna. Link: [9] Main points:


Discussion of EERAdata DMP draft (August, HVL). Main points:

Short break of 15 min.
15.15-16.00 Wrap up of workshop with feedback from invited experts.

Notes from Day 3


WS1 through the lens of an artist

Day 1 presentations