WS1

From EERAdata Wiki
Revision as of 10:38, 2 June 2020 by August

The first workshop of EERAdata is organized as an online webinar and hackathon on 2-4 June 2020.

Objective

  • Learn about FAIR and open data practices in the energy and other research communities.
  • Select databases to be investigated in use cases. Discuss and summarize the reasons for selection.
  • Analyze the state of FAIR/O principles for the selected databases (with the help of the questionnaire and FAIR/O assessment tools).
  • Evaluate the state of FAIR/O principles for the metadata of the selected databases.
  • Suggest general principles for the design of metadata (use-case- and domain-specific).
  • Discuss top-level metadata adhering to FAIR/O principles (use-case- and domain-specific).
  • Learn about the FAIRification of Data Management Plans (DMP).

Read aheads

Obligatory:

Suggested read on the history of metadata: Metadata - Shaping Knowledge from Antiquity to the Semantic Web by Richard Gartner, https://www.springer.com/gp/book/9783319408910

Join and collaborate interactively!

The workshop is a hackathon! EERAdata can only evolve through joint effort! Join the discussions and work together online:


EERAdata story board

Objective: collect participants' views on energy data, energy metadata, FAIR/O problems, etc., to support discussions, the identification of gaps and needs (WP2), and the design of the platform (WP3).

Questions:

  • Q#1: What is your user story regarding FAIR/O energy data?
  • Q#2: What is your user story regarding metadata?

Template to answer: As a <stakeholder>, I want <goal> so that <reason>.

How to post a story?

  1. Click on the link: https://padlet.com/janaschwanitz/aciywzt9fs1h70m2
  2. Click on the plus sign at the bottom right to add a story.
  3. Choose a title according to the question you want to answer: “FAIR/O data” for Q#1 or “metadata” for Q#2. In the example above, the title “metadata” was chosen.
  4. Type your sentence to let us know where the gaps and hopes are. The template is: As a <stakeholder>, I want <goal> so that <reason>.

Example: As a data-driven energy researcher, I want utopia (i.e. one-stop access to all relevant metadata) so that I can browse available data, check where they come from and whether I can trust them when reusing them.
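For anyone who wants to collect the stories from the storyboard and process them later, the template can be parsed with a regular expression. The following is a minimal sketch in Python; the function name `parse_user_story` and the pattern are illustrative assumptions and are not part of the padlet or of any EERAdata tooling.

```python
import re

# Hypothetical pattern mirroring the workshop template:
# "As a <stakeholder>, I want <goal> so that <reason>."
STORY_PATTERN = re.compile(
    r"^As an? (?P<stakeholder>.+?), I want (?P<goal>.+?),? so that (?P<reason>.+?)\.?$",
    re.IGNORECASE | re.DOTALL,
)


def parse_user_story(text):
    """Return the three template parts as a dict, or None if the text does not fit."""
    match = STORY_PATTERN.match(text.strip())
    if match is None:
        return None
    return {key: value.strip() for key, value in match.groupdict().items()}


story = parse_user_story(
    "As a data-driven energy researcher, I want one-stop access to all "
    "relevant metadata so that I can browse available data."
)
print(story)
```

Stories that do not follow the template return `None`, which makes it easy to spot posts that need a manual look.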

How to comment on a story?

  • One can rate stories by clicking on a number of stars.
  • One can comment in words on stories as well as on other comments.
  • Adding stories and commenting are anonymous, as long as one does not log in to one's personal profile at padlet.com.

EERAdata wiki - How to?

The easiest way is to simply start writing and editing. You can install an easy editor by following the steps described in the picture to the right.

How to install an easy editor

Consult the User's Guide for information on using the wiki software.

EERAdata on github

Follow the link and share your thoughts and issues! Note: work in progress.

EERAdata @ Research Gate

Follow the link and share your thoughts and issues! Note: work in progress.

Just have fun!

Metadata memory game

Read before playing:

This is a little memory game in which players need to identify triples, that is, three tiles that belong to one data set. All data sets relate to low carbon energy. Some tiles are pictures, others show metadata descriptions (in various formats), and some are even sound files. The aim is to have fun while learning about metadata. Try to find all triples. Note that if you have opened three tiles and they do not match, all of them close automatically, just as in a real-life memory game. Link

Agenda Day 1

Builds on read aheads. Online talks and discussions. Space for interaction with participants after each presentation. Moderated discussion of collected comments.

Time slot Topic
10.00-10.20 Welcome and introduction with workshop goals and procedures: “EERAdata - Towards Utopia for low carbon energy research”, Valeria Jana Schwanitz, HVL & PI EERAdata. Link: [1]
10.20-12.15 Online lectures:
  • “The EOSC Nordic: machine-actionable FAIR maturity evaluations & the FAIRification of data repositories” - Andreas Jaunsen, Nordforsk & EOSC-Nordic. Link: [2]
  • “OpenAIRE: Open Access Infrastructure for Research in Europe”, Ilaria Fava, OpenAIRE. Link: [3]

Short break of 15 min

  • “Community-driven metadata and ontologies for Materials Science and their key role in artificial-intelligence tools”, Luca Ghiringhelli, FHI Berlin. Link: [4]
  • “Metadata practices from IRP Wind”, Anna Maria Sempreviva, DTU. Link: [5]
12.15-13.00 Lunch break. Play the EERAdata game “Utopia and metadata” (also available at any other time).
13.00-14.00 Online lectures:
  • “Humanities and data: for a community-driven path towards FAIRness”, Elena Giglia, UNITO.
  • “RISIS - An e-Infrastructure for the STI-Policy research community”, Thomas Scherngell, AIT. Link: [6]
14.00-14.30 Break. Play the game “Utopia and metadata” (also available at any other time).
14.30-16.00 Discussion to compile a to-do list for work in use cases on the second day. Serves as a guiding and aligning process. Led by WP2, August Wierling/Valeria, HVL.

Notes from Day 1

Welcome and Introduction

  • Introduction by Valeria:
    • What the EERAdata project is about
    • The data revolution in the energy system is speeding up
    • Data on:
      • Weather patterns
    • More and more applications are becoming available
      • Digital twins: steering nuclear plants just from the screen
    • Buildings get energy passports
    • An era of digital reporting and compliance is arising
    • Importance of linking across different formats and topics
    • Time spent as a researcher: a large amount of time goes into sorting, formatting and searching for data, etc.
  • Valeria invites everyone again to write down their story connected to energy data on the EERAdata storyboard (see the link on the main page of the WIKI)
    • This will be valuable in moving forward with the ideas for the EERAdata project
  • Valeria's personal starting point regarding EERAdata:
    • Work towards enabling FAIR and open data for the low carbon energy research community
  • We hope to add what we learn in these 3 days to the EERA WIKI
    • Guidance platform for low carbon energy researchers
  • At the end, we hope to arrive at a set of metadata standards for energy data
  • What could be the utopia for an energy data researcher:
    • Imagine looking for citizen-led initiatives in PV
    • Search for keywords on a website -> and receive a list of datasets related to this topic
    • For all those datasets:
      • see the number of initiatives available
      • when were they last updated
      • links provided to access the data
      • are they certified as FAIR/O and who is the certifying institution
      • refine the search with specific filter options
      • information on the quality / data collection and processing methodology
      • option for simple data analysis operations
      • -> get an overview of the richness of the data
    • -> we are still very far away from this utopia
  • Goals of this Workshop:
    • listen to various best practices
    • day 2: discuss the provided examples
    • day 3: presentation of findings from day 2 discussions
    • day 3 afternoon: Data management plans (DMPs)

The EOSC Nordic: machine-actionable FAIR maturity evaluations & the FAIRification of data repositories

  • Notes incomplete, will be added later
    • What is the EOSC:
      • goal:
        • enable researchers to access data across domains and disciplines as easily as possible
        • locating the relevant data
      • vision:
        • all European data should be available to research
        • does not mean that all data is stored centrally, but that all databases are interconnected
  • WP 4 members include all Nordic countries
  • How can we improve datasets:
    • source code:
  • recommendations:
  • evaluations:
    • why do we evaluate repositories?
      • what is the level of FAIRness?
      • about 10 datasets per repository
      • FAIR score
      • Positive highlights:
      • FAIR score of all evaluated datasets:
        • majority at 0.10 out of 1
  • coming tasks:
  • question from Valeria: would it be possible to use their scoring system to assess one of the EERAdata datasets?
    • Answer:
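The notes do not record how the FAIR score is calculated. As a rough illustration only, the sketch below assumes that a dataset's score is the fraction of passed checks and that a repository's score is the mean over its sampled datasets. The check names are hypothetical and are not the tests used by EOSC-Nordic.

```python
from statistics import mean


def fair_score(check_results):
    """Fraction of passed checks for one dataset, between 0 and 1 (assumed definition)."""
    if not check_results:
        raise ValueError("at least one check result is required")
    passed = sum(1 for ok in check_results.values() if ok)
    return passed / len(check_results)


def repository_score(datasets):
    """Mean dataset score over the sampled datasets of one repository (assumed definition)."""
    return mean(fair_score(checks) for checks in datasets)


# Hypothetical check names and outcomes, for illustration only.
sample = [
    {"has_persistent_identifier": True, "metadata_machine_readable": False,
     "licence_stated": False, "uses_shared_vocabulary": False},
    {"has_persistent_identifier": False, "metadata_machine_readable": False,
     "licence_stated": False, "uses_shared_vocabulary": False},
]

print(fair_score(sample[0]), repository_score(sample))
```

With a definition of this kind, a score of 0.10 would mean that only about one check in ten passes.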

OpenAIRE: Open Access Infrastructure for Research in Europe

  • Introduction to OpenAIRE:
    • 2007 pilot publications
  • OpenAIRE bridges the two worlds where science is performed and where it is published
  • In practice:
    • services that monitor, assess and accelerate open science
    • facilitate research communities' adoption of open science
  • Who is OpenAIRE:
    • 50 partners all over Europe
    • experts in open science in every EU member state
    • regional coordinators
    • topical coordinators working on policies for open science
    • OpenAIRE is an infrastructure
    • last phase of OpenAIRE: citizen science
      • OpenAIRE ends in 12/2020
    • 4 regions: Northern, Eastern, Western and Southern Europe
    • large differences in needs from country to country
      • -> find the best solutions to develop open science for each country
    • connections to the open science community in North America, Japan, ...
  • OpenAIRE has published a set of guidelines in cooperation with an international network:
  • provided services:
    • providing policy advice
    • training and support
    • open science infrastructure
    • How to achieve FAIRness in sharing research data?
    • guidelines on how to provide open access to publications, not only data
  • How we support:
    • Helpdesk: questions, FAQ
  • OpenAIRE provides guides for:
    • researchers
    • content providers
    • funders
  • Webinars:
  • outreach:
    • 230 webinars
    • 22 NOADs are involved in EOSC WGs
  • Explore portal:
    • search interface for all content available through OpenAIRE
      • 40M publications: shows the most crucial metadata
    • EERAdata record:
      • for now, no information, as no data has been collected from repositories yet
  • OpenAIRE Connect project:
    • gateway for research communities
    • allows a research community to build a gateway that collects the different data relevant to that community
  • COVID-19 research community:
    • ca 3200 research datasets
    • 113 related projects
  • OpenAIRE Provide:
    • allows connection of repositories to OpenAIRE
  • Zenodo: repository developed by OpenAIRE that allows institutions to publish their research output
  • AGROS: machine-readable data repository
  • Comment from Andreas Jaunsen: today, research projects spend roughly 80% of their research time on data gathering/processing
    • -> open access aims to reduce this by only having to process the data once
  • Carsten Hoyer comments:
    • create a metadata catalogue
    • link the provenance information of data
      • -> track who has done what with the data
  • Question from August:
  • Rich metadata: can one search specifically for the research question of a paper? How could this be achieved?
    • OpenAIRE Explore: works like a normal search engine -> high "noise" (unwanted search results)
  • Incentives for researchers to provide rich metadata?
    • Use of a common ontology allows for more refined search results

Community-driven metadata and ontologies for Materials Science and their key role in artificial-intelligence tools

  • Key message:
  • FAIR: Findable, Accessible, Interoperable, Reusable
    • Findable:
      • Uniqueness of the data
    • Accessible:
      • URL, accessible via API
    • Interoperable:
    • Reusable: metadata should be as descriptive as possible
  • NOMAD-FAIRDI Workshop:
    • Shared metadata and data formats for big-data-driven materials science
  • Data object:
    • Metadata:
      • Unique identifier, Structure of the data, Method
      • should contain information on the full provenance of the data:
        • where does it come from? Another database? A calculation? etc.
      • Definition:
  • NOMAD Repository structure:
    • Nomad Repository
    • Conversion Layer
    • The archive
      • Three access points:
  • Computational materials science:
  • Ontology:
  • Questions:
    • What happens to the metadata when you physically go from one structure to a combined structure? How do you combine the metadata of the two initial structures?
    • From the EERAdata perspective: in the materials for low carbon energy use case, what could be a valuable contribution of EERAdata to the research community?
      • Ontologies are driven by the use cases.
      • Summarizing: exercise a fine-grained use case such as solar PV. Link different levels of metadata. Further develop metadata at the use case level. How to link across use cases?
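The metadata elements listed under "Data object" above (unique identifier, structure of the data, method, full provenance) can be pictured as a small record type. This is a minimal sketch, assuming Python dataclasses; the class names, field names and example values are hypothetical and do not reproduce the NOMAD metadata schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProvenanceStep:
    """One step in the history of a data object (hypothetical structure)."""
    source: str        # e.g. another database or a calculation
    description: str   # what was done at this step


@dataclass
class MetadataRecord:
    """Top-level metadata for one data object (hypothetical structure)."""
    identifier: str    # unique identifier
    structure: str     # description of the structure of the data
    method: str        # how the data were produced
    provenance: List[ProvenanceStep] = field(default_factory=list)

    def origin(self) -> Optional[str]:
        """Return the source of the earliest provenance step, if any is recorded."""
        return self.provenance[0].source if self.provenance else None


# Invented example values, for illustration only.
record = MetadataRecord(
    identifier="example-0001",
    structure="table of simulated band gaps",
    method="calculation",
    provenance=[
        ProvenanceStep("calculation", "initial simulation run"),
        ProvenanceStep("another database", "merged with reference values"),
    ],
)
print(record.origin())
```

Keeping provenance as an ordered list is one simple way to track who has done what with the data; an ontology-based description would go further than this flat record.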

Metadata practices from IRP Wind


Humanities and data: for a community-driven path towards FAIRness


RISIS - An e-Infrastructure for the STI-Policy research community


Discussion/ to-do list for work in use cases


Agenda Day 2

METADATA - DISCUSSIONS in use cases, parallel sessions. Work in use cases on databases and metadata, led by use case leaders. Suggested outline:

  • 10-12 Discuss and update the preliminary state of FAIR/O for the use case. Use the prepared draft of databases to check compliance with the FAIR principles (tools: WP3 questionnaire and others). Compare the assessment results for each database. Observe and discuss agreements and differences across the evaluation tools. Generate the overall picture of FAIR/O compliance for the use case to pin down the state of the art. Let’s see if we come to the same result as in our initial assessment for the application (traffic lights). Continuously make notes to report on results later. Select a responsible person. Objective: select 3-5 databases per use case. Discuss which databases to select: one to cover use-case-specific challenges, one with cross-use-case relevance, and one low-hanging fruit for which it would be relatively easy to improve the current FAIR/O status.
  • 13-15 Joint brainstorming to discuss the FAIR/O state of the metadata for the selected use cases. Evaluate: What is the current description of the metadata? How extensive is it? Is only administrative information provided, or a richer description of the context? What frameworks for metadata are used: taxonomy? thesaurus? ontology? How is the metadata information technically implemented: plain text file? XML? RDF? ... Identify use-case-specific issues with metadata: What are the gaps? What is perceived as a hard nut to crack? Pay special attention to the metadata of the databases and fill out the table provided by WP2. Continuously make notes to report results the next day! Select a responsible person!
  • 15-17 Joint recording of lessons learned. Create and/or update the WIKI for the use case with literature, gaps, best practices, FAIR/O discussion, metadata discussion, suggested next steps, etc. Get your head around what to report the next day! Plan for 20 min. See the links to WIKI page templates below (Notes from Day 2).

Note: This schedule is a suggestion. Adjust and organize breaks as needed.

Notes from Day 2

  • Use case 1 - Summary: UC1, Detailed notes: WS1UC1
  • Use case 2 - Summary: UC2, Detailed notes: WS1UC2
  • Use case 3 - Summary: UC3, Detailed notes: WS1UC3
  • Use case 4 - Summary: UC4, Detailed notes: WS1UC4

Agenda Day 3

Reporting experience from use case applications. Prepare the next workshop on workflows and metadata. Introduction and discussion of data management plans.

Time slot Topic
10-12.30 Discussion led by WP2. Report from use case experiences (by use case leaders). Wrap up.
12.30-13.30 Lunch break
13.30-15.00 Data Management Plans
  • Presentation “Introduction to DMP and best practices” by Trond Kvamme, NSD
  • Presentation on machine-actionable DMPs by Tomasz Miksa, TU Vienna
  • Discussion of EERAdata DMP draft (August, HVL)
  • Short break of 15 min
15.15-16.00 Wrap up of workshop with feedback from invited experts.

Notes from Day 3
