WS1UC4

From EERAdata Wiki
Revision as of 07:45, 4 June 2020 by Manfred (talk | contribs) (Notes from the morning session)

This page provides space for notes from workshop discussions of use case 4 during Day 2 of WS1.

Interactive elements

The discussions can also benefit from considering the EERAdata storyboard and/or slido comments. See WS1#Interactive_elements.

Notes from the morning session

The morning session is dedicated to database discussions with the aim of selecting 3-5 databases from the list of databases below for further examination. The first table of the WIKI page is to be filled out by noon.

  • The workshop started with a short introduction round so that participants could get to know each other, especially each institution's day-to-day approach to data use.
  • Next, the basis for further work was presented:
  1. screening the databases --> names, short descriptions, rationale for the preliminary choice of each database,
  2. checking the databases already proposed and adding new ones, if needed,
  3. discussion and evaluation of the databases, to choose the most suitable ones for further work.


Segmenting databases into 2 categories:

  1. policy relevant databases, e.g. COMETS, EU Merci, PETA4,
  2. policy databases, e.g. IEA Policy database, EUR-Lex, OECD, RES-Legal, MURE.

From both categories, the most suitable databases may be chosen. This will give us the full spectrum of how energy policies may use both types of databases.

Chosen databases - after morning discussion:

  1. COMETS,
  2. EUR-Lex,
  3. IEA Policy database,
  4. JRC.

The chosen databases will be evaluated during the afternoon session using a tool developed by Mark Wilkinson. The metadata aspects of the chosen databases will also be assessed during that session.

List of participants

Draft list of databases

The following databases have been identified as containing data related to low carbon energy and energy efficiency policy:

  1. Copernicus
  2. DataCite
  3. ECO data set
  4. EnergyData
  5. EREK
  6. EU Merci
  7. EUR-Lex
  8. European Buildings Stock Observatory
  9. IEA Policies database
  10. IRENA, REmap
  11. IRENA, Resource
  12. JRC-database hub
  13. JRC-IDEES: Integrated Database of the European Energy Sector: Methodological note
  14. JRC: Monitoring R&I in Low-Carbon Energy Technologies (https://ec.europa.eu/jrc/en/publication/eur-scientific-and-technical-research-reports/monitoring-ri-low-carbon-energy-technologies)
  15. MAGIC
  16. NREL data
  17. ODIN: Online Data & Information Network for Energy
  18. OECD data
  19. OEIL
  20. OPENEI
  21. PETA4
  22. PROQuest
  23. REEEP
  24. Re3data.org
  25. RES LEGAL
  26. SETIS research and innovation data
  27. ZENODO
  28. H2020 project COMETS - EU-wide inventory of collective action initiatives in the low carbon energy transition
  29. [1] - EU taxonomy for green innovation companies
  30. MURE (Mesures d'Utilisation Rationnelle de l'Energie) database [2].

Discussion on the choice of databases:

  • From GIG, apart from the IEA Policy database, EUR-Lex, DataCite, EU Merci, OECD, and probably also the JRC database will be proposed.
  • @ IUE and ENEA: please each suggest 2 other databases and prepare a description and an explanation of why we should choose them for further evaluation. Please put the material in the Wiki.
  • @ all: for the databases that you suggested, please complete the questionnaire from AIT in OO [3] and put the answers into the folder in OO [4] by Friday 29.05.

COMETS database: EU-wide inventory of collective action initiatives in the energy transition

Description of database: The inventory is a multi-country database of collective action initiatives (e.g., energy cooperatives, eco-villages, community energy projects, and other citizen-led energy projects). For each initiative, the country, the date of foundation/cancellation, the evolution of members and renewable energy units, installation capacities, fields of engagement, legal form, website, address, source, etc. are collected. The entries are sourced from a variety of national, regional, and European registers and websites.

Rationale for choosing this database:

  • The database is of policy relevance. Its data interpretation is highly dependent on changes in the legal and policy frameworks across Europe.
  • We own the database and therefore find it easier to carry out FAIRification and opening activities.
  • The database is being created in a European project running in parallel, and we would not like it to end up in the graveyard of dead EU project databases.
  • We aim at opening the database to the low carbon research community within the next two years.
  • We are interested in exploring the options for opening while ensuring that our work in building the database is appropriately acknowledged. This is relevant for later EERAdata discussion. Here, we have already started discussing with e3s and tJP of EERAdata.
  • We can use synergies with another parallel EU project and have the possibility to assign rich metadata to every single entry, by also using the expertise of researchers from the COMETS project.
  • The database itself has a relatively simple structure; however, the inventory contains heterogeneous and dynamic data. Thus, it helps us to investigate how to approach FAIRification for such data. This is a particular challenge for UC4.
  • Sources of the COMETS database come in a variety of European languages, all brought together in an English database. Thus, it helps us to investigate the relevance of languages for FAIRification and opening activities. This is also a particular challenge for UC4.
  • We aim to upload this database to EOSC as an example for EERAdata. We hope to create a best practice example for a FAIR and open database. Current status: F - non-existing; A - non-existing; I - partly, as one can, e.g., go to the currently published papers to access the original sources, but it is not machine-actionable; R - no, because it is currently not open.

Reference: Wierling, A.; Schwanitz, V.J.; Zeiß, J.P.; Bout, C.; Candelise, C.; Gilcrease, W.; Gregg, J.S. Statistical Evidence on the Role of Energy Cooperatives for the Energy Transition in European Countries. Sustainability.

IEA Policy database

Description of the database: A worldwide database of past, future and planned policies and measures aimed at reducing greenhouse gas emissions, improving energy efficiency and supporting the development and deployment of renewables and other clean energy technologies. It contains data from the IEA/IRENA Renewable Energy Policies and Measures Database, the IEA Energy Efficiency Database, the Addressing Climate Change database, and the Building Energy Efficiency Policies (BEEP) database, along with information on CCUS and methane abatement policies. Data have been collected from governments, partner organizations and IEA analysis since 1999.

Rationale for choosing this database:

  • A popular, recognized, and world-wide database hosted by a reputable institution.
  • Since we are not able to analyze all the databases that we would like to, selecting the IEA Policy database would indirectly allow us to take several other databases into account, because it contains data from the IEA/IRENA Renewable Energy Policies and Measures Database, the IEA Energy Efficiency Database, the Addressing Climate Change database, and the Building Energy Efficiency Policies (BEEP) database.
  • Cross-use case relevance, as it contains policies and measures also for buildings efficiency (UC1) and for power transmission and distribution networks (UC2).

EU Taxonomy for green innovation companies

Description of the database: The EU Taxonomy is a tool to help investors, companies, issuers, and project promoters navigate the transition to a low-carbon, resilient and resource-efficient economy. The Taxonomy sets performance thresholds (referred to as ‘technical screening criteria’) for economic activities which:

  • make a substantive contribution to one of six environmental objectives (Figure 1);
  • do no significant harm (DNSH) to the other five, where relevant;
  • meet minimum safeguards (e.g., OECD Guidelines on Multinational Enterprises and the UN Guiding Principles on Business and Human Rights).

The performance thresholds will help companies, project promoters and issuers access green financing to improve their environmental performance, as well as helping to identify which activities are already environmentally friendly. In doing so, it will help to grow low-carbon sectors and decarbonise high-carbon ones. The EU Taxonomy is one of the most significant developments in sustainable finance and will have wide-ranging implications for investors and issuers working in the EU, and beyond.

Rationale for choosing this database:

  • It is low-hanging fruit because the taxonomy is already in place.
  • We could use this taxonomy to see how it aligns with our other assessments (and with other use cases).
  • The taxonomy will be enacted in a year from now, so we can provide timely feedback.

Reference: [5]

EUR-Lex

Description of the database: EUR-Lex is an online database that provides the official and most comprehensive access to EU law and legal documents such as treaties, legal acts from EU institutions, preparatory documents related to EU legislation, EU case-law, international agreements, EFTA documents, and references to national case-law related to EU law. EUR-Lex provides access to all editions of the Official Journal of the European Union (OJ) since the first of December 1952.

Rationale for choosing this database:

  • It is available in all of the EU’s 24 official languages.
  • It is updated daily, which helps ensure that the documents are current and valid.
  • Each document in EUR-Lex is supported by detailed information such as: relations with other legal documents, case-law interpretations, dates of adoption or entry into force etc.
  • It is an open and highly recognizable European Union database directly related to policy.
  • Most documents in EUR-Lex, regardless of their language, receive a unique identifier, the CELEX number, so all documents have a defined structure and each type of document corresponds to a so-called “descriptor”.
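
The structured identifier mentioned in the last point can be illustrated with a small sketch. It assumes only the basic CELEX form (sector, year, document type descriptor, sequential number); real CELEX numbers have further variants, such as corrigenda and consolidated versions, which are not handled here. The helper name parse_celex and the short descriptor table are illustrative and not part of EUR-Lex.

```python
import re

# Simplified pattern for the basic form only (assumption):
# sector (1 character), year (4 digits),
# document type descriptor (1-2 letters), sequential number (4 digits).
CELEX_PATTERN = re.compile(
    r"^(?P<sector>[0-9CE])(?P<year>\d{4})(?P<doc_type>[A-Z]{1,2})(?P<number>\d{4})$"
)

# Illustrative subset of descriptors used in sector 3 (legal acts).
SECTOR3_TYPES = {"L": "Directive", "R": "Regulation", "D": "Decision"}


def parse_celex(celex: str) -> dict:
    """Split a basic CELEX number into its components."""
    match = CELEX_PATTERN.match(celex.strip())
    if match is None:
        raise ValueError(f"Not a basic CELEX number: {celex!r}")
    parts = match.groupdict()
    parts["year"] = int(parts["year"])
    parts["number"] = int(parts["number"])
    return parts


# 32012L0027 is the CELEX number of Directive 2012/27/EU on energy efficiency.
parsed = parse_celex("32012L0027")
print(parsed, SECTOR3_TYPES.get(parsed["doc_type"]))
```

Because the identifier is language-independent, the same parsed components can be used to refer to any of the language versions of a document.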

EU MERCI

Description of the database: The database contains data on the implementation of energy efficiency measures in the manufacturing industry. For nearly 3000 companies from 4 European countries (Austria, Italy, Poland, United Kingdom), the following information has been collected: size of the company, year of implementation of the measure, type of support received (white certificates), statistical classification of economic activities according to NACE, applied processes and ways to save energy, and total final energy savings achieved by the implemented measures. The database also contains data on low-carbon policies, examples of business action to reduce emissions, etc. The database includes data for the 2005-2017 period. Besides the database, a library, sector reports and static analyses of different countries were also made available.

Rationale for choosing this database:

  • The database contains good practices that can be used by other actors willing to implement energy efficiency measures.
  • Disadvantages of the database: It includes data from the 2005-2017 period, with no updates (no newer examples).

OECD database

Description of the database: The OECD database makes available data related to agriculture, development, economy, education, energy, environment, finance, government, health, innovation and technology, jobs and society. In each of these areas the database contains, inter alia, data on legal issues. In the area of energy, data can be assigned thematically to: primary energy supply, crude oil production, electricity generation, renewable energy, nuclear power plants, crude oil import prices. The data come from over 130 countries around the world. The database allows sharing data in the form of indicators, maps and tables. The data have been collected and made available since 1960.

Rationale for choosing this database:

  • It is a popular, well recognised worldwide database, maintained by a reputable institution.
  • Since we are not able to analyze all the databases that we would like to, selecting the OECD database would indirectly allow us to take several other databases into account, because it contains data from the IEA database - International Energy Agency (https://www.iea.org/) and the Nuclear Energy Agency (http://www.oecd-nea.org/).

ISTAT energy data

Description of the database: I.Stat is the warehouse of statistics currently produced by the Italian National Institute of Statistics. Statistics are searchable by theme and by keyword. Data are presented in multidimensional tables, which users can export in xls and csv formats. Moreover, custom tables can be obtained by acting on variables, reference periods and the arrangement of heads and sides. Through a web service that allows direct machine-to-machine querying, organizations as well as private citizens can formulate specific data queries and download the results. The service is accessible at the following address: https://www.istat.it/it/metodi-e-strumenti/web-service-sdmx

Rationale for choosing this database:

  • It provides a complete overview of energy consumption in Italy across all sectors of society.

SETIS Innovation and Research data

Description of the database: The database covers the level of investment in R&I, both private (expenditure by businesses and industry) and public (Member States' national programmes and instruments), and trends in patents, for the Integrated SET Plan Actions.

Rationale for choosing this database: The Integrated SET Plan needs to be underpinned by an effective monitoring and reporting scheme that supports the development and implementation of the European R&I Agenda. SETIS plays a central role in the successful implementation of the Integrated SET Plan 10 key actions. In particular, the enhanced SETIS follows the objectives below:

  • To monitor R&I activities, investments and technology progress in Europe;
  • To contribute to the identification of gaps in the implementation of the actions;
  • To report on the overall progress towards the common goals set out in the integrated SET-Plan;
  • To contribute to assessing the impact of the SET-Plan on European competitiveness;
  • To make recommendations for actions that could further increase the effectiveness of the European R&I Strategy.

ENERGISE Online Database

Description of the database: The database is an output of the H2020 ENERGISE project. The project aims at testing household and community-level initiatives to reduce energy consumption through energy cultures. The database contains data on 1067 energy initiatives from 30 European countries.

Rationale for choosing this database:

  • The database contains recent information on a considerable number of (1067) initiatives pertaining to household and community-level energy policies. Thus, its representativeness for European countries is high.
  • Since the focus of the EERAdata project is mainly on the European Union countries, this database is a good fit in terms of scope.
  • The database hosts information that would be gathered from national sources of each country, hence may be considered as a repository for national databases.

MURE Online Database

Description of the database:   MURE (Mesures d'Utilisation Rationnelle de l'Energie) is a comprehensive database that lists energy efficiency policies and measures regarding the Member States of the European Union. It allows for the analysis of policies based on topics, policy interaction, policy scoring, impact evaluation, as well as successful policies. The information is accessible by query in the database. The contributors are French Environment and Energy Management Agency, European Energy Network, and the H2020 program of the European Union.

Relates to the Odyssee database on energy-efficiency indicators and energy consumption.

Reference: Odyssee-MURE project

Rationale for choosing this database:

  • A comprehensive database including energy efficiency policies and measures in the European Union member states.
  • Since the focus of the EERAdata project is mainly on the European Union countries, this database is a good fit in terms of scope.
  • The database hosts information that would be gathered from national sources of each country, hence may be considered as a repository for national databases.
  • The database pertains to both Use Case 4 on Energy Policies and Use Case 1 on Buildings Efficiency.

JRC-database hub: Joint Research Centre Data Catalogue

Description of database: The JRC Science Hub aims to gradually integrate and aggregate all of the European Commission's science related activities, tools, laboratories, facilities, databases and networks. The inventory is a wide-ranging compilation of databases, software and modelling tools. It is published on the basis of open data principles with clear descriptions for each entry. The tools and databases are categorised by name and acronym, but can be filtered by research area, keyword and the JRC institute responsible for the coordination of the particular entry. It contains, inter alia, databases related to energy, transport and climate.

Rationale for choosing this database:

  • It is an open and highly recognizable European Union database directly related to various aspects, including energy policy.
  • Its data interpretation is highly dependent on changes in the legal and policy frameworks across the European Union.
  • The database enhances the transparency and openness of the JRC and further enables the open access policy of scientific research data.
  • The information and links provided in the metadata are maintained in distributed and heterogeneous information systems, although the datasets are maintained and the links and information are updated.
  • Sources of the database come in a variety of European languages, all brought together in an English database. Thus, it helps to investigate the relevance for FAIRification and opening activities.
  • The database is cross-use case relevant, as it contains policies and measures also for other use cases.

DataCite - an international not-for-profit organization aiming to improve data citation

Description of database: The main aim of DataCite is to provide the means to create, find, cite, connect, and use research. The primary service provided by DataCite is the registration of DOIs. In addition, DataCite is constantly developing additional services related to DOIs and their metadata, as follows:

  • DataCite Search (https://search.datacite.org/) - allows searching the database of DOI metadata registered in DataCite.
  • DataCite Profiles (https://profiles.datacite.org/) - allows for automatic data transfer between the DataCite and ORCID systems, facilitating the update of personal profiles of researchers in ORCID with information about objects for which DOIs have been registered in DataCite.
  • DataCite Statistics (https://stats.datacite.org/) - a service presenting statistical data on the number of registered DOIs, their metadata records and the use of this information.
  • DataCite OAI-PMH Provider (https://oai.datacite.org/) - an access interface to metadata collected in DataCite using the OAI-PMH protocol.
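
As a rough illustration of what machine access via OAI-PMH looks like, the sketch below only builds a ListRecords request URL and makes no network call. The verb and the metadataPrefix, from and set parameters come from the OAI-PMH protocol itself; the "/oai" path appended to the address above and the helper name build_list_records_url are assumptions that should be checked against the provider's documentation.

```python
from urllib.parse import urlencode

# Assumption: endpoint path of the OAI-PMH provider; verify before use.
BASE_URL = "https://oai.datacite.org/oai"


def build_list_records_url(metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL (no network access)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        params["from"] = from_date  # ISO 8601 date, e.g. "2020-06-01"
    if set_spec is not None:
        params["set"] = set_spec  # restrict harvesting to one set
    return f"{BASE_URL}?{urlencode(params)}"


print(build_list_records_url(from_date="2020-06-01"))
```

The oai_dc prefix requests unqualified Dublin Core, the one metadata format every OAI-PMH repository has to support.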

In addition, DataCite is involved in international projects and initiatives related to sharing and quoting research data, such as:

  • The worldwide Registry of Research Data Repositories - Re3Data (https://www.re3data.org/).
  • The Initiative for Open Citations (I4OC), a collaboration between scholarly publishers, researchers, and other interested parties to promote the unrestricted availability of scholarly citation data and to make these data available.

Rationale for choosing this database:

  • The inventory contains heterogeneous and dynamic data. Thus, it helps us to investigate how to approach FAIRification for such data.
  • DataCite provides an integrated search interface that makes it possible to search and filter the gathered metadata.
  • Sources of the database come in a variety of European languages. Thus, it helps to investigate the relevance for FAIRification and opening activities.
  • The database may be cross-use case relevant, as it contains DOIs and citations relevant also for other use cases.

Notes from the afternoon session

The afternoon session is dedicated to discussing metadata for the selected databases. The aim of the afternoon session is to fill out table 2 of the wiki page for the use case and to decide what to report back from the use case to the plenary session the next day (again, see the WIKI template for a suggested structure). Thus, at the end of the day, the WIKI page for the use case is complete.

Assessment of metadata for databases

  1. IEA Policy database
  2. COMETS
  3. JRC database hub
  4. EUR-Lex

IEA Policy database

  • Preliminary assessment of level of implementation of FAIR/O principles: F4 - yes; A1 - yes, A2 - yes (only depends on EU policy); I1 - yes; I2 - fulfilled, I3 - fulfilled; R1.1 - yes; R1.2 - no; R1.3 - not fulfilled, not even the DC standards.
  • Metadata seems to be scattered.
  • Discussion on what 100% fulfilment means, using an example: in an ideal database, all country names would be linked to an official registry --> you could reach all the information. It would be 100% FAIR. Are there any databases like that? Example: country information, such as the ISO code for the country, should be easily and automatically available. The IEA Policy database is not fully like that.
  • Difficulties even in defining "policy" = everything (every official document) which is published on an official governmental page.
  • Testing on the example of Poland and the National Energy Efficiency Action Plan for Poland: the link leads to the official governmental page, not directly to the official document or to the official country law. Where there is information about the date of update, it should also be linked to the local page (to ensure that it is possible to check at the source whether the policy has already been updated).
  • Problem case: the metadata doesn't provide the possibility of reaching the sources (the documents are there, but you don't have an opportunity to get to the official, legal document).
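
The two checks discussed above (country names linked to an official code, and a direct link to the source document) could be sketched roughly as follows. The field names document_url and source_page_url, the function check_policy_entry and the placeholder URL are hypothetical; only the ISO 3166-1 alpha-2 codes are real, and the table is a small subset.

```python
# Small illustrative subset of ISO 3166-1 alpha-2 country codes.
ISO_ALPHA2 = {"Poland": "PL", "Germany": "DE", "Belgium": "BE", "Denmark": "DK"}

# Hypothetical field names for the links an entry should carry.
REQUIRED_LINKS = ("document_url", "source_page_url")


def check_policy_entry(entry: dict) -> list:
    """Return a list of problems that prevent reaching the source."""
    problems = []
    if entry.get("country") not in ISO_ALPHA2:
        problems.append("country not linked to an ISO 3166-1 code")
    for field_name in REQUIRED_LINKS:
        if not entry.get(field_name):
            problems.append(f"missing {field_name}")
    return problems


entry = {
    "title": "National Energy Efficiency Action Plan for Poland",
    "country": "Poland",
    "source_page_url": "https://example.org/ministry",  # placeholder URL
    "document_url": None,  # no direct link to the official document
}
print(check_policy_entry(entry))
```

For the entry shown, the only reported problem is the missing direct link to the document, which mirrors the situation found in the test on the Polish action plan.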

COMETS

  • Information on collective action initiatives for the energy transition for all EU countries.
  • 3700 initiatives in selected European countries: Germany, Belgium, Czech Republic, Switzerland, Denmark.
  • Based on internet desk research, business registers, national electricity production statistics, individual websites etc.
  • Energy production units / electricity facilitation units: more than 6000 included.
  • Within the project, a survey about development and values (export, import) is being conducted in six countries (DE, ES, BE, PL, EE, NL). Its data will also be included in the database.
  • As it is still under development, FAIRness is 0% for now.
  • From the COMETS project perspective, an important step for FAIRification is categorization: establishing what specific activities are currently being implemented in different countries and how, and organizing them in a taxonomy.
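
To compare assessments like the two above, one simple option is to record each checked principle as fulfilled or not and rank databases by the share fulfilled. This is only a sketch: the IEA values repeat the preliminary assessment from the notes (principles F1-F3 were not recorded), while modelling the COMETS "0%" as "no assessed principle fulfilled" is an assumption made for illustration.

```python
# Preliminary assessment of the IEA Policy database as recorded in the notes.
iea = {
    "F4": True, "A1": True, "A2": True,
    "I1": True, "I2": True, "I3": True,
    "R1.1": True, "R1.2": False, "R1.3": False,
}

# Assumption: "FAIRness is 0%" is modelled as no assessed principle fulfilled.
comets = {principle: False for principle in iea}


def share_fulfilled(assessment: dict) -> float:
    """Fraction of the assessed principles that are marked as fulfilled."""
    return sum(assessment.values()) / len(assessment)


assessments = {"IEA Policy database": iea, "COMETS": comets}

# Rank databases from best to worst match with the assessed principles.
ranking = sorted(
    assessments,
    key=lambda name: share_fulfilled(assessments[name]),
    reverse=True,
)
for name in ranking:
    print(f"{name}: {share_fulfilled(assessments[name]):.0%}")
```

A single percentage hides which principles are missing, so the per-principle table should be kept alongside any such ranking.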

General conclusions

  • Discussion on the Wilkinson FAIR principles --> are there any databases that are 100% in accordance with these principles? 100% is a utopia. We can only establish a hierarchy of which databases match these criteria better or worse.
  • In the Mons system it is clearer that you have to achieve a specific level to be able to go further.
  • Two possible approaches: (1) start with the absolute lowest-level datasets and try to FAIRify those, and later move "up"; (2) on the other hand, FAIRification of more complex databases leads to results / conclusions that also apply to less complicated databases.
  • Importance of data formats. The format of the data to be entered in a specific entry must be strictly defined (example: the date needs a clear definition of what to include: year, month, specific day?). The same applies to longitude and latitude. If this is not done in advance, it later takes much work to manually change the data and put them into the right format. On the other hand, a very specific description may exclude some data that are available but in different standards, for example EU versus American standards.
  • Every entry has its provenance, with information on how the entry has been changed, when, etc. Basically, all entries should have their own metadata.
  • Two dimensions of policies:
  1. policy domain (topic of the policy)
  2. policy perspective / the level of government (for whom it is developed = local, regional, national, international level etc.).
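
The points on strictly defined formats and per-entry provenance can be sketched together. The Entry class, its fields and the parse_date helper are hypothetical and not taken from any of the databases discussed; the sketch assumes dates are fixed to the ISO 8601 form YYYY-MM-DD and coordinates to decimal degrees.

```python
import re
from dataclasses import dataclass, field
from datetime import date, datetime, timezone


def parse_date(text: str) -> date:
    """Accept only calendar dates written as YYYY-MM-DD."""
    match = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})", text)
    if match is None:
        raise ValueError(f"date must be YYYY-MM-DD, got {text!r}")
    year, month, day = (int(part) for part in match.groups())
    return date(year, month, day)  # raises ValueError for impossible dates


@dataclass
class Entry:
    """Hypothetical database entry with strictly defined formats."""
    name: str
    founded: date      # full calendar date (year, month, day)
    latitude: float    # decimal degrees, -90 to 90
    longitude: float   # decimal degrees, -180 to 180
    provenance: list = field(default_factory=list)

    def __post_init__(self):
        self._validate()

    def _validate(self):
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError("latitude out of range")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError("longitude out of range")

    def update(self, attribute: str, value, changed_by: str):
        """Change an attribute and record what changed, by whom, and when."""
        old = getattr(self, attribute)
        setattr(self, attribute, value)
        try:
            self._validate()
        except ValueError:
            setattr(self, attribute, old)  # roll back the invalid change
            raise
        self.provenance.append({
            "attribute": attribute,
            "old": old,
            "new": value,
            "changed_by": changed_by,
            "changed_at": datetime.now(timezone.utc).isoformat(),
        })


entry = Entry("Example cooperative", parse_date("2020-06-04"), 52.23, 21.01)
entry.update("latitude", 52.24, changed_by="editor")
print(entry.provenance)
```

A date such as 04/06/2020 is rejected outright, which avoids the day/month ambiguity between European and American conventions, at the cost of excluding data that has not been converted first.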

Directions in building taxonomy

  • Example to be watched: EU taxonomy for green innovation companies
  • The UC4 taxonomy needs to match the higher level = overall level of the projects. To make sure that the complete taxonomy is

Proposed logical aspects / steps:

  1. Starting with agreement on keywords + the glossary, to ensure common understanding.
  2. Taking into consideration two dimensions of policies: (1) policy domain and (2) policy level of government.
  3. Coordinating with other use cases: UC4 affects other use cases, so our keywords should not only match, but should sometimes even be taken from other use cases.
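
The steps above can be sketched as a small data model: an agreed glossary of keywords, and a tag that combines the two policy dimensions. The glossary terms, their use case assignments and the class names are placeholders chosen for illustration, not agreed project vocabulary.

```python
from dataclasses import dataclass
from enum import Enum


class GovernmentLevel(Enum):
    """Dimension 2: the level of government a policy is developed for."""
    LOCAL = "local"
    REGIONAL = "regional"
    NATIONAL = "national"
    INTERNATIONAL = "international"


# Hypothetical glossary: agreed keyword -> use case that owns the keyword.
GLOSSARY = {
    "energy efficiency": "UC4",
    "buildings efficiency": "UC1",  # keyword taken over from another use case
}


@dataclass(frozen=True)
class PolicyTag:
    domain: str             # dimension 1: topic of the policy
    level: GovernmentLevel  # dimension 2: level of government

    def __post_init__(self):
        # Only keywords agreed in the glossary may be used as a domain.
        if self.domain not in GLOSSARY:
            raise ValueError(f"keyword not in glossary: {self.domain!r}")


tag = PolicyTag("energy efficiency", GovernmentLevel.NATIONAL)
print(tag.domain, tag.level.value, GLOSSARY[tag.domain])
```

Rejecting keywords that are not in the glossary is one way to enforce the common understanding that step 1 asks for.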

What to report back to the plenary on Day 3?

Databases selected (names and short reasoning)

  1. IEA Policy database
  2. COMETS
  3. JRC database hub
  4. EUR-Lex

Main insights from discussions

  • Two categories of databases:
  1. policy relevant databases, e.g. COMETS, EU Merci, PETA4
  2. policy databases, e.g. IEA Policy database, EUR-Lex, OECD, RES-Legal, MURE
  • Two dimensions of policies:
  1. policy domain (topic of the policy)
  2. policy perspective / the level of government (for whom it is developed = local, regional, national, international level etc.).
  • For more - see the notes above.

Suggested next steps

  • Developing low-level metadata concepts (descriptive, structural, administrative metadata) and the identification of gaps & needs for their implementation.
  • Integration of existing standards, development of new approaches and vocabularies & taxonomies (e.g., considering ways to find data and research results through keywords, keywords used by peer-reviewed journals and mental metadata models of domain experts).
  • Proposed logical aspects / steps of building taxonomy:
  1. Starting with agreement on keywords + the glossary, to ensure common understanding.
  2. Taking into consideration two dimensions of policies: (1) policy domain and (2) policy level of government.
  3. Coordinating with other use cases: UC4 affects other use cases, so our keywords should not only match, but should sometimes even be taken from other use cases.
  • Conclusions for WP2:
  1. Methodology aspects - COMETS,
  2. Extending existing metadata - JP Wind,
  3. Providing recommendations for machine accessibility - IEA Policy database.

Other issues