Talk:WS1UC2
As part of the preparation tasks, each project partner selected and characterized 2 databases thematically related to the area of power transmission and distribution networks. As a result, 8 databases were selected in the preliminary work and used in further analysis during the workshop (databases verified: OPSD, EIA, OpenEI, PSE, NREL, ENTSO-E, SMARD, World Bank data). During the workshop the databases were analyzed based on the methodology described by the Swiss National Science Foundation (Explanation of the FAIR data principles; Wilkinson et al. (2016), The FAIR Guiding Principles for scientific data management and stewardship, Scientific Data 3, doi:10.1038/sdata.2016.18).
http://www.snf.ch/SiteCollectionDocuments/FAIR_principles_translation_SNSF_logo.pdf
As part of the assessment, each database was characterized, its percentage of compliance with the FAIR/O criteria was indicated, and the elements requiring improvement were identified.
A summary of the analysis performed for each database is provided below.
- DATABASE 1: ENTSO-E
- Short description: Central collection and publication of electricity generation, transportation and consumption data and information for the pan-European market. Main data categories are: Load, generation, transmission, balancing, outages, congestion management, system operations. It covers data from the starting year 2014 in hourly resolution.
- Current State of FAIR/O principles: F1 50% - F2 50% - F3 20% - F4 100% / A1.1 100% - A1.2 100% - A2 50% / I1 100% - I2 80% - I3 100% / R1.1 50% - R1.2 50% - R1.3 90%.
- Target of FAIR/O: F1: Assign each data set a globally unique and persistent identifier (e.g. DOI).
- Indication of the elements requiring improvement:
- Findable: Improve F2: data should be better described with rich metadata. Improve F3: downloaded data should explicitly include the identifier of the data (e.g. source).
- Accessibility: Improve A2: it is unclear whether metadata remain accessible when the data are no longer available; no information is given.
- Interoperability: Improve I3: data should include qualified references to other data (which TSO or national authority is the main reference?). Also improve I2: there is a long list of databases under the sitemap; these could also be linked to explanations.
- Reusability: Improve R1.1 and R1.2: the database should contain a clearer and more accessible data usage license and associated provenance. Regarding the license, the only information is in a PDF; under "terms and conditions" it states that the data are not sub-licensable or transferable. The licensing of data from different origins needs to be clarified.
- DATABASE 2: SMARD (Strommarktdaten)
- Short description: Electricity market information platform of the German Federal Network Agency, Bundesnetzagentur (BNetzA). It presents the most important electricity market data for Germany, such as electricity generation, consumption, import and export, market balancing and power plants, for different periods and resolutions (e.g. power plant data for 2015-2025 in hourly resolution, generation in 15-minute resolution).
- Current State of FAIR/O principles: F1 100% - F2 100% - F3 100% - F4 100% / A1.1 100% - A1.2 100% - A2 100% / I1 100% - I2 80% - I3 100% / R1.1 100% - R1.2 100% - R1.3 90%.
- Target of FAIR/O:
- Indication of the elements requiring improvement: Improve I2 by using vocabularies that follow FAIR principles.
- DATABASE 3: NREL Transforming Energy
- Short description: NREL develops data and tools for the analysis of grid technologies and strategies, including renewable resource data sets and models of the electric power system.
- Current State of FAIR/O principles: F1 100% - F2 100% - F3 80% - F4 100% / A1.1 100% - A1.2 100% - A2 100% / I1 80% - I2 80% - I3 100% / R1.1 100% - R1.2 100% - R1.3 80%.
- Target of FAIR/O: A1.2: the protocol allows for an authentication and authorization procedure.
- Indication of the elements requiring improvement: ............
- DATABASE 4: PSE
- Short description: Energy sector database for Poland. The scope of information presented in the database includes: Polish power system operation, balancing market operation, and a reports map.
- Current State of FAIR/O principles: F1 50% - F2 50% - F3 20% - F4 50% / A1.1 100% - A1.2 20% - A2 50% / I1 100% - I2 80% - I3 80% / R1.1 20% - R1.2 50% - R1.3 100%.
- Target of FAIR/O: Improve F2: data are described with rich metadata.
- Indication of the elements requiring improvement:
- Findable: Improve F1, F2, F3, e.g. the data need a unique identifier and a more searchable form.
- Accessibility: Improve A1.2 and A2, e.g. there is no information on whether metadata remain accessible when the data are no longer available. Also, there is no authentication and authorization procedure, where necessary.
- Interoperability: Slightly improve I2 and I3, e.g. include qualified references to other (meta)data.
- Reusability: Improve R1.1 and R1.2, e.g. the usage license is not clear, and it is not clear how (meta)data are associated with detailed provenance.
- DATABASE 5: OpenEI
- Short description: OpenEI is developed and maintained by the National Renewable Energy Laboratory with funding and support from the U.S. Department of Energy since 2017. The platform is wiki based. Users can view, edit, and add data – and download data for free.
- Current State of FAIR/O principles: F1 50% - F2 70% - F3 70% - F4 50% / A1.1 100% - A1.2 20% - A2 50% / I1 50% - I2 50% - I3 100% / R1.1 100% - R1.2 50% - R1.3 100%.
- Target of FAIR/O: ........
- Indication of the elements requiring improvement: ..........
- DATABASE 6: EIA
- Short description: The U.S. Energy Information Administration (EIA) collects, analyzes, and disseminates independent and impartial energy information to promote sound policymaking, efficient markets, and public understanding of energy and its interaction with the economy and the environment.
- Current State of FAIR/O principles: F1 50% - F2 50% - F3 30% - F4 50% / A1.1 100% - A1.2 20% - A2 50% / I1 100% - I2 100% - I3 50% / R1.1 50% - R1.2 50% - R1.3 100%.
- Target of FAIR/O:
- Indication of the elements requiring improvement:
- Findable: Improve F1, F2, F4, e.g. the data need a unique identifier.
- Accessibility: Improve A2, e.g. there is no information on whether metadata remain accessible when the data are no longer available. Also, there is no authentication and authorization procedure, where necessary.
- Interoperability: Improve I2, e.g. include qualified references to other (meta)data.
- Reusability: Improve R1.2: it is not clear how (meta)data are associated with detailed provenance. Improve R1.1: clarify the licensing of data from different origins.
- DATABASE 7: OPSD
- Short description: The database is an outcome of the Open Energy Modelling Initiative that dates back to 2014. The aim is to construct an open platform for energy systems modelling data.
- Current State of FAIR/O principles: F1 100% - F2 100% - F3 100% - F4 100% / A1.1 100% - A1.2 20% - A2 100% / I1 100% - I2 100% - I3 100% / R1.1 50% - R1.2 100% - R1.3 100%.
- Target of FAIR/O:
- Indication of the elements requiring/not requiring improvement:
- Improve I2 by using vocabularies that follow FAIR principles.
- Findable: It is OK.
- Accessibility: Improve A2, e.g. there is no information on whether metadata remain accessible when the data are no longer available. Also, there is no authentication and authorization procedure, where necessary.
- Interoperability: It is OK.
- Reusability: Improve R1.1: clarify the licensing of data from different origins. Note that OPSD is not the owner of the data.
- DATABASE 8: Electric power transmission and distribution losses
- Short description: The data series is a part of the World Bank’s initiative for free and open access to global development data.
The data series is also accessible via DataBank, an analysis and visualisation tool that allows users to create queries and generate tables, charts, and maps.
- Current State of FAIR/O principles:
- Findability: F1 (meta)data are assigned a globally unique and persistent identifier.
- Accessibility: A1.1 the protocol is open, free, and universally implementable.
- Interoperability: I1 (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
- Reusability: R1 (meta)data are richly described with a plurality of accurate and relevant attributes.
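To compare the databases by principle rather than by individual criterion, the percentage strings listed above can be averaged per letter (F, A, I, R). The following is only a minimal sketch: the unweighted mean is an assumption and not part of the workshop methodology, and the function names `parse_scores` and `principle_averages` are illustrative. The two score strings are copied from the ENTSO-E and SMARD entries.

```python
import re
from statistics import mean

# Score strings copied from the ENTSO-E and SMARD entries above.
RAW_SCORES = {
    "ENTSO-E": "F1 50 % - F2 50% - F3 20% - F4 100% A1.1 100% - A1.2 100% - A2 50% "
               "I1 100% - I2 80% - I3 100% R1.1 50% - R1.2 50% - R1.3 90%",
    "SMARD": "F1 100% - F2 100% - F3 100% - F4 100% - A1.1 100% - A1.2 100% - A2 100% "
             "I1 100% - I2 80% - I3 100% R1.1 100% - R1.2 100% - R1.3 90%",
}

# A criterion label (e.g. "A1.2") followed by a percentage; tolerates "50 %".
SCORE_PATTERN = re.compile(r"\b([FAIR]\d(?:\.\d)?)\s+(\d{1,3})\s*%")


def parse_scores(text):
    """Turn a score string into a dict such as {"F1": 50, "A1.2": 100}."""
    return {label: int(value) for label, value in SCORE_PATTERN.findall(text)}


def principle_averages(scores):
    """Average the criteria of each principle (assumption: equal weights)."""
    groups = {}
    for label, value in scores.items():
        groups.setdefault(label[0], []).append(value)
    return {letter: round(mean(values), 1) for letter, values in groups.items()}


for name, raw in RAW_SCORES.items():
    print(name, principle_averages(parse_scores(raw)))
```

An unweighted mean treats every criterion as equally important; if the partners agree on weights for particular criteria, only `principle_averages` would need to change.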
As mentioned, during the workshop the databases presented above were assessed with respect to their current metadata practices and the FAIR requirements. The databases selected for further analysis are presented below:
- OPSD -
- Type of metadata provided: Administrative 100%, Descriptive 100%, Structural 100%.
- Technical implementation of metadata: xml, plain text, RDF, csv, json, jupyter, etc.
- Database summary: database identified as a good example of a FAIR and open access database with proper metadata.
- EIA -
- Type of metadata provided: Administrative 80%, Descriptive 100%, Structural 100%.
- Technical implementation of metadata: xml, csv, text, PDF.
- Database summary: identified as a rich source of national data with open access.
- PSE -
- Type of metadata provided: Administrative 80%, Descriptive 90%, Structural 100%.
- Technical implementation of metadata: xls, csv, pdf, etc.
- Database summary: database identified as a rich source of national/local (Poland) data with open access.
- ENTSO-E -
- Type of metadata provided: Administrative 50%, Descriptive 50%, Structural 50%.
- Technical implementation of metadata: csv, xlsx, xml, xml zip, graphs and charts.
- Database summary: database identified as a major, very rich source of power network data (Europe-wide). An additional advantage of the database is open access.
PROBLEMATIC ISSUES:
During the analysis, problems related to incorrect interpretation of the FAIR/O assessment methodology were identified. The following aspects require deeper analysis and discussion:
- deeper analysis is required for criterion R1.1 "(Meta)data are released with a clear and accessible data usage licence": should the inability to log in, combined with open access to the data, be understood as an advantage or a disadvantage in terms of this criterion? This issue is correlated with the legal aspects of using data.
- Comment: For example, the lack of a license creates a problem and often prevents the use of the data.
- (https://open-power-system-data.org/legal#Licenses_for_open_data)
- deeper analysis is required for criterion A1.2 "The protocol allows for an authentication and authorisation where necessary": how should the possibility or the inability to create an account by the database user be interpreted?
- (meta)data frameworks should be described and clarified
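One way to make the R1.1 discussion more concrete is to check mechanically whether a metadata record declares an identifier, a license and a source at all. The sketch below is purely illustrative: the record is hypothetical, its field names (`id`, `licenses`, `sources`) are assumptions loosely modelled on a data-package style layout, and the DOI shown is a placeholder. It is not taken from any of the assessed databases. A declared license does not by itself answer the legal questions raised above; it only shows that the information is present.

```python
import json

# Hypothetical metadata record; field names and values are illustrative assumptions.
EXAMPLE_METADATA = json.loads("""
{
  "id": "https://doi.org/10.0000/example",
  "title": "Example hourly load time series",
  "licenses": [{"name": "CC-BY-4.0"}],
  "sources": [{"title": "Example TSO"}]
}
""")


def check_record(meta):
    """Return simple yes/no indicators related to F1, R1.1 and R1.2."""
    identifier = str(meta.get("id", ""))
    return {
        # F1: here a DOI-style URL is treated as a persistent identifier (assumption).
        "F1_persistent_identifier": identifier.startswith("https://doi.org/"),
        # R1.1: at least one license entry is declared.
        "R1.1_license_declared": bool(meta.get("licenses")),
        # R1.2: at least one source (provenance) entry is declared.
        "R1.2_provenance_declared": bool(meta.get("sources")),
    }


print(check_record(EXAMPLE_METADATA))
```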