Database containing harmonized data sets
Main authors: | Marc Laurencelle, Nicolas Surdyk, Matjaž Glavan, Birgitte Hansen, Claudia Heidecke, Hyojin Kim, Susanne Klages |
FAIRWAYiS Editor: | Jane Brandt |
Source document: | »Laurencelle, M. et al 2021. (Short note for the) database containing harmonised datasets, 28 pp. FAIRWAY Project Deliverable 3.3 |
In this section of FAIRWAYiS we describe the preparation of harmonized datasets for water quality monitoring of drinking water resources, and the development of a readily usable database from these harmonized datasets.
A large range of environmental indicators has been considered for monitoring the quality of drinking water. In FAIRWAY, our focus has mainly been on indicators related to the monitoring of nitrate and pesticide application (»Agri-drinking water quality indicators and IT/sensor techniques) and to their transport and fate in the hydrogeological system and in drinking water (»Link between agricultural pressure and drinking water quality state - lessons learned in Denmark and France).
The development of the database has mainly been driven by existing datasets from each of the FAIRWAY case studies (»Case studies). The database contains nearly 390,000 rows of data from the 13 case study sites, with more than 65 parameters and more than 500 sub-parameters.
One of the challenges throughout the development of the database has been to find ways to harmonize, as far as possible, the datasets obtained from these various sources.
We provide access to the database and describe its general structure,
»Indicator database
its development,
»Database development process
and its detailed structure.
»Detailed structure of the database
Possible uses of the database are then mentioned along with examples of some interesting data series and instructions on using the database efficiently.
»Using the database
Finally, the major problems and limitations encountered throughout this work are discussed.
»Conclusions
Some major challenges identified throughout this work are that:
- Definitions of ‘boundary’ differ between the pressure and state perspectives. The catchment area defines the hydrogeological boundary, whereas the agricultural boundary is an administrative one (or at least the data are presented that way). Moreover, there is generally a lag time (delay) between pressure and state indicators. Consequently, pressure data and state data do not overlap in most cases, and thus they cannot be linked directly.
- Because of the difference in those definitions, the scale of the collected data is also different. The state data (mainly hydrogeochemical data on water quality) can be at point or catchment scale, while the pressure data is ideally at the field plot scale but in practice most often at administrative levels (municipal, regional, or even national level).
- Therefore, it is time-consuming to collect these large sets of data and to process them into a form that is comparable between state and pressure within a case study and between the case studies.
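
The lag-time issue in the first point can be illustrated with a minimal Python sketch. It is not taken from the FAIRWAY database or its tooling: the function name `align_with_lag`, the yearly values, and the two-year lag are all hypothetical, chosen only to show why pressure and state series that barely overlap in calendar years may still be comparable once a delay is assumed.

```python
def align_with_lag(pressure, state, lag_years):
    """Pair each pressure value with the state value observed lag_years later.

    pressure and state are dicts mapping a year to a value.
    Returns a list of (pressure_year, state_year, pressure_value, state_value).
    """
    pairs = []
    for year in sorted(pressure):
        state_year = year + lag_years
        if state_year in state:
            pairs.append((year, state_year, pressure[year], state[state_year]))
    return pairs


# Hypothetical values, invented for illustration only.
# Pressure: e.g. a nitrogen surplus reported at an administrative level.
pressure = {2000: 80.0, 2001: 75.0, 2002: 70.0, 2003: 65.0}
# State: e.g. a nitrate concentration measured in a catchment.
state = {2002: 48.0, 2003: 46.0, 2004: 45.0, 2005: 43.0}

# Direct linking: only years present in both series can be paired.
direct = align_with_lag(pressure, state, 0)
# Assumed two-year delay between pressure and state (a hypothetical choice).
lagged = align_with_lag(pressure, state, 2)

print(len(direct), "pairs without lag;", len(lagged), "pairs with a 2-year lag")
```

In practice the lag is rarely known in advance and varies between sites, and the difference in spatial scale described in the second point would still have to be handled separately, so a sketch like this only covers the temporal part of the problem.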