Loughborough University
Leicestershire, UK
LE11 3TU
+44 (0)1509 263171

Loughborough University Institutional Repository

Please use this identifier to cite or link to this item: https://dspace.lboro.ac.uk/2134/24454

Title: The “dirty dozen” of freshwater science: Detecting then reconciling hydrological data biases and errors
Authors: Wilby, Robert L.
Clifford, Nicholas
De Luca, Paolo
Harrigan, Shaun
Hillier, John K.
Hodgkins, Richard
Johnson, Matthew F.
Matthews, Tom K.R.
Murphy, Conor
Noone, Simon
Parry, Simon
Prudhomme, Christel
Rice, Stephen P.
Slater, Louise
Smith, Katie
Wood, Paul J.
Keywords: Data biases
Exploratory data analysis
Detection
Attribution
Hydrological change
Homogeneity
Issue Date: 2017
Publisher: Wiley
Citation: WILBY, R.L. ...et al., 2017. The “dirty dozen” of freshwater science: Detecting then reconciling hydrological data biases and errors. Wiley Interdisciplinary Reviews (WIREs) Water, 4 (3), e1209.
Abstract: Sound water policy and management rests on sound hydrometeorological and ecological data. Conversely, unrepresentative, poorly collected or erroneously archived data introduces uncertainty regarding the magnitude, rate and direction of environmental change, in addition to undermining confidence in decision-making processes. Unfortunately, data biases and errors can enter the information flow at various stages, starting with site selection, instrumentation, sampling/measurement procedures, post-processing and ending with archiving systems. Techniques such as visual inspection of raw data, graphical representation and comparison between sites, outlier and trend detection, and referral to metadata can all help uncover spurious data. Tell-tale signs of ambiguous and/or anomalous data are highlighted using 12 carefully chosen cases drawn mainly from hydrology (‘the dirty dozen’). These include evidence of changes in site or local conditions (due to land management, river regulation or urbanisation); modifications to instrumentation or inconsistent observer behaviour; mismatched or misrepresentative sampling in space and time; treatment of missing values, post-processing and data storage errors. As well as raising awareness of pitfalls, recommendations are provided for uncovering lapses in data quality after the information has been gathered. It is noted that error detection and attribution are more problematic for very large data sets, where observation networks are automated, or when various information sources have been combined. In these cases, more holistic indicators of data integrity are needed that reflect the overall information life-cycle and application(s) of the hydrological data.
Description: This is an open access article under the terms of the Creative Commons Attribution-NonCommercial License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.
Sponsor: This paper was inspired by a capacity building program supported by the European Bank for Reconstruction and Development. Data for Exhibits #3, #5 and #12 are freely available from the UK NRFA. Other public data sources are acknowledged in Figure legends. Authors CP, KS, SH and SP are supported by the NERC-CEH Water Resources Science Area.
Version: Published
DOI: 10.1002/wat2.1209
URI: https://dspace.lboro.ac.uk/2134/24454
Publisher Link: http://dx.doi.org/10.1002/wat2.1209
ISSN: 2049-1948
Appears in Collections: Published Articles (Geography)

Files associated with this item:

File: wat21209.pdf
Description: Published version
Size: 2.18 MB
Format: Adobe PDF

 

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.