WQX QAQC Service User Guide
TADA Team
2024-10-31
Source:vignettes/WQXValidationService.Rmd
TADA Leverages the Water Quality eXchange (WQX) QAQC Service
This is an overview of the WQX Quality Assurance and Quality Control (QAQC) data submission service and of how TADA leverages that service to flag potentially invalid data in the Water Quality Portal (WQP). It covers: 1) all available WQX QAQC tests for data submissions, 2) which of these QAQC tests are also available in TADA for flagging potentially invalid WQP data, and 3) how to interpret and provide feedback on the validation reference tables used by WQX and TADA for this QAQC service.
Background
The WQX expectation is that users submit only QAQC’d data and use WQX elements to ensure the data is of “documented quality”. The WQX team has historically hosted data quality working groups aimed at creating best practices and required data elements for WQX 3.0 for specific parameter groups such as nutrients, metals, and biological data. These resources have helped users submit data of documented quality. This approach has been broadly successful, but because it is not enforceable, data of varying quality has still been shared. In response, the WQX team released a QAQC service in 2022 that tests and flags data submissions for potential errors and gives the data submitter a chance to make edits that improve the quality of their data before it goes into the WQP. However, the new WQX service does not flag or fix historical data that is already in the WQP. Therefore, the TADA R package includes the WQX QAQC validation reference tables so that these same tests can be run in R on data retrieved from the WQP.
Available Tests
Tests available in both WQX Web for data submissions & the TADA R Package for use with data retrieved from the WQP:
- Tests that Leverage the WQX Validation Table
  - Flag invalid characteristic and unit combinations (e.g., flag data if mg/L or °C was used for pH; flag data if units are required for the characteristic but were left blank, NA, None, etc.)
  - Flag data without methods, with uncommon methods, or with invalid methods for the characteristic. This is contingent on the activity type, since some activity types don’t require methods.
  - Flag data with invalid characteristic and fraction combinations
  - Flag data with invalid characteristic and speciation combinations
  - Flag results that do not have an approved QAPP and/or do not include a QAPP attachment
  - Flag results above or below national thresholds using the Interquartile Range (IQR) method. The IQR is the difference between the 75th percentile and the 25th percentile of the dataset.
    - Upper Threshold = 75th percentile + 1.5 * (75th percentile - 25th percentile)
    - Lower Threshold = 25th percentile - 1.5 * (75th percentile - 25th percentile)
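The IQR threshold formulas above can be illustrated with a minimal sketch. It is not the WQX or TADA implementation: the function names `iqr_thresholds` and `flag_result` are hypothetical, the pH values are made up, and the quantile convention (`method="inclusive"`) is an assumption, since the guide does not say which percentile method the national thresholds use.

```python
from statistics import quantiles


def iqr_thresholds(values):
    """Return (lower, upper) thresholds using the 1.5 * IQR rule."""
    # quantiles(..., n=4) returns the three quartile cut points [Q1, Q2, Q3].
    # method="inclusive" is an assumed convention; other quantile methods
    # give slightly different cut points.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def flag_result(value, lower, upper):
    """Label one result relative to the thresholds."""
    if value > upper:
        return "above upper threshold"
    if value < lower:
        return "below lower threshold"
    return "within thresholds"


# Made-up pH results, for illustration only.
sample = [6.5, 6.8, 7.0, 7.2, 7.4, 7.6, 7.9, 8.1, 14.0]
lower, upper = iqr_thresholds(sample)
flags = [flag_result(v, lower, upper) for v in sample]
```

In WQX and TADA the thresholds come from the validation reference table rather than from the user's own sample; the sketch only shows how the two formulas turn a 25th and 75th percentile into flagging limits.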
Additional tests only available in WQX Web and Node Submissions (for now):
- Location
  - Flag data where the lat/long is reported to fewer than three decimal places in decimal degrees (e.g., 38.88°, -77.00°)
  - Flag data where the lat/long assigned to a monitoring site is NOT in the state and country included in the metadata
  - Flag monitoring locations that are near each other and are likely the same, based on three decimal places of precision. This includes two flags: one within your organization and one national (across all organizations). When a likely match is found, the user is given the option to use the existing monitoring location instead of creating a new one [expand here].
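Two of the location checks above can be sketched in a few lines. This is an illustration under stated assumptions, not the WQX Web logic: the function names are hypothetical, coordinates are read as text so that trailing zeros are preserved, and "likely the same" is approximated by grouping sites whose coordinates match after rounding to three decimal places.

```python
from collections import defaultdict


def decimal_places(coord_text):
    """Count the digits after the decimal point in a coordinate given as text."""
    _whole, _dot, fraction = coord_text.strip().partition(".")
    return len(fraction)


def has_low_precision(lat_text, lon_text, min_places=3):
    """True if either coordinate has fewer than `min_places` decimal places."""
    return (decimal_places(lat_text) < min_places
            or decimal_places(lon_text) < min_places)


def likely_duplicates(sites, places=3):
    """Group site IDs whose coordinates match after rounding to `places` decimals.

    `sites` is an iterable of (site_id, latitude, longitude) tuples.
    """
    groups = defaultdict(list)
    for site_id, lat, lon in sites:
        groups[(round(lat, places), round(lon, places))].append(site_id)
    return [ids for ids in groups.values() if len(ids) > 1]


# Made-up monitoring locations, for illustration only.
example_sites = [
    ("SITE-A", 38.8893, -77.0352),
    ("SITE-B", 38.8894, -77.0353),
    ("SITE-C", 39.1000, -76.5000),
]
duplicate_groups = likely_duplicates(example_sites)
```

Grouping by rounded coordinates is a simplification: two sites a few meters apart can still round to different values when they straddle a rounding boundary, so a distance-based comparison would be needed to catch those cases.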
Providing Feedback on Validation Tables
All WQX Domain Tables are available HERE. TADA leverages many of the WQX domain tables.
- QAQCCharacteristicValidation (ZIP) | (XML) | (CSV)
  Both WQX and TADA leverage the table above to flag invalid and uncommon results. This reference table is used by the TADA functions that run the flagging tests listed under Available Tests.
- MeasureUnit (ZIP) | (XML) | (CSV)
  Both WQX and TADA leverage the table above to convert all data for each unique characteristic to a consistent target unit, so that results can then be assessed against the WQX upper and lower thresholds in the validation table. Target units for each characteristic are included in the MeasureUnit domain table. See TADA function: TADA_ConvertResultUnits()
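The idea of converting every result for a characteristic to one target unit before comparing it against thresholds can be sketched as a table lookup. This does not reproduce TADA_ConvertResultUnits() or the actual columns of the MeasureUnit table: the `UNIT_REF` mapping, its entries, and the `to_target_unit` function are hypothetical stand-ins.

```python
# Hypothetical stand-in for a unit reference table:
# (characteristic, reported unit) -> (target unit, multiplication factor)
UNIT_REF = {
    ("Phosphorus", "ug/L"): ("mg/L", 0.001),
    ("Phosphorus", "mg/L"): ("mg/L", 1.0),
}


def to_target_unit(characteristic, value, unit):
    """Convert a result to the characteristic's target unit.

    Returns (value, unit, converted). When the characteristic/unit pair is
    not in the reference table, the result is returned unchanged with
    converted=False so that it can be reviewed rather than silently dropped.
    """
    key = (characteristic, unit)
    if key not in UNIT_REF:
        return value, unit, False
    target_unit, factor = UNIT_REF[key]
    return value * factor, target_unit, True


converted_value, converted_unit, was_converted = to_target_unit(
    "Phosphorus", 250.0, "ug/L"
)
```

Once all results for a characteristic share a unit, a single pair of upper and lower thresholds can be applied to them.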
All TADA Reference and Validation Tables are also available in the R package HERE. TADA pulls the WQX Validation Table and other domain tables into the package and updates them automatically whenever changes are made:
https://github.com/USEPA/EPATADA/blob/develop/inst/extdata/WQXcharValRef.csv
https://github.com/USEPA/EPATADA/blob/develop/inst/extdata/WQXunitRef.csv
WQX and TADA users can review the validation table and the assigned target units, and can provide feedback by emailing the WQX helpdesk (WQX@epa.gov).