EPA's Water Quality Exchange (WQX) has generated maximum and minimum thresholds for each parameter and unit combination from millions of water quality data points around the country. This function leverages the WQX QAQC Validation Table to flag any data that is below the lower threshold of result values submitted to WQX for a given characteristic.

Usage

TADA_FlagBelowThreshold(.data, clean = FALSE, flaggedonly = FALSE)

Arguments

.data

TADA dataframe

clean

Boolean argument; removes data that is below the lower WQX threshold from the dataframe when clean = TRUE. Default is clean = FALSE.

flaggedonly

Boolean argument; filters dataframe to show only the data flagged as below the lower WQX threshold. Default is flaggedonly = FALSE.

Value

The input TADA dataset with the added "TADA.ResultValueBelowLowerThreshold.Flag" column, which is populated with the values "Pass", "Suspect", "Not Reviewed", or "NA - Not Available". Defaults are clean = FALSE and flaggedonly = FALSE. When clean = FALSE and flaggedonly = TRUE, the dataframe is filtered to show only data found below the lower WQX threshold. When clean = TRUE and flaggedonly = FALSE, rows with values that are below the lower WQX threshold are removed from the dataframe. When clean = TRUE and flaggedonly = TRUE, the function is not executed and an error message is returned.

Details

When clean = FALSE and flaggedonly = FALSE, a column that flags data below the lower WQX threshold is appended to the dataframe. When clean = FALSE and flaggedonly = TRUE, the dataframe is filtered to show only data found below the lower WQX threshold. When clean = TRUE and flaggedonly = FALSE, rows with values that are below the lower WQX threshold are removed from the dataframe and no column is appended. When clean = TRUE and flaggedonly = TRUE, the function is not executed and an error message is returned. Defaults are clean = FALSE and flaggedonly = FALSE.
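
The four argument combinations can be illustrated with a small base R sketch. This is not the package's implementation: the helper apply_threshold_options(), the toy dataframe, and the assumption that "Suspect" marks a below-threshold row are all hypothetical and serve only to show the branching described above.

```r
# Illustrative stand-in for the clean / flaggedonly handling described above.
# NOT the TADA source code; names and data here are hypothetical.
apply_threshold_options <- function(df, flag_col, clean = FALSE, flaggedonly = FALSE) {
  # The two options contradict each other, so refuse to run with both set.
  if (clean && flaggedonly) {
    stop("clean and flaggedonly cannot both be TRUE")
  }
  # Assumption: "Suspect" marks rows below the lower threshold.
  is_suspect <- !is.na(df[[flag_col]]) & df[[flag_col]] == "Suspect"
  if (flaggedonly) {
    # Keep only the flagged rows.
    return(df[is_suspect, , drop = FALSE])
  }
  if (clean) {
    # Drop the flagged rows and the flag column itself.
    out <- df[!is_suspect, , drop = FALSE]
    out[[flag_col]] <- NULL
    return(out)
  }
  # Default: return everything with the flag column kept.
  df
}

toy <- data.frame(
  value = c(0.5, 7.2, -3.0),
  flag = c("Pass", "Pass", "Suspect"),
  stringsAsFactors = FALSE
)

flag_only <- apply_threshold_options(toy, "flag", flaggedonly = TRUE)
cleaned <- apply_threshold_options(toy, "flag", clean = TRUE)
# The invalid combination raises an error; capture its message instead of halting.
both <- tryCatch(
  apply_threshold_options(toy, "flag", clean = TRUE, flaggedonly = TRUE),
  error = function(e) conditionMessage(e)
)
```

In this sketch, flag_only holds the single flagged row, cleaned holds the two remaining rows without a flag column, and both holds the error message.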

This function will add the column "TADA.ResultValueBelowLowerThreshold.Flag", which will be populated with the values "Pass", "Suspect", "Not Reviewed", or "NA - Not Available". The "Not Reviewed" value means that the EPA WQX team has not yet reviewed the range for the characteristic and unit combination in that row (see https://cdx.epa.gov/wqx/download/DomainValues/QAQCCharacteristicValidation.CSV). The WQX team plans to review and update these new combinations quarterly. The "NA - Not Available" flag means that the characteristic, media, and/or unit combination for that row is not fully populated (is NA or does not match the WQX data standard) or the result value is NA.
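
As a rough illustration of how the four flag values relate to one another, the following base R sketch joins a toy result table to a made-up threshold table and assigns a flag per row. The threshold values, the column names (lower, reviewed), and the order of the checks are assumptions for demonstration only; the actual thresholds come from the WQX QAQC Validation Table, and the package may determine each flag differently.

```r
# Hypothetical lower thresholds per characteristic and unit (values made up).
thresholds <- data.frame(
  characteristic = c("PH", "TEMPERATURE"),
  unit = c("NONE", "DEG C"),
  lower = c(0, -5),
  reviewed = c(TRUE, FALSE), # hypothetical review status
  stringsAsFactors = FALSE
)

# Toy results, including a missing value and a missing characteristic.
results <- data.frame(
  characteristic = c("PH", "PH", "TEMPERATURE", "PH", NA),
  unit = c("NONE", "NONE", "DEG C", "NONE", "NONE"),
  value = c(7.1, -1, 12, NA, 3),
  stringsAsFactors = FALSE
)

# Left join keeps every result row; unmatched rows get NA thresholds.
# merge() reorders rows, so track and restore the original order.
results$row_id <- seq_len(nrow(results))
joined <- merge(results, thresholds,
  by = c("characteristic", "unit"), all.x = TRUE
)
joined <- joined[order(joined$row_id), ]

# Assign flags: missing information first, then review status, then the comparison.
joined$flag <- ifelse(
  is.na(joined$value) | is.na(joined$lower),
  "NA - Not Available",
  ifelse(
    !joined$reviewed,
    "Not Reviewed",
    ifelse(joined$value < joined$lower, "Suspect", "Pass")
  )
)
```

Here the row with a negative pH is the only one flagged "Suspect", the temperature row is "Not Reviewed" because its hypothetical range has no review, and the rows with a missing value or missing characteristic fall under "NA - Not Available".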

If this function is run more than once on the same dataframe, the flag column will be deleted and regenerated.

Examples

# Load example dataset:
data(Data_Nutrients_UT)

# Remove data that is below the lower WQX threshold from the dataframe:
WQXLowerThreshold_clean <- TADA_FlagBelowThreshold(Data_Nutrients_UT, clean = TRUE)
#> [1] "No data below the WQX Lower Threshold were found in your dataframe. Returning the input dataframe with TADA.ResultValueBelowLowerThreshold.Flag column for tracking."

# Flag, but do not remove, data that is below the lower WQX threshold in
# new column titled "TADA.ResultValueBelowLowerThreshold.Flag":
WQXLowerThreshold_flags <- TADA_FlagBelowThreshold(Data_Nutrients_UT, clean = FALSE)
#> [1] "No data below the WQX Lower Threshold were found in your dataframe. Returning the input dataframe with TADA.ResultValueBelowLowerThreshold.Flag column for tracking."

# Show only data that is below the lower WQX threshold:
WQXLowerThreshold_flagsonly <- TADA_FlagBelowThreshold(Data_Nutrients_UT, clean = FALSE, flaggedonly = TRUE)
#> [1] "This dataframe is empty because no data below the WQX Lower Threshold was found in your dataframe"