This function checks the validity of each characteristic and result unit combination in the input data frame. By default, rows are flagged but not removed. The full input data frame is returned along with an additional flag column, 'TADA.ResultUnit.Flag', unless clean is set to 'both', in which case the flag column is excluded.

Usage

TADA_FlagResultUnit(.data, clean = "none", flaggedonly = FALSE)

Arguments

.data

A data frame containing the TADA dataset.

clean

A character argument with options 'suspect_only', 'nonstandardized_only', 'both', or 'none'. The default is 'none', which retains all rows but flags them.

flaggedonly

A boolean argument; filters the data frame to show only 'Suspect' and 'NonStandardized' characteristic-media-result unit combinations when TRUE. Default is FALSE. This can only be TRUE if clean is set to 'none'.

Value

The function returns the input data frame with an added 'TADA.ResultUnit.Flag' column, unless clean is 'both', in which case the column is excluded. This column flags each 'TADA.CharacteristicName' and 'TADA.ResultMeasure.MeasureUnitCode' combination as 'NonStandardized', 'Suspect', 'Pass', or 'Not Reviewed'. When clean = 'none' and flaggedonly = TRUE, the data frame is filtered to show only the 'Suspect' and 'NonStandardized' rows.

Details

Users can choose to filter and review only the flagged rows by setting flaggedonly to TRUE. After review, users can remove rows flagged as 'Suspect' or 'NonStandardized' in the 'TADA.ResultUnit.Flag' column by setting clean to 'suspect_only', 'nonstandardized_only', or 'both'; leaving clean at 'none' keeps all rows. Note: The 'Not Reviewed' value in the 'TADA.ResultUnit.Flag' column means that the EPA WQX team has not yet reviewed the combination for validity.
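
The way the clean and flaggedonly options interact can be sketched in base R on an already-flagged data frame. This is only an illustration of the behavior described above, not the package's implementation: the helper apply_clean_option() is hypothetical, the toy rows and their flag values are invented, and the sketch assumes that 'suspect_only' and 'nonstandardized_only' drop only the rows with the matching flag while keeping the flag column.

```r
# Hypothetical helper (NOT part of the package): mimics the documented
# behavior of `clean` and `flaggedonly` on a data frame that already
# contains a 'TADA.ResultUnit.Flag' column.
apply_clean_option <- function(flagged, clean = "none", flaggedonly = FALSE) {
  clean <- match.arg(
    clean,
    c("none", "suspect_only", "nonstandardized_only", "both")
  )
  # Documented constraint: flaggedonly may only be TRUE when clean = "none".
  if (flaggedonly && clean != "none") {
    stop("flaggedonly can only be TRUE when clean = 'none'")
  }

  flag <- flagged$TADA.ResultUnit.Flag

  # Review mode: keep only the rows that need attention.
  if (flaggedonly) {
    return(flagged[flag %in% c("Suspect", "NonStandardized"), , drop = FALSE])
  }

  # Assumption: each option drops only the rows carrying the named flag(s).
  drop_flags <- switch(clean,
    none = character(0),
    suspect_only = "Suspect",
    nonstandardized_only = "NonStandardized",
    both = c("Suspect", "NonStandardized")
  )
  out <- flagged[!(flag %in% drop_flags), , drop = FALSE]

  # Per the documentation, the flag column is excluded when clean = "both".
  if (clean == "both") {
    out$TADA.ResultUnit.Flag <- NULL
  }
  out
}

# Toy data: names mirror the TADA columns, values and flags are invented.
toy <- data.frame(
  TADA.CharacteristicName = c("PH", "PH", "TEMPERATURE", "NITRATE"),
  TADA.ResultMeasure.MeasureUnitCode = c("STD UNITS", "MG/L", "DEG F", "UG/L"),
  TADA.ResultUnit.Flag = c("Pass", "Suspect", "NonStandardized", "Not Reviewed"),
  stringsAsFactors = FALSE
)

flagged_rows <- apply_clean_option(toy, flaggedonly = TRUE)   # review step
no_suspect <- apply_clean_option(toy, clean = "suspect_only") # drop 'Suspect'
cleaned <- apply_clean_option(toy, clean = "both")            # drop both flags
```

Under these assumptions, 'Pass' and 'Not Reviewed' rows survive every clean option, so a 'Not Reviewed' combination still needs manual attention after cleaning.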

Examples

# Load example dataset:
utils::data(Data_R5_TADAPackageDemo)

# Flag, but do not remove, data with 'Suspect' or 'NonStandardized'
# characteristic and unit combinations in a new column titled
# 'TADA.ResultUnit.Flag':
SuspectUnit_flags <- TADA_FlagResultUnit(Data_R5_TADAPackageDemo)

# Show only 'Suspect' or 'NonStandardized' characteristic and unit combinations:
SuspectUnit_flaggedonly <- TADA_FlagResultUnit(Data_R5_TADAPackageDemo,
  clean = "none", flaggedonly = TRUE
)
SuspectUnit_flaggedonly_selectcols <- dplyr::select(
  SuspectUnit_flaggedonly,
  TADA.CharacteristicName, TADA.ResultMeasure.MeasureUnitCode, TADA.ResultUnit.Flag
)

# Remove both 'Suspect' and 'NonStandardized' characteristic and result
# unit combinations, and exclude the flag column:
ResultUnit_clean <- TADA_FlagResultUnit(Data_R5_TADAPackageDemo, clean = "both")
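
Before choosing a clean option, it can help to see how many rows fall under each flag value. The following is a minimal base R sketch on a toy data frame; the rows and their flags are invented for illustration and do not come from the demo dataset or the WQX reference tables.

```r
# Toy flagged data: column names mirror the TADA output, values are invented.
toy_flagged <- data.frame(
  TADA.CharacteristicName = c("PH", "PH", "PH", "TEMPERATURE", "NITRATE"),
  TADA.ResultMeasure.MeasureUnitCode = c(
    "STD UNITS", "STD UNITS", "MG/L", "DEG F", "UG/L"
  ),
  TADA.ResultUnit.Flag = c(
    "Pass", "Pass", "Suspect", "NonStandardized", "Not Reviewed"
  ),
  stringsAsFactors = FALSE
)

# Count rows per flag value; fixing the levels keeps zero counts visible.
flag_levels <- c("Pass", "Suspect", "NonStandardized", "Not Reviewed")
flag_counts <- table(
  factor(toy_flagged$TADA.ResultUnit.Flag, levels = flag_levels)
)
print(flag_counts)

# List the distinct characteristic/unit combinations that did not pass.
needs_review <- unique(
  toy_flagged[
    toy_flagged$TADA.ResultUnit.Flag != "Pass",
    c(
      "TADA.CharacteristicName",
      "TADA.ResultMeasure.MeasureUnitCode",
      "TADA.ResultUnit.Flag"
    )
  ]
)
print(needs_review)
```

The same two steps should apply to the output of TADA_FlagResultUnit() with clean = "none", since that output retains the flag column.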