This function checks the validity of each characteristic and result unit
combination in the input data frame. By default, rows are flagged but not removed.
The full input data frame is returned along with an additional flag column,
'TADA.ResultUnit.Flag', unless clean is set to 'both', in which case the flag
column is excluded.
Arguments
- .data
A data frame containing the TADA dataset.
- clean
A character argument with options 'suspect_only', 'nonstandardized_only', 'both', or 'none'. The default is 'none', which retains all rows but flags them.
- flaggedonly
A boolean argument; filters the data frame to show only 'Suspect' and 'NonStandardized' characteristic-media-result unit combinations when
TRUE. Default is FALSE. This can only be TRUE if clean is set to 'none'.
Value
The function returns the input data frame with an added 'TADA.ResultUnit.Flag'
column unless clean is 'both', in which case the column is excluded. This column
flags each 'TADA.CharacteristicName' and 'TADA.ResultMeasure.MeasureUnitCode'
combination as 'NonStandardized', 'Suspect', 'Pass', or 'Not Reviewed'.
When clean = 'none' and flaggedonly = TRUE, the data frame is filtered to show only
the 'Suspect' and 'NonStandardized' data.
Details
Users can choose to filter and review only the flagged rows by setting
flaggedonly to TRUE. After review, users can choose to remove any rows flagged
as 'Suspect' or 'NonStandardized' in the 'TADA.ResultUnit.Flag' column by setting
clean to 'suspect_only', 'nonstandardized_only', or 'both'. Leaving clean at
the default, 'none', removes no rows.
Note: The 'Not Reviewed' value in the 'TADA.ResultUnit.Flag' column means that the
EPA WQX team has not yet reviewed the combination for validity.
Examples
# Load example dataset:
utils::data(Data_R5_TADAPackageDemo)
# Flag, but do not remove, data with 'Suspect' or 'NonStandardized'
# characteristic and unit combinations in a new column titled
# 'TADA.ResultUnit.Flag':
SuspectUnit_flags <- TADA_FlagResultUnit(Data_R5_TADAPackageDemo)
# Show only 'Suspect' or 'NonStandardized' characteristic and unit combinations:
SuspectUnit_flaggedonly <- TADA_FlagResultUnit(Data_R5_TADAPackageDemo,
  clean = "none", flaggedonly = TRUE
)
SuspectUnit_flaggedonly_selectcols <- dplyr::select(
  SuspectUnit_flaggedonly,
  TADA.CharacteristicName, TADA.ResultMeasure.MeasureUnitCode, TADA.ResultUnit.Flag
)
# Remove both 'Suspect' and 'NonStandardized' characteristic and result unit
# combinations, and exclude the flag column:
ResultUnit_clean <- TADA_FlagResultUnit(Data_R5_TADAPackageDemo, clean = "both")
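# A further sketch using the documented 'suspect_only' option: remove only the
# 'Suspect' characteristic and result unit combinations. Going by the option
# name, rows flagged 'NonStandardized' are assumed to stay in the output.
# The object name ResultUnit_suspect_removed is illustrative only:
ResultUnit_suspect_removed <- TADA_FlagResultUnit(
  Data_R5_TADAPackageDemo,
  clean = "suspect_only"
)
# Before choosing a clean option, it may help to tally the rows in each flag
# category with base R's table(). This uses the flagged (not cleaned) data
# frame created above; which categories appear depends on the dataset:
table(SuspectUnit_flags$TADA.ResultUnit.Flag, useNA = "ifany")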