Process a TADA dataframe to flag or filter data by media type.
Usage
TADA_MediaFilter(
  .data,
  clean = FALSE,
  surface_water = FALSE,
  ground_water = FALSE,
  sediment = FALSE,
  other = FALSE
)
Arguments
- .data
A data frame representing a TADA profile object.
- clean
Logical. If TRUE, remove rows according to the media toggles and return data without flag columns. If FALSE, only flag media and return all rows with TADA.Media.Flag. Default FALSE.
- surface_water
Logical (used only when clean = TRUE). If TRUE, remove SURFACE WATER results. Default FALSE.
- ground_water
Logical (used only when clean = TRUE). If TRUE, remove GROUNDWATER results. Default FALSE.
- sediment
Logical (used only when clean = TRUE). If TRUE, remove SEDIMENT results. Default FALSE.
- other
Logical (used only when clean = TRUE). If TRUE, remove OTHER results. Default FALSE.
Value
A data frame.
If clean = FALSE, returns all rows with the column TADA.Media.Flag added.
If clean = TRUE, returns rows with the selected media removed and no flag columns added.
If the input has 0 rows, returns NULL.
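Because a 0-row input yields NULL rather than an empty data frame, downstream code may want to guard against it. The following is a minimal sketch in base R; or_empty is a hypothetical helper (not part of the package), and the NULL result is simulated rather than produced by calling TADA_MediaFilter.

```r
# Hypothetical helper (not part of TADA): fall back to an empty copy of the
# input when the filter result is NULL.
or_empty <- function(result, template) {
  if (is.null(result)) template[0, , drop = FALSE] else result
}

empty_input <- data.frame(ActivityMediaName = character(0))
simulated_result <- NULL # stands in for TADA_MediaFilter(empty_input)

safe <- or_empty(simulated_result, empty_input)
nrow(safe)
```

With this guard, later steps can keep treating the result as a data frame with the same columns as the input.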
Details
Behavior overview:
- If clean = FALSE, classify each row and add TADA.Media.Flag. The function prints a single message with counts by media and returns the original data with the flag.
- If clean = TRUE, classify and then remove rows whose media types are selected via function toggles. The function prints a single message with counts by media before filtering, and then a single summary message:
  - When no toggles are selected (all FALSE): informs the user and returns the original data without TADA.Media.Flag.
  - When some toggles are selected but no rows match (e.g., data already contain only SURFACE WATER): informs the user that nothing was removed and returns the data unchanged (without flag columns).
  - When rows are removed: reports how many rows were removed and returns cleaned data (without TADA.Media.Flag).
  Additionally, a warning is issued if all media toggles are TRUE (which would remove all media types), and if the filter removes all rows.
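The branching described above can be sketched in base R as follows. This is an illustration only, not the package's implementation: remove_selected_media is a hypothetical helper, the flag vector is supplied by hand instead of being computed, and the message wording is invented for the sketch.

```r
# Hypothetical helper (not part of TADA): drop rows whose media flag matches
# any selected toggle, mirroring the three outcomes described in the text.
remove_selected_media <- function(df, flag,
                                  surface_water = FALSE, ground_water = FALSE,
                                  sediment = FALSE, other = FALSE) {
  toggles <- c(
    "SURFACE WATER" = surface_water,
    "GROUNDWATER" = ground_water,
    "SEDIMENT" = sediment,
    "OTHER" = other
  )
  selected <- names(toggles)[toggles]

  # Outcome 1: no toggles selected, data returned as-is.
  if (length(selected) == 0) {
    message("No media toggles selected; returning data unchanged.")
    return(df)
  }
  if (all(toggles)) {
    warning("All media toggles are TRUE; every media type will be removed.")
  }

  drop <- flag %in% selected
  # Outcome 2: toggles selected but nothing matches.
  if (!any(drop)) {
    message("No rows matched the selected media; nothing removed.")
    return(df)
  }

  # Outcome 3: rows removed.
  out <- df[!drop, , drop = FALSE]
  message(
    "Removed ", format(sum(drop), big.mark = ","),
    " rows matching media types: ", paste(selected, collapse = ", "), "."
  )
  if (nrow(out) == 0) warning("The filter removed all rows.")
  out
}

toy <- data.frame(ResultIdentifier = 1:5)
toy_flag <- c("SURFACE WATER", "GROUNDWATER", "SEDIMENT", "OTHER", "SURFACE WATER")
kept <- remove_selected_media(toy, toy_flag, ground_water = TRUE, sediment = TRUE)
```

In this toy input, rows 2 and 3 are dropped and rows 1, 4, and 5 are kept.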
Inputs used for classification:
- MonitoringLocationTypeName, ActivityMediaSubdivisionName and ActivityMediaName
- Groundwater-related fields: AquiferName, AquiferTypeName, LocalAqfrName, ConstructionDateText, WellDepthMeasure.MeasureValue, WellDepthMeasure.MeasureUnitCode, WellHoleDepthMeasure.MeasureValue, WellHoleDepthMeasure.MeasureUnitCode
Classification details:
- ActivityMediaName values of "Soil" or "Sediment" (and common variants such as "Soil or Sediment") map to SEDIMENT even if groundwater fields are present.
- ActivityMediaSubdivisionName is reviewed to identify "Surface Water", "Groundwater", and "Sediment".
- If the subdivision is blank and ActivityMediaName is "Water" with no groundwater fields present, the row is classified as SURFACE WATER.
- Groundwater fields are considered present if they are non-NA and non-blank (for character/factor fields) or non-NA (for numeric fields).
- If a monitoring location type reference is available via TADA_GetMonLocTypeRef(), its TADA.Media.Flag takes precedence when present.
- Media flags are normalized to the core set: SURFACE WATER, GROUNDWATER, SEDIMENT, OTHER. Values such as HABITAT, AIR, BIOLOGICAL, empty strings, or other non-core values are coerced to OTHER.
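As a rough illustration of these rules, the sketch below classifies a toy data frame in base R. It is not the package's code: field_present, classify_media, and normalize_flag are hypothetical helpers, only two groundwater fields are used, and the monitoring location type reference step is omitted. The order in which the rules are applied, and the treatment of a blank subdivision with "Water" media and populated groundwater fields as GROUNDWATER, are assumptions that the text does not spell out.

```r
# TRUE where a value counts as "present": non-NA for numeric fields,
# non-NA and non-blank for character/factor fields.
field_present <- function(x) {
  if (is.numeric(x)) {
    !is.na(x)
  } else {
    x <- as.character(x)
    !is.na(x) & nzchar(trimws(x))
  }
}

# Uppercase and trim a text column, treating NA as blank.
clean_text <- function(x) {
  x <- as.character(x)
  x[is.na(x)] <- ""
  toupper(trimws(x))
}

# Hypothetical classifier; later assignments override earlier ones.
classify_media <- function(df, gw_cols) {
  has_gw <- Reduce(`|`, lapply(df[gw_cols], field_present))
  media <- clean_text(df$ActivityMediaName)
  subdiv <- clean_text(df$ActivityMediaSubdivisionName)

  flag <- rep("OTHER", nrow(df))
  # Assumption: blank subdivision + "Water" + groundwater fields -> GROUNDWATER.
  flag[subdiv == "" & media == "WATER" & has_gw] <- "GROUNDWATER"
  flag[subdiv == "" & media == "WATER" & !has_gw] <- "SURFACE WATER"
  flag[grepl("SURFACE WATER", subdiv, fixed = TRUE)] <- "SURFACE WATER"
  flag[grepl("GROUNDWATER", subdiv, fixed = TRUE)] <- "GROUNDWATER"
  flag[grepl("SEDIMENT", subdiv, fixed = TRUE)] <- "SEDIMENT"
  # Soil/sediment media win even when groundwater fields are populated.
  flag[grepl("SOIL|SEDIMENT", media)] <- "SEDIMENT"
  flag
}

# Coerce anything outside the core set (including NA and "") to OTHER.
normalize_flag <- function(flag) {
  core <- c("SURFACE WATER", "GROUNDWATER", "SEDIMENT", "OTHER")
  flag[is.na(flag) | !(flag %in% core)] <- "OTHER"
  flag
}

toy <- data.frame(
  ActivityMediaName = c("Water", "Water", "Sediment", "Air", "Water"),
  ActivityMediaSubdivisionName = c(NA, "", "Groundwater", NA, "Groundwater"),
  AquiferName = c(NA, "Example aquifer", "Example aquifer", NA, NA),
  WellDepthMeasure.MeasureValue = c(NA, 30, NA, NA, NA),
  stringsAsFactors = FALSE
)
toy_flags <- classify_media(toy, c("AquiferName", "WellDepthMeasure.MeasureValue"))
normalized <- normalize_flag(c("HABITAT", "AIR", "", NA, "SEDIMENT"))
```

Row 3 of the toy data shows the sediment override: it has a populated aquifer name and a "Groundwater" subdivision, yet its "Sediment" media name decides the flag.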
Requirements and defaults:
- Required columns: MonitoringLocationTypeName, ActivityMediaSubdivisionName, and AquiferName. If any required columns are missing (rare), the function stops with an error of the form "Missing required columns:, , ...".
- Optional columns are created if missing (with appropriate types) and filled with NA: ActivityMediaName, AquiferTypeName, LocalAqfrName, ConstructionDateText, WellDepthMeasure.MeasureValue (numeric), WellDepthMeasure.MeasureUnitCode, WellHoleDepthMeasure.MeasureValue (numeric), WellHoleDepthMeasure.MeasureUnitCode.
- If the input data frame has 0 rows, the function emits a message and returns NULL.
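A column check of this kind might look like the sketch below, written in base R. prepare_media_columns is a hypothetical helper, not the package's internal code. It assumes that the non-numeric optional columns are created as character, that the 0-row check runs before the column check, and that the error lists the missing names separated by commas.

```r
# Hypothetical helper (not part of TADA): validate required columns and add
# any missing optional columns filled with typed NA values.
prepare_media_columns <- function(df) {
  # Assumption: the empty-input check happens first.
  if (nrow(df) == 0) {
    message("Input has 0 rows; returning NULL.")
    return(NULL)
  }

  required <- c(
    "MonitoringLocationTypeName", "ActivityMediaSubdivisionName", "AquiferName"
  )
  missing_req <- setdiff(required, names(df))
  if (length(missing_req) > 0) {
    stop("Missing required columns: ", paste(missing_req, collapse = ", "))
  }

  # Assumption: optional text columns are character.
  optional_chr <- c(
    "ActivityMediaName", "AquiferTypeName", "LocalAqfrName",
    "ConstructionDateText", "WellDepthMeasure.MeasureUnitCode",
    "WellHoleDepthMeasure.MeasureUnitCode"
  )
  optional_num <- c(
    "WellDepthMeasure.MeasureValue", "WellHoleDepthMeasure.MeasureValue"
  )
  for (col in setdiff(optional_chr, names(df))) {
    df[[col]] <- rep(NA_character_, nrow(df))
  }
  for (col in setdiff(optional_num, names(df))) {
    df[[col]] <- rep(NA_real_, nrow(df))
  }
  df
}

toy <- data.frame(
  MonitoringLocationTypeName = "Stream",
  ActivityMediaSubdivisionName = NA_character_,
  AquiferName = NA_character_,
  stringsAsFactors = FALSE
)
prepared <- prepare_media_columns(toy)
names(prepared)
```

Creating the numeric columns as NA_real_ keeps later numeric presence checks from being confused by a logical or character NA column.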
Messages and warnings
- Always prints counts by media before filtering.
- When clean = TRUE, prints a single summary message indicating whether rows were removed, whether no rows matched the selected media, or whether no toggles were selected.
- Warns if all media toggles are TRUE, and if all rows were removed by the filter.
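For readers who want a similar one-line summary in their own scripts, the sketch below builds a counts string in the style of the messages shown in the examples. media_counts_message is a hypothetical helper written in base R; it is not how the package itself builds its messages.

```r
# Hypothetical helper (not part of TADA): format counts per core media type
# with thousands separators, joined by " | ".
media_counts_message <- function(flag, before_filter = FALSE) {
  core <- c("SURFACE WATER", "GROUNDWATER", "SEDIMENT", "OTHER")
  # factor() with fixed levels keeps zero-count media in the summary.
  counts <- table(factor(flag, levels = core))
  parts <- paste0(
    names(counts), ": ",
    format(as.integer(counts), big.mark = ",", trim = TRUE)
  )
  label <- if (before_filter) {
    "Counts by media (before filter) - "
  } else {
    "Counts by media - "
  }
  paste0("TADA_MediaFilter: ", label, paste(parts, collapse = " | "))
}

toy_flag <- rep(
  c("SURFACE WATER", "GROUNDWATER", "SEDIMENT", "OTHER"),
  times = c(1200, 3, 0, 1)
)
message(media_counts_message(toy_flag))
```

Passing trim = TRUE to format() avoids the padding that would otherwise align all counts to the width of the largest one.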
Examples
utils::data(Data_R5_TADAPackageDemo)
# Example 1: Do not clean; classify media and add TADA.Media.Flag
Data_Flag <- TADA_MediaFilter(
  Data_R5_TADAPackageDemo,
  clean = FALSE
)
#> TADA_MediaFilter: Counts by media - SURFACE WATER: 166,373 | GROUNDWATER: 3,315 | SEDIMENT: 245 | OTHER: 2,244
#> TADA_MediaFilter: Returning all results with TADA.Media.Flag; media toggles ignored because clean = FALSE.
unique(Data_Flag$TADA.Media.Flag)
#> [1] "SURFACE WATER" "GROUNDWATER" "SEDIMENT" "OTHER"
# Example 2: Remove groundwater and sediment; no flag column in output
Data_Clean1 <- TADA_MediaFilter(
  Data_R5_TADAPackageDemo,
  clean = TRUE,
  ground_water = TRUE,
  sediment = TRUE
)
#> TADA_MediaFilter: Counts by media (before filter) - SURFACE WATER: 166,373 | GROUNDWATER: 3,315 | SEDIMENT: 245 | OTHER: 2,244
#> TADA_MediaFilter: Removed 3,560 rows matching media types: GROUNDWATER, SEDIMENT. Returning cleaned data without flag columns.
"TADA.Media.Flag" %in% names(Data_Clean1) # FALSE
#> [1] FALSE
# Example 3: Keep only surface water by removing groundwater, sediment, and other
Data_Clean2 <- TADA_MediaFilter(
  Data_R5_TADAPackageDemo,
  clean = TRUE,
  ground_water = TRUE,
  sediment = TRUE,
  other = TRUE
)
#> TADA_MediaFilter: Counts by media (before filter) - SURFACE WATER: 166,373 | GROUNDWATER: 3,315 | SEDIMENT: 245 | OTHER: 2,244
#> TADA_MediaFilter: Removed 5,804 rows matching media types: GROUNDWATER, SEDIMENT, OTHER. Returning cleaned data without flag columns.
# Example 4: Remove surface water only
Data_Clean3 <- TADA_MediaFilter(
  Data_R5_TADAPackageDemo,
  clean = TRUE,
  surface_water = TRUE
)
#> TADA_MediaFilter: Counts by media (before filter) - SURFACE WATER: 166,373 | GROUNDWATER: 3,315 | SEDIMENT: 245 | OTHER: 2,244
#> TADA_MediaFilter: Removed 166,373 rows matching media types: SURFACE WATER. Returning cleaned data without flag columns.