Review and Apply Any Site-specific Criteria to Monitoring Location Sites or Assessment Units

This function pulls all unique MonitoringLocationName, MonitoringLocationType, and MonitoringLocationIdentifier values from the TADA dataframe and joins them to the output of TADA_UsesForAnalysis. Users who only want to summarize assessments at the monitoring location level do not need to provide a crosswalk between WQP monitoring locations and assessment units.

Usage

TADA_MLSummary(
  .data,
  org_id = NULL,
  usesRef = NULL,
  AUMLRef = NULL,
  AU_UsesRef = NULL,
  MLSummaryRef = NULL,
  displayNA = FALSE,
  excel = FALSE,
  overwrite = FALSE
)

Arguments

.data: A TADA dataframe after all desired data cleaning, processing, harmonization, filtering, and censored data handling functions have been applied.
org_id: The ATTAINS organization identifier, supplied by the user. "USEPA" may be included as an org_id, which will populate the EPA 304(a) recommended criteria for any TADA.CharacteristicName if one is found. "All" or "NULL" are also allowable values and may be helpful for new ATTAINS users or those performing assessments for multiple states and tribes. If "All" is selected, all prior ATTAINS information from all ATTAINS organizations in prior ATTAINS assessment cycles is returned as individual rows for each organization. If "NULL" is selected, all unique prior ATTAINS information from any ATTAINS organization is returned, but it is not labeled by organization and can be manually edited. Enter rExpertQuery::EQ_DomainValues("org_id") into the console to get a list of valid organization identifiers. A list of organization identifiers can also be found by downloading the ATTAINS Domains Excel file: https://www.epa.gov/system/files/other-files/2025-02/domains_2025-02-25.xlsx. Organization identifiers are listed in the "code" column of the "OrgName" tab.
usesRef: A data frame which contains a completed crosswalk of ATTAINS.ParameterName(s) that will be analyzed for each ATTAINS.UseName. Users will need to ensure this crosswalk contains the appropriate column names in order to run the function. Users who have previously completed this crosswalk table can re-use it and review this output for accuracy.
AUMLRef: An optional data frame input. If provided, this data frame should contain a completed crosswalk of monitoring location sites associated with an assessment unit. This data frame must contain the following column names which can be generated from the output of TADA_CreateAUMLCrosswalk: ATTAINS.OrganizationIdentifier, TADA.MonitoringLocationIdentifier, ATTAINS.AssessmentUnitIdentifier, and ATTAINS.WaterType.
AU_UsesRef: An optional data frame input. If provided, the ATTAINS.UseName will be populated from the ATTAINS.UseName found in this data frame rather than the ATTAINS assessment profile. This data frame must contain the following column names which can be generated from the output of TADA_AssignUsesToAU: ATTAINS.OrganizationIdentifier, ATTAINS.AssessmentUnitIdentifier, ATTAINS.UseName, and ATTAINS.WaterType.
MLSummaryRef: An optional data frame which contains the completed spatial crosswalk to assign any unique spatial criteria to a parameter, use, waterbody or monitoring site/assessment unit. If provided the data frame must contain these columns: "ATTAINS.OrganizationIdentifier", "ATTAINS.AssessmentUnitIdentifier", "MonitoringLocationIdentifier", "MonitoringLocationTypeName", "TADA.ComparableDataIdentifier", "ATTAINS.ParameterName", "ATTAINS.UseName", "ATTAINS.WaterType", "SaltFresh", "DepthCategory", "LongitudeMeasure", "LatitudeMeasure", "IncludeOrExclude" and "UniqueSpatialCriteria".
displayNA: A Boolean value. If TRUE, the user can view MLSummaryRef for all uses and parameters assigned to an ML or AU, regardless of whether that site contains WQP data for that parameter. This is useful if a user wants an explicit list of everything that will be analyzed. Default is FALSE.
excel: A Boolean value. If excel = TRUE, the function returns an Excel spreadsheet, which is created in the user's Downloads folder. If you have any trouble locating the file, type the following into your R console to find the folder: file.path(Sys.getenv("USERPROFILE"), "Downloads"). The file will be named "myfileRef.xlsx". The spreadsheet highlights the cells in which users should input information.
overwrite: A Boolean value. If overwrite = TRUE, re-running this function replaces (overwrites) the existing Excel file with the new one. Users should specify overwrite = TRUE only when they have already run the function once and are ready to re-run it.
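
The excel argument points to file.path(Sys.getenv("USERPROFILE"), "Downloads"), and USERPROFILE is normally set only on Windows. The sketch below is one way to look for the spreadsheet on any platform. The helper guess_downloads_dir() is hypothetical (not part of TADA), and the fallback to the home directory is an assumption. It only helps you search for the file and does not change where the function writes it.

```r
# Hypothetical helper (not part of TADA): guess the Downloads folder.
guess_downloads_dir <- function() {
  home <- Sys.getenv("USERPROFILE")            # usually set on Windows only
  if (!nzchar(home)) home <- path.expand("~")  # assumption: fall back to the home directory
  file.path(home, "Downloads")
}

# File name taken from the excel argument description above.
ref_path <- file.path(guess_downloads_dir(), "myfileRef.xlsx")

if (file.exists(ref_path)) {
  message("Found: ", ref_path)
} else {
  message("Not found at: ", ref_path)
}
```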

Value

A data frame with any unique spatial descriptions defined with columns: "ATTAINS.OrganizationIdentifier", "ATTAINS.AssessmentUnitIdentifier", "MonitoringLocationIdentifier", "MonitoringLocationTypeName", "TADA.ComparableDataIdentifier", "ATTAINS.ParameterName", "ATTAINS.UseName", "ATTAINS.WaterType", "SaltFresh", "DepthCategory", "LongitudeMeasure", "LatitudeMeasure", "IncludeOrExclude" and "UniqueSpatialCriteria".

Details

If users are interested in summarizing water quality data results by assessment unit, they will need to provide both an AUMLRef and an AU_UsesRef data frame. Before this step, use the TADA Module 2 tools to build the monitoring location to assessment unit crosswalk (see TADA_GetATTAINSAUMLCrosswalk, TADA_CreateAUMLCrosswalk, and TADA_GetATTAINSByAUID) and the uses to assessment unit crosswalk (see TADA_CreateWaterusesRef and TADA_AssignUsesToAU).
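
Before passing AUMLRef or AU_UsesRef, it can help to confirm that each data frame has the column names listed under Arguments. The following is a minimal base R sketch, not TADA's own validation. The helper missing_cols() is hypothetical, and the single-row data frame contains made-up placeholder values for illustration only.

```r
# Required column names, copied from the AUMLRef and AU_UsesRef argument descriptions.
auml_cols <- c(
  "ATTAINS.OrganizationIdentifier", "TADA.MonitoringLocationIdentifier",
  "ATTAINS.AssessmentUnitIdentifier", "ATTAINS.WaterType"
)
au_uses_cols <- c(
  "ATTAINS.OrganizationIdentifier", "ATTAINS.AssessmentUnitIdentifier",
  "ATTAINS.UseName", "ATTAINS.WaterType"
)

# Hypothetical helper (not part of TADA): required columns that df lacks.
missing_cols <- function(df, required) setdiff(required, names(df))

# Made-up placeholder crosswalk row; real values come from your own crosswalk.
AUMLRef_example <- data.frame(
  ATTAINS.OrganizationIdentifier = "UTAHDWQ",
  TADA.MonitoringLocationIdentifier = "EXAMPLE-ML-001",
  ATTAINS.AssessmentUnitIdentifier = "EXAMPLE-AU-001",
  ATTAINS.WaterType = "EXAMPLE-WATER-TYPE",
  stringsAsFactors = FALSE
)

missing_cols(AUMLRef_example, auml_cols)     # nothing missing for AUMLRef
missing_cols(AUMLRef_example, au_uses_cols)  # lacks ATTAINS.UseName for AU_UsesRef
```

An empty result means all required names are present. It says nothing about whether the values themselves are valid.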

Users can apply any unique site-specific criteria (for example, warm waters, cold waters, water classifications, species-based waters, or ecoregions) to any monitoring location sites or assessment units as needed. The Excel file is recommended for this step because it allows easy filtering across columns.
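
For users who prefer to stay in R rather than edit the spreadsheet, the same kind of edit can be sketched on a data frame. The example below uses a made-up three-row table with only a few of the MLSummaryRef columns. The site identifiers, the location types, and the "Cold water" label are illustrative assumptions, since this page does not list the accepted values for UniqueSpatialCriteria.

```r
# Made-up subset of an MLSummaryRef-style table (illustration only).
ml_ref <- data.frame(
  MonitoringLocationIdentifier = c("EXAMPLE-ML-001", "EXAMPLE-ML-002", "EXAMPLE-ML-003"),
  MonitoringLocationTypeName = c("Stream", "Lake", "Stream"),
  UniqueSpatialCriteria = NA_character_,
  stringsAsFactors = FALSE
)

# Flag the rows that should receive the site-specific label.
is_stream <- ml_ref$MonitoringLocationTypeName == "Stream"

# Assign the illustrative label to those rows only; other rows stay NA.
ml_ref$UniqueSpatialCriteria[is_stream] <- "Cold water"

ml_ref
```

A table edited this way, with all the required columns present, would presumably be supplied back to the function through the MLSummaryRef argument.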

Examples

if (FALSE) { # \dontrun{
# First, generate and fill out a parameter crosswalk (see TADA_ParametersForAnalysis()):
paramRef_UT <- TADA_ParametersForAnalysis(Data_Nutrients_UT, org_id = "UTAHDWQ", excel = FALSE)
paramRef_UT2 <- dplyr::mutate(paramRef_UT, ATTAINS.ParameterName = dplyr::case_when(
  grepl("AMMONIA", TADA.ComparableDataIdentifier) ~ "AMMONIA, TOTAL",
  grepl("NITRATE", TADA.ComparableDataIdentifier) ~ "NITRATE",
  grepl("NITROGEN", TADA.ComparableDataIdentifier) ~ "NITRATE/NITRITE (NITRITE + NITRATE AS N)"
))
paramRef_UT3 <- TADA_ParametersForAnalysis(
  Data_Nutrients_UT,
  paramRef = paramRef_UT2, org_id = "UTAHDWQ", excel = FALSE
)

# Next, enter the crosswalk generated above as the paramRef function input
# for TADA_UsesForAnalysis():
usesRef_UT <- TADA_UsesForAnalysis(
  Data_Nutrients_UT,
  paramRef = paramRef_UT3, org_id = c("UTAHDWQ"), excel = FALSE
)

# Now, run TADA_MLSummary()
MLSummaryRef_UT <- TADA_MLSummary(
  Data_Nutrients_UT,
  org_id = c("UTAHDWQ"),
  AU_UsesRef = NULL, AUMLRef = NULL,
  usesRef = usesRef_UT,
  excel = FALSE
)
} # }