Create or Update ATTAINS Parameter and Use crosswalk
Source: R/ATTAINSCrosswalks.R
TADA_UsesForAnalysis.Rd
This function generates a crosswalk of all parameters and uses applicable to your WQP/TADA data frame and selected organization(s) in ATTAINS. Users should review and validate each ATTAINS.ParameterName and associated ATTAINS.UseName combination. As part of this review, users should confirm that each ATTAINS.UseName generated by this function corresponds to the correct TADA.ComparableDataIdentifier and ATTAINS.ParameterName found in the TADA data frame. Run this function after creating your parameter (ATTAINS.ParameterName and TADA.ComparableDataIdentifier) crosswalk.
Usage
TADA_UsesForAnalysis(
.data,
org_id = NULL,
paramRef = NULL,
usesRef = NULL,
AU_UsesRef = NULL,
AUMLRef = NULL,
auto_assign = FALSE,
excel = FALSE,
overwrite = FALSE
)
Arguments
- .data
A TADA dataframe after all desired data cleaning, processing, harmonization, filtering, and censored data handling functions have been applied.
- org_id
The ATTAINS organization identifier must be supplied by the user. "USEPA" may be included as an org_id which will populate the EPA 304(a) recommended criteria for any TADA.CharacteristicName if one is found. "All" or "NULL" are also allowable values and may be helpful for new ATTAINS users or those performing assessments for multiple states and tribes. If "All" is selected, this will return all prior ATTAINS information from all ATTAINS organizations in prior ATTAINS assessment cycles as individual rows for each organization. If "NULL" is selected, all unique prior ATTAINS information from any ATTAINS organizations is returned but is not labeled and can be manually edited. Enter rExpertQuery::EQ_DomainValues("org_id") into the console to get a list of valid organization identifiers. A list of organization identifiers can also be found by downloading the ATTAINS Domains Excel file: https://www.epa.gov/system/files/other-files/2025-02/domains_2025-02-25.xlsx. Organization identifiers are listed in the "code" column of the "OrgName" tab.
- paramRef
A data frame which contains a completed crosswalk between TADA.ComparableDataIdentifier(s) and ATTAINS.ParameterName(s). This data frame must contain at least these two column names: TADA.ComparableDataIdentifier and ATTAINS.ParameterName. Users who are interested in performing analyses for more than one organization (multiple states and/or tribes) also need to include an additional column name: 'ATTAINS.OrganizationIdentifier'.
- usesRef
A data frame which contains a completed crosswalk of ATTAINS.ParameterName(s) that will be analyzed for each ATTAINS.UseName. Users will need to ensure this crosswalk contains the appropriate column names in order to run the function. Users who have previously completed this crosswalk table can re-use it and review this output for accuracy.
- AU_UsesRef
An optional data frame input. If provided, the ATTAINS.UseName will be populated from the ATTAINS.UseName found in this data frame rather than the ATTAINS assessment profile. This data frame must contain the following column names which can be generated from the output of TADA_AssignUsesToAU: ATTAINS.OrganizationIdentifier, ATTAINS.AssessmentUnitIdentifier, ATTAINS.UseName, and ATTAINS.WaterType.
- AUMLRef
An optional data frame input. If provided, this data frame should contain a completed crosswalk of monitoring location sites associated with an assessment unit. This data frame must contain the following column names which can be generated from the output of TADA_CreateAUMLCrosswalk: ATTAINS.OrganizationIdentifier, TADA.MonitoringLocationIdentifier, ATTAINS.AssessmentUnitIdentifier, and ATTAINS.WaterType.
- auto_assign
A Boolean value. If TRUE, this will assign all unique ATTAINS.UseName values to an ATTAINS.ParameterName if that parameter has not been included in prior ATTAINS assessment cycles for that ATTAINS.OrganizationIdentifier. If FALSE, the value for ATTAINS.UseName will be left blank for that ATTAINS.ParameterName and you will need to manually assign the use names as needed.
- excel
A Boolean value. If excel = TRUE, the function returns an Excel spreadsheet, which is created in the user's downloads folder path. If you have any trouble locating the file, please type the following into your R console to locate it: file.path(Sys.getenv("USERPROFILE"), "Downloads"). The file will be named "myfileRef.xlsx". The Excel spreadsheet will highlight the cells in which users should input information.
- overwrite
A Boolean value. If overwrite = TRUE, the Excel file will be replaced (overwritten) by the new file you create if you re-run this function. Users should only specify overwrite = TRUE once they are ready to re-run this function after having already run it once.
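The required column names listed above for paramRef, AU_UsesRef, and AUMLRef can be checked before calling the function. The following is a minimal base R sketch under stated assumptions: missing_ref_cols() is a hypothetical helper that is not part of the package, and the demo data frame holds made-up values.

```r
# Required column names, copied from the argument descriptions above.
required_cols <- list(
  paramRef = c("TADA.ComparableDataIdentifier", "ATTAINS.ParameterName"),
  AU_UsesRef = c(
    "ATTAINS.OrganizationIdentifier", "ATTAINS.AssessmentUnitIdentifier",
    "ATTAINS.UseName", "ATTAINS.WaterType"
  ),
  AUMLRef = c(
    "ATTAINS.OrganizationIdentifier", "TADA.MonitoringLocationIdentifier",
    "ATTAINS.AssessmentUnitIdentifier", "ATTAINS.WaterType"
  )
)

# Hypothetical helper (not part of the package): returns the required
# columns that are absent from a reference data frame.
missing_ref_cols <- function(df, ref_type) {
  setdiff(required_cols[[ref_type]], names(df))
}

# Made-up AU_UsesRef for illustration; ATTAINS.WaterType is left out on purpose.
AU_UsesRef_demo <- data.frame(
  ATTAINS.OrganizationIdentifier = "UTAHDWQ",
  ATTAINS.AssessmentUnitIdentifier = "AU-001",
  ATTAINS.UseName = "Example use",
  stringsAsFactors = FALSE
)

# Lists the columns that still need to be added.
missing_ref_cols(AU_UsesRef_demo, "AU_UsesRef")
```

An empty result means every column named in the argument description is present. The check does not validate the values in those columns.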
Value
A dataframe which contains the columns: TADA.ComparableDataIdentifier, ATTAINS.OrganizationIdentifier, ATTAINS.ParameterName, and ATTAINS.FlagUseName. Users will need to review the crosswalk between ATTAINS.ParameterName, ATTAINS.UseName and TADA.ComparableDataIdentifier.
Details
Before running this function, users must run TADA_ParametersForAnalysis() to create the crosswalk that defines the ATTAINS.ParameterName(s) needing validation. All unique ATTAINS.UseNames from prior ATTAINS assessment cycles are pulled in using ATTAINS Expert Query in this function. If a user has defined multiple TADA.ComparableDataIdentifier matches to an ATTAINS.ParameterName, they will need to define whether every TADA.ComparableDataIdentifier matches to an associated ATTAINS.UseName. If certain parameter and use combinations only apply to certain TADA.ComparableDataIdentifier(s), users will need to select 'Exclude' or select a blank value for the ATTAINS.UseName to properly capture this logic.
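To find the ATTAINS.ParameterName values that have more than one TADA.ComparableDataIdentifier match, and therefore need the per-identifier review described above, a base R sketch along these lines may help. The identifiers and rows below are made up for illustration; only the two column names come from this page.

```r
# Made-up paramRef rows; the identifier strings are illustrative only.
paramRef_demo <- data.frame(
  TADA.ComparableDataIdentifier = c(
    "AMMONIA_TOTAL_AS N_MG/L",
    "AMMONIA_DISSOLVED_AS N_MG/L",
    "NITRATE_TOTAL_AS N_MG/L"
  ),
  ATTAINS.ParameterName = c("AMMONIA, TOTAL", "AMMONIA, TOTAL", "NITRATE"),
  stringsAsFactors = FALSE
)

# Count distinct identifiers per parameter name.
id_counts <- tapply(
  paramRef_demo$TADA.ComparableDataIdentifier,
  paramRef_demo$ATTAINS.ParameterName,
  function(ids) length(unique(ids))
)

# Parameter names matched to more than one identifier.
multi_match <- names(id_counts)[id_counts > 1]
multi_match
```

Each parameter name returned here has several identifiers, so each identifier and use combination should be reviewed and set to 'Exclude' or blank where it does not apply.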
If an ATTAINS use name is not listed as a prior domain value for your organization from prior ATTAINS assessment cycles, users can contact the ATTAINS helpdesk (attains@epa.gov) to inquire about adding the use to the ATTAINS domain list. However, even when these new uses are submitted to ATTAINS, they cannot be retrieved from ATTAINS assessment profiles until the current/new assessment cycle is approved.
Thus, if a user has a list of new use names that cannot be pulled from ATTAINS, they should consider using the AU_UsesRef argument input or the usesRef argument input, which would specify that the use names should come from a user-supplied list rather than from prior ATTAINS assessment cycles. If a list of use names comes from the AU_UsesRef, this function will apply any new use names to an ATTAINS parameter name, found in your paramRef argument input, by joining the ATTAINS.WaterType of the AUs defined in your AU_UsesRef to the ATTAINS.WaterType found from ATTAINS Expert Query.
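The water type join described above can be pictured with a small base R sketch. This is an illustration of the idea only, not the package's internal code: both data frames are made-up stand-ins, and param_water_demo is a hypothetical substitute for the parameter and water type pairs that ATTAINS Expert Query would supply.

```r
# Made-up AU_UsesRef-style input: user-supplied uses by assessment unit.
AU_UsesRef_demo <- data.frame(
  ATTAINS.OrganizationIdentifier = c("UTAHDWQ", "UTAHDWQ"),
  ATTAINS.AssessmentUnitIdentifier = c("AU-001", "AU-002"),
  ATTAINS.UseName = c("New Use A", "New Use B"),
  ATTAINS.WaterType = c("RIVER", "LAKE"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for parameter / water type pairs from Expert Query.
param_water_demo <- data.frame(
  ATTAINS.ParameterName = c("AMMONIA, TOTAL", "NITRATE"),
  ATTAINS.WaterType = c("RIVER", "RIVER"),
  stringsAsFactors = FALSE
)

# Inner join on water type: a use is paired with a parameter only when
# both share the same ATTAINS.WaterType.
joined <- merge(
  unique(AU_UsesRef_demo[, c("ATTAINS.UseName", "ATTAINS.WaterType")]),
  param_water_demo,
  by = "ATTAINS.WaterType"
)
joined
```

In this made-up data, "New Use B" applies only to a lake assessment unit, so it is not paired with either river parameter.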
Alternatively, users can still proceed by overriding the data validation in Excel by pasting values into the cells. Users will be warned in the ATTAINS.FlagUseName column if they choose to include an ATTAINS use name that was not listed in prior ATTAINS assessment cycles. The warning will read either 'Use name has not been assessed in prior cycles by this organization' or 'Use name has been assessed in prior cycles by this organization, but not for this parameter name'.
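Rows carrying either warning can be pulled out for review with a short base R filter. The sketch below assumes the flag column holds exactly the warning strings quoted above and uses NA as a placeholder for unflagged rows; both are assumptions, and the data frame is made up.

```r
# Warning strings, copied from the text above.
flag_new_use <- "Use name has not been assessed in prior cycles by this organization"
flag_new_pair <- paste0(
  "Use name has been assessed in prior cycles by this organization, ",
  "but not for this parameter name"
)

# Made-up stand-in for a completed crosswalk. NA marks an unflagged row here;
# the value the function actually writes for unflagged rows is not assumed.
usesRef_demo <- data.frame(
  ATTAINS.ParameterName = c("AMMONIA, TOTAL", "NITRATE", "NITRATE"),
  ATTAINS.UseName = c("Use A", "Use B", "Use C"),
  ATTAINS.FlagUseName = c(NA, flag_new_use, flag_new_pair),
  stringsAsFactors = FALSE
)

# Keep only rows whose flag matches one of the two warnings.
needs_review <- usesRef_demo[
  usesRef_demo$ATTAINS.FlagUseName %in% c(flag_new_use, flag_new_pair),
]
needs_review
```

If the real flag text includes extra wording, a partial match such as grepl("Use name has", ..., fixed = TRUE) may be a better fit than the exact comparison used here.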
Examples
# First, generate and fill out a parameter crosswalk (see TADA_ParametersForAnalysis()):
paramRef_UT <- TADA_ParametersForAnalysis(Data_Nutrients_UT, org_id = "UTAHDWQ", excel = FALSE)
paramRef_UT2 <- dplyr::mutate(paramRef_UT, ATTAINS.ParameterName = dplyr::case_when(
grepl("AMMONIA", TADA.ComparableDataIdentifier) ~ "AMMONIA, TOTAL",
grepl("NITRATE", TADA.ComparableDataIdentifier) ~ "NITRATE",
grepl("NITROGEN", TADA.ComparableDataIdentifier) ~ "NITRATE/NITRITE (NITRITE + NITRATE AS N)"
))
paramRef_UT3 <- TADA_ParametersForAnalysis(
Data_Nutrients_UT,
paramRef = paramRef_UT2, org_id = "UTAHDWQ", excel = FALSE
)
paramRef_UT4 <- TADA_ParametersForAnalysis(
Data_Nutrients_UT,
org_id = "UTAHDWQ", auto_assign = "All", excel = FALSE
)
#> [1] "auto_assign == 'All' was selected, finding an exact ATTAINS.ParameterName match for each TADA.ComparableDataIdentifier - by WQP CharacteristicName if one is found."
# Next, enter the crosswalk generated above as the paramRef function input
# for TADA_UsesForAnalysis():
usesRef_UT <- TADA_UsesForAnalysis(
Data_Nutrients_UT,
paramRef = paramRef_UT3, org_id = c("UTAHDWQ"), excel = FALSE
)
# Now, let's compare the crosswalk for paramRef_UT4 when we use auto_assign = "All".
# Notice that there are NA values for ATTAINS.UseName because these UT ATTAINS
# parameter names were not listed as causes in prior ATTAINS assessment cycles.
usesRef_UT2 <- TADA_UsesForAnalysis(
Data_Nutrients_UT,
paramRef = paramRef_UT4, org_id = c("UTAHDWQ"), excel = FALSE
)
# Let's test the "auto_assign" input
usesRef_UT3 <- TADA_UsesForAnalysis(
Data_Nutrients_UT,
paramRef = paramRef_UT4, auto_assign = TRUE, org_id = c("UTAHDWQ"), excel = FALSE
)
#> [1] "auto_assign == TRUE was selected, assigning all unique ATTAINS.UseName, by ATTAINS.OrganizationIdentifier, to any ATTAINS.ParameterName that an organization have not done assessments for in prior ATTAINS cycle. Please review carefully and Exclude rows as needed."
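After running examples like those above, the parameter names that still have no ATTAINS.UseName can be listed for manual assignment. The sketch below uses a made-up stand-in data frame instead of a real TADA_UsesForAnalysis() result, so it runs without the package; the identifiers and use name are illustrative only.

```r
# Stand-in for a TADA_UsesForAnalysis() result; all values are made up.
usesRef_demo <- data.frame(
  TADA.ComparableDataIdentifier = c("ID_1", "ID_2", "ID_3"),
  ATTAINS.ParameterName = c(
    "AMMONIA, TOTAL",
    "NITRATE",
    "NITRATE/NITRITE (NITRITE + NITRATE AS N)"
  ),
  ATTAINS.UseName = c("Example use", NA, NA),
  stringsAsFactors = FALSE
)

# Parameter names with no use assigned yet.
unassigned <- unique(
  usesRef_demo$ATTAINS.ParameterName[is.na(usesRef_demo$ATTAINS.UseName)]
)
unassigned
```

The same check could be applied to usesRef_UT2 and usesRef_UT3 to compare results with and without auto_assign = TRUE.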