Create or Update ATTAINS Parameter and Use Crosswalk
Source: R/ATTAINSCrosswalks.R
TADA_CreateUseParamRef.Rd
This function generates a crosswalk of all parameters and uses applicable to the selected organization(s) in ATTAINS. Users should review and validate each ATTAINS.ParameterName and associated ATTAINS.UseName combination. As part of this review, users should confirm that each ATTAINS.UseName selected from the drop-down menu in the Excel spreadsheet generated by this function corresponds to the correct TADA.ComparableDataIdentifier and ATTAINS.ParameterName in the TADA dataframe. Run this function after creating a parameter crosswalk (ATTAINS.ParameterName and TADA.ComparableDataIdentifier).
Usage
TADA_CreateUseParamRef(
  .data,
  org_id = NULL,
  paramRef = NULL,
  useParamRef = NULL,
  auto_assign = FALSE,
  excel = FALSE,
  overwrite = FALSE
)
Arguments
- .data
A TADA dataframe. The user should run all desired data cleaning, processing, harmonization, filtering, and handling of censored data functions prior to running this function.
- org_id
The ATTAINS organization identifier, supplied by the user. A list of organization identifiers can be found by downloading the ATTAINS Domains Excel file: https://www.epa.gov/system/files/other-files/2025-02/domains_2025-02-25.xlsx. Organization identifiers are listed in the "OrgName" tab. The "code" column contains the organization identifiers that should be used for this argument. If org_id is not provided, the function attempts to identify which organization identifier(s) to include based on the unique ATTAINS organization identifiers found in the dataframe.
- paramRef
A dataframe that contains a completed crosswalk between TADA.ComparableDataIdentifier and ATTAINS.ParameterName. Users must ensure this crosswalk contains the appropriate column names in order to run the function. paramRef must contain at least these two columns: TADA.ComparableDataIdentifier and ATTAINS.ParameterName. Users who want to perform analyses for more than one organization (multiple states and/or tribes) must also include an additional column: ATTAINS.OrganizationIdentifier.
- useParamRef
A dataframe that contains a completed crosswalk of organization-specific ATTAINS.UseName(s) for each ATTAINS.ParameterName. Users must ensure this crosswalk contains the appropriate column names in order to run the function. Users who have previously completed this crosswalk table can reuse it and review the output for accuracy.
- auto_assign
NOTE: this has not been developed; will this be helpful? A Boolean value. If TRUE, all unique use names are assigned to any ATTAINS.ParameterName that your organization has not previously defined in ATTAINS. If FALSE, the values are left blank and users will need to assign use names manually as needed.
- excel
A Boolean value. If excel = TRUE, the function returns an Excel spreadsheet, created in the user's Downloads folder. If you have trouble locating the file, run the following in your R console: file.path(Sys.getenv("USERPROFILE"), "Downloads"). The file will be named "myfileRef.xlsx". The spreadsheet highlights the cells in which users should enter information.
- overwrite
A Boolean value that ensures the function will not overwrite the user-supplied crosswalk entered into this function via the paramRef function input. This helps prevent users from overwriting their progress.
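The column requirements for paramRef described above can be checked before calling the function. The following is a minimal sketch in base R; check_param_ref_cols() is a hypothetical helper written for illustration (it is not part of TADA), and the values in the toy crosswalk are made up.

```r
# Hypothetical helper (not part of TADA): confirm that a parameter crosswalk
# has the columns described for the paramRef argument.
check_param_ref_cols <- function(paramRef, multiple_orgs = FALSE) {
  required <- c("TADA.ComparableDataIdentifier", "ATTAINS.ParameterName")
  if (multiple_orgs) {
    # Needed when working with more than one organization
    required <- c(required, "ATTAINS.OrganizationIdentifier")
  }
  missing_cols <- setdiff(required, names(paramRef))
  if (length(missing_cols) > 0) {
    stop("paramRef is missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

# Toy crosswalk with made-up identifiers, for illustration only
toy_paramRef <- data.frame(
  TADA.ComparableDataIdentifier = c("ID_A", "ID_B"),
  ATTAINS.ParameterName = c("AMMONIA, TOTAL", "NITRATE"),
  stringsAsFactors = FALSE
)

# Passes for a single organization
check_param_ref_cols(toy_paramRef)

# Would stop with an error for multiple organizations, because
# ATTAINS.OrganizationIdentifier is absent from the toy crosswalk:
# check_param_ref_cols(toy_paramRef, multiple_orgs = TRUE)
```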
Value
A dataframe which contains the columns: TADA.ComparableDataIdentifier, ATTAINS.OrganizationIdentifier, ATTAINS.ParameterName, and ATTAINS.FlagUseName. Users will need to review the crosswalk between ATTAINS.ParameterName, ATTAINS.UseName and TADA.ComparableDataIdentifier.
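One way to start the review is to list the rows that still have no use name assigned. The sketch below uses a toy dataframe with made-up values; it assumes the returned crosswalk also carries an ATTAINS.UseName column alongside the columns listed above.

```r
# Toy stand-in for the returned crosswalk (made-up values; the
# ATTAINS.UseName column is an assumption for this illustration)
toy_useParamRef <- data.frame(
  TADA.ComparableDataIdentifier = c("ID_A", "ID_B", "ID_C"),
  ATTAINS.OrganizationIdentifier = "UTAHDWQ",
  ATTAINS.ParameterName = c("AMMONIA, TOTAL", "NITRATE", "NITRATE"),
  ATTAINS.UseName = c("Hypothetical use 1", NA, ""),
  ATTAINS.FlagUseName = NA_character_,
  stringsAsFactors = FALSE
)

# Rows where the use name is NA or blank still need a decision
needs_use <- is.na(toy_useParamRef$ATTAINS.UseName) |
  toy_useParamRef$ATTAINS.UseName == ""

to_review <- toy_useParamRef[
  needs_use,
  c("TADA.ComparableDataIdentifier", "ATTAINS.ParameterName")
]
print(to_review)
```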
Details
Before running this function, users must run TADA_CreateParamRef() to create the crosswalk that defines the ATTAINS.ParameterName(s) and ATTAINS.UseName(s) needing validation. All unique ATTAINS.UseNames from prior ATTAINS assessment cycles are pulled in by TADA_CreateUseParamRef(). If a user has matched multiple TADA.ComparableDataIdentifier values to one ATTAINS.ParameterName, they will need to define whether every TADA.ComparableDataIdentifier applies to each associated ATTAINS.UseName. If certain parameter and use combinations apply only to certain TADA.ComparableDataIdentifier(s), users will need to select 'NA' or leave the cell blank to capture this logic.
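To see which parameter names will need this extra attention, it can help to count how many identifiers map to each ATTAINS.ParameterName. This is a minimal base R sketch on a toy crosswalk with made-up identifiers; it is not part of the TADA workflow itself.

```r
# Toy parameter crosswalk (made-up identifiers, for illustration only)
toy_crosswalk <- data.frame(
  TADA.ComparableDataIdentifier = c("ID_A", "ID_B", "ID_C"),
  ATTAINS.ParameterName = c("NITRATE", "NITRATE", "AMMONIA, TOTAL"),
  stringsAsFactors = FALSE
)

# Count distinct identifiers per ATTAINS parameter name
ids_per_param <- tapply(
  toy_crosswalk$TADA.ComparableDataIdentifier,
  toy_crosswalk$ATTAINS.ParameterName,
  function(x) length(unique(x))
)

# Parameter names matched to more than one identifier; each of these
# needs a use-name decision per identifier
multi_match <- names(ids_per_param)[ids_per_param > 1]
print(multi_match)
```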
If an ATTAINS use name is not listed as a domain value for your organization from prior ATTAINS assessment cycles, users can contact the ATTAINS helpdesk (attains@epa.gov) to inquire about adding the use to the ATTAINS domain list. Otherwise, users can still proceed by overriding the data validation in Excel by pasting values. If users include an ATTAINS use name that was not listed in prior ATTAINS assessment cycles, they will be warned in the ATTAINS.FlagUseName column with one of these messages: 'Use name is not listed as a prior cause in ATTAINS for this organization' or 'Use name is listed as a prior cause in ATTAINS for this organization, but not for this parameter name'.
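The two warning messages suggest a two-step check: first against the organization's prior use names, then against the use names previously paired with the given parameter. The sketch below illustrates that logic only; flag_use_name() and the prior_uses table are hypothetical and made up for this example, and this is not how the package necessarily implements the flag.

```r
# Hypothetical prior domain values for one organization (made-up use names)
prior_uses <- data.frame(
  ATTAINS.ParameterName = c("NITRATE", "AMMONIA, TOTAL"),
  ATTAINS.UseName = c("Use A", "Use B"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the warning text implied by the messages
# quoted above, or NA when no warning applies.
flag_use_name <- function(param, use, prior) {
  if (is.na(use) || use == "") {
    return(NA_character_)
  }
  if (!use %in% prior$ATTAINS.UseName) {
    return("Use name is not listed as a prior cause in ATTAINS for this organization")
  }
  uses_for_param <- prior$ATTAINS.UseName[prior$ATTAINS.ParameterName == param]
  if (!use %in% uses_for_param) {
    return("Use name is listed as a prior cause in ATTAINS for this organization, but not for this parameter name")
  }
  NA_character_
}

flag_known <- flag_use_name("NITRATE", "Use A", prior_uses)       # no warning
flag_other_param <- flag_use_name("NITRATE", "Use B", prior_uses) # known use, other parameter
flag_unknown <- flag_use_name("NITRATE", "Use C", prior_uses)     # use not seen before
```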
Note: Future development work will allow for crosswalking other names from the WQP such as using pollutant names from the EPA's Criteria Search Tool (CST): www.epa.gov/wqs-tech/state-specific-water-quality-standards-effective-under-clean-water-act-cwa. The TADA Team has crosswalked the CST pollutant names for EPA304a standards with TADA.ComparableDataIdentifier(s) to make the criteria values available for use within TADA functions. The ATTAINS.UseName(s) associated with the EPA304a criteria are included from the CST. All other ATTAINS.UseName(s) are specific to an ATTAINS organization and come from the ATTAINS domain value for use_name.
Examples
# First, generate and fill out a parameter crosswalk (see TADA_CreateParamRef()):
paramRef_UT <- TADA_CreateParamRef(Data_Nutrients_UT, org_id = "UTAHDWQ", excel = FALSE)
paramRef_UT2 <- dplyr::mutate(paramRef_UT, ATTAINS.ParameterName = dplyr::case_when(
  grepl("AMMONIA", TADA.ComparableDataIdentifier) ~ "AMMONIA, TOTAL",
  grepl("NITRATE", TADA.ComparableDataIdentifier) ~ "NITRATE",
  grepl("NITROGEN", TADA.ComparableDataIdentifier) ~ "NITRATE/NITRITE (NITRITE + NITRATE AS N)"
))
paramRef_UT3 <- TADA_CreateParamRef(
  Data_Nutrients_UT,
  paramRef = paramRef_UT2, org_id = "UTAHDWQ", excel = FALSE
)
paramRef_UT4 <- TADA_CreateParamRef(
  Data_Nutrients_UT,
  org_id = "UTAHDWQ", auto_assign = "All", excel = FALSE
)
#> [1] "auto_assign == 'All' was selected, finding an exact ATTAINS.ParameterName match for each TADA.ComparableDataIdentifier - by WQP CharacteristicName if one is found."
# Next, enter the crosswalk generated above as the paramRef function input
# for TADA_CreateUseParamRef():
UseParamRef_UT <- TADA_CreateUseParamRef(
  Data_Nutrients_UT,
  paramRef = paramRef_UT3, org_id = c("UTAHDWQ"), excel = FALSE
)
# Now, compare with the crosswalk built from paramRef_UT4 (auto_assign = "All").
# Notice that there are NA values for ATTAINS.UseName because these UT ATTAINS
# parameter names were not listed as a cause in prior ATTAINS assessment cycles.
UseParamRef_UT2 <- TADA_CreateUseParamRef(
  Data_Nutrients_UT,
  paramRef = paramRef_UT4, org_id = c("UTAHDWQ"), excel = FALSE
)
# Let's test the "auto_assign" input
UseParamRef_UT3 <- TADA_CreateUseParamRef(
  Data_Nutrients_UT,
  paramRef = paramRef_UT4, auto_assign = TRUE, org_id = c("UTAHDWQ"), excel = FALSE
)
#> [1] "auto_assign == TRUE was selected, assigning all unique ATTAINS.UseName, by ATTAINS.OrganizationIdentifier, to any ATTAINS.ParameterName that an organization have not done assessments for in prior ATTAINS cycle. Please review carefully and Exclude rows as needed."