Create a comparable identifier by concatenating:

  • TADA.CharacteristicName

  • TADA.ResultSampleFractionText

  • TADA.MethodSpeciationName

  • TADA.ResultMeasure.MeasureUnitCode

Usage

TADA_CreateComparableID(.data)

Arguments

.data

A TADA dataframe (data.frame or tibble) with the required columns: TADA.CharacteristicName, TADA.ResultSampleFractionText, TADA.MethodSpeciationName, and TADA.ResultMeasure.MeasureUnitCode.

Value

The input dataframe with:

  • harmonized fraction, speciation, and unit fields, in which missing values and any case variant of "none" are set to "NONE"

  • a character column TADA.ComparableDataIdentifier

Details

Harmonization:

  • TADA.ResultSampleFractionText, TADA.MethodSpeciationName, and TADA.ResultMeasure.MeasureUnitCode are first normalized so that any blank, NULL/NA, or case variant of "none" is set to the literal "NONE".
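
The harmonization rule can be illustrated with a minimal base R sketch. The helper name normalize_none is hypothetical and is not part of the package; treating whitespace-only strings as blank is an assumption based on the example inputs, and the package's actual implementation may differ.

```r
# Hypothetical helper (not a package function): map blank, NA, and any
# case variant of "none" to the literal "NONE"; leave other values as-is.
normalize_none <- function(x) {
  x <- as.character(x)
  trimmed <- trimws(x)
  # Assumption: whitespace-only strings count as blank.
  is_missing <- is.na(x) | trimmed == "" | toupper(trimmed) == "NONE"
  x[is_missing] <- "NONE"
  x
}

normalized <- normalize_none(c("", NA, " ", "none", "None", "Dissolved"))
print(normalized)
```

Values that are not missing, such as "Dissolved", pass through unchanged in this sketch.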

Identifier construction:

  • Each component is trimmed. For the characteristic name only, blanks/NA are converted to the literal "NA". For fraction/speciation/unit, the normalized values are used (i.e., "NONE" where missing).

  • Example: "DISSOLVED OXYGEN (DO)_NONE_NONE_MG/L"
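
The construction step can be sketched in base R as follows. The function build_comparable_id is a hypothetical name used only for illustration, and the underscore separator is inferred from the example identifier above; this is not the package's own code.

```r
# Hypothetical sketch (not a package function): build the identifier from
# four components that have already been harmonized.
build_comparable_id <- function(name, fraction, speciation, unit) {
  name <- trimws(as.character(name))
  # For the characteristic name only, blank/NA becomes the literal "NA".
  name[is.na(name) | name == ""] <- "NA"
  # Assumption: components are joined with "_" as in the example identifier.
  paste(name, trimws(fraction), trimws(speciation), trimws(unit), sep = "_")
}

id <- build_comparable_id("DISSOLVED OXYGEN (DO)", "NONE", "NONE", "MG/L")
print(id)
```

The fraction, speciation, and unit arguments are expected to hold "NONE" wherever the source value was missing, so only the characteristic name needs the "NA" fallback here.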

Examples

df <- data.frame(
  TADA.CharacteristicName = c("DISSOLVED OXYGEN (DO)", "pH", "Nitrate"),
  TADA.ResultSampleFractionText = c("", NA, "Dissolved"),
  TADA.MethodSpeciationName = c(" ", NA, ""),
  TADA.ResultMeasure.MeasureUnitCode = c("MG/L", "none", NA),
  stringsAsFactors = FALSE
)

out <- TADA_CreateComparableID(df)
out$TADA.ComparableDataIdentifier
#> [1] "DISSOLVED OXYGEN (DO)_NONE_NONE_MG/L"
#> [2] "pH_NONE_NONE_NONE"                   
#> [3] "Nitrate_Dissolved_NONE_NONE"         