Creates a comparable data identifier by concatenating the following columns, in order, separated by underscores:
TADA.CharacteristicName
TADA.ResultSampleFractionText
TADA.MethodSpeciationName
TADA.ResultMeasure.MeasureUnitCode
Value
The input dataframe with:
harmonized fraction, speciation, and unit fields, in which missing values and any case variant of "none" are replaced with "NONE"
a character column TADA.ComparableDataIdentifier
Details
Harmonization:
TADA.ResultSampleFractionText, TADA.MethodSpeciationName, and TADA.ResultMeasure.MeasureUnitCode are normalized first: any value that is blank, NULL/NA, or a case variant of "none" is set to the literal "NONE".
Identifier construction:
Each component is trimmed. For the characteristic name only, blanks/NA are converted to the literal "NA". For fraction/speciation/unit, the normalized values are used (i.e., "NONE" where missing).
Example: "DISSOLVED OXYGEN (DO)_NONE_NONE_MG/L"
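The two steps above can be sketched in base R as follows. This is an illustrative approximation and not the package's actual source: the helper names normalize_none() and sketch_comparable_id() are hypothetical, and the sketch assumes that "blank" includes whitespace-only strings and that components are joined with an underscore, as the example identifier suggests.

```r
# Hypothetical helper (not part of the package): map blank, NA, or any
# case variant of "none" to the literal "NONE".
normalize_none <- function(x) {
  x <- trimws(as.character(x))
  x[is.na(x) | x == "" | toupper(x) == "NONE"] <- "NONE"
  x
}

# Hypothetical helper (not part of the package): harmonize the three
# fields, then build the identifier from the four TADA columns.
sketch_comparable_id <- function(df) {
  cols <- c(
    "TADA.ResultSampleFractionText",
    "TADA.MethodSpeciationName",
    "TADA.ResultMeasure.MeasureUnitCode"
  )
  df[cols] <- lapply(df[cols], normalize_none)

  # For the characteristic name only, blank/NA becomes the literal "NA".
  name <- trimws(as.character(df$TADA.CharacteristicName))
  name[is.na(name) | name == ""] <- "NA"

  # Assumption: components are joined with "_" in the documented order.
  df$TADA.ComparableDataIdentifier <- paste(
    name,
    df$TADA.ResultSampleFractionText,
    df$TADA.MethodSpeciationName,
    df$TADA.ResultMeasure.MeasureUnitCode,
    sep = "_"
  )
  df
}
```

Normalizing the three fields before pasting means the harmonized columns and the identifier cannot disagree, which matches the order described above.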
Examples
df <- data.frame(
  TADA.CharacteristicName = c("DISSOLVED OXYGEN (DO)", "pH", "Nitrate"),
  TADA.ResultSampleFractionText = c("", NA, "Dissolved"),
  TADA.MethodSpeciationName = c(" ", NA, ""),
  TADA.ResultMeasure.MeasureUnitCode = c("MG/L", "none", NA),
  stringsAsFactors = FALSE
)
out <- TADA_CreateComparableID(df)
out$TADA.ComparableDataIdentifier
#> [1] "DISSOLVED OXYGEN (DO)_NONE_NONE_MG/L"
#> [2] "pH_NONE_NONE_NONE"
#> [3] "Nitrate_Dissolved_NONE_NONE"