Helper function to rename variables that are colnames of a data.frame
Source: R/fixcolnames.R
fixcolnames.Rd
Changes variable names like colnames to long plain-English headers or short labels for plots
Arguments
- namesnow
vector of colnames to be renamed
- oldtype
The type of the names currently in namesnow: "longname" or "shortname", or "csv" or "r" or "api", etc., or a colname of map_headernames, which is used if the value is not one of those known types.
- newtype
The type of names to convert to: "longname" or "shortname", or "csv" or "r" or "api", etc., or a colname of map_headernames, which is used if the value is not one of those known types.
- mapping_for_names
Default is map_headernames, a dataset already in the package.
Details
You specify an alias of a type, like "api", "r", "long", or "short", or one of colnames(map_headernames), like "rname", "vartype", "decimals", "varlist", etc.
You can also use this to extract any info from map_headernames (which here is called mapping_for_names).
NOTE: If newtype is a known type of name, like rname or apiname, any element of namesnow that is not found among the oldtype names is not renamed and is returned unchanged. BUT if newtype is a column that is not a known type of name, like "varcategory", the function instead returns an empty string for each element of namesnow that is not found among the oldtype names. That way, if you are seeking a new name and none is available, you keep the old name, while if you are seeking metadata, such as which category a variable is in, you get a blank when the old name is not found at all.
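The two fallback rules in the NOTE can be illustrated with a minimal base R sketch. This is not the package's implementation: toy_map is a tiny made-up stand-in for map_headernames, toy_aliases and rename_sketch() are hypothetical names, and the varlist value given for pcthisp is assumed purely for illustration.

```r
# Toy stand-in for map_headernames (NOT the real table).
# The pctlowinc / pcthisp name pairs appear in the Examples on this page;
# the varlist value for pcthisp is an assumption for illustration.
toy_map <- data.frame(
  rname   = c("pctlowinc", "pcthisp"),
  apiname = c("RAW_D_INCOME", "P_HISP"),
  varlist = c("names_d", "names_d_subgroups"),
  stringsAsFactors = FALSE
)

# Hypothetical alias table: short alias -> column holding that type of name
toy_aliases <- c(r = "rname", api = "apiname")

# Hypothetical helper that mimics the documented behavior of fixcolnames()
rename_sketch <- function(namesnow, oldtype, newtype, mapping = toy_map) {
  # Turn an alias like "api" into a column name like "apiname";
  # anything else is treated as a column name already
  resolve <- function(type) {
    if (type %in% names(toy_aliases)) toy_aliases[[type]] else type
  }
  oldcol <- resolve(oldtype)
  newcol <- resolve(newtype)
  stopifnot(oldcol %in% names(mapping), newcol %in% names(mapping))

  hit <- match(namesnow, mapping[[oldcol]])  # NA where the old name is not found
  out <- mapping[[newcol]][hit]
  missing_old <- is.na(hit)
  if (newcol %in% toy_aliases) {
    out[missing_old] <- namesnow[missing_old]  # seeking a name: keep the old name
  } else {
    out[missing_old] <- ""                     # seeking metadata: return a blank
  }
  out
}

rename_sketch(c("RAW_D_INCOME", "NOT_A_NAME"), oldtype = "api", newtype = "r")
#> [1] "pctlowinc"  "NOT_A_NAME"
rename_sketch(c("RAW_D_INCOME", "NOT_A_NAME"), oldtype = "api", newtype = "varlist")
#> [1] "names_d" ""
```

The only difference between the two calls is whether newtype resolves to a column of names or to a metadata column, which is what decides the fallback value.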
These are some key column names in the map_headernames table:
"shortname" (aka "short", for plot labels, etc.)
"longname" (aka "long", for full explanatory headers to use on a table)
"rname" (aka "r", the R variable names as used in the EJAM code)
"apiname" (aka "api", as returned by EJScreen API)
"csvname" (aka "csv", as found in the CSV files of just the key residential population and environmental indicators, found on the EJScreen FTP site)
"acsname" (aka "acs", as found in an ACS data file internally used by EJScreen, containing all the extra residential population groups and other indicators not stored in the CSV files on the EJScreen FTP site)
"DEJ" (whether the indicator is residential population, environmental, etc.)
"varlist" (which group of names is this variable in, such as "names_d", "names_d_subgroups", "names_d_state_pctile", etc.)
"calculation_type" (how it should be aggregated over block groups, such as "wtdmean", "sum of counts", etc.)
"denominator" (the weight to use in aggregating as a wtdmean, normally a count variable that is the universe for a percentage, such as "pop", "hhlds", etc.)
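As a rough illustration of how "calculation_type" and "denominator" could be used together, the base R sketch below aggregates two made-up block groups. The data frame bg, the column pct_example, and all numbers are hypothetical, and the sketch is not taken from the EJAM code; it only shows what a "wtdmean" weighted by a count such as "pop" means compared with a "sum of counts".

```r
# Two made-up block groups: a population count and a percentage indicator
bg <- data.frame(
  pop         = c(1000, 3000),
  pct_example = c(0.10, 0.30)  # hypothetical indicator whose denominator is "pop"
)

# calculation_type "wtdmean" with denominator "pop":
# each block group's percentage is weighted by its population
wtd <- weighted.mean(bg$pct_example, w = bg$pop)

# calculation_type "sum of counts": counts simply add up across block groups
total_pop <- sum(bg$pop)

# For comparison, an unweighted mean would treat both block groups as equal in size
naive <- mean(bg$pct_example)

wtd        # about 0.25, pulled toward the larger block group
total_pop  # 4000
naive      # about 0.20
```

The weighted result is the same as dividing the total count of people in the group (100 + 900) by the total population (4000), which is why the weight is normally the universe for the percentage.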
Examples
# see package tests
names_d
#> [1] "Demog.Index" "Demog.Index.Supp" "pctlowinc" "pctlingiso"
#> [5] "pctunemployed" "pctlths" "pctunder5" "pctover64"
#> [9] "pctmin"
#> attr(,"ejam_package_version")
#> Version
#> "2.32.0"
#> attr(,"ejscreen_version")
#> EJScreenVersion
#> "2.32"
#> attr(,"ejscreen_releasedate")
#> EJScreenReleaseDate
#> "2024-08-12"
#> attr(,"acs_releasedate")
#> ACSReleaseDate
#> "2023-12-07"
#> attr(,"acs_version")
#> ACSVersion
#> "2018-2022"
#> attr(,"census_version")
#> CensusVersion
#> "2020"
#> attr(,"date_saved_in_package")
#> [1] "2025-02-05"
namesbyvarlist('names_d')
#> varlist rname
#> 1 names_d Demog.Index
#> 2 names_d Demog.Index.Supp
#> 3 names_d pctlowinc
#> 4 names_d pctlingiso
#> 5 names_d pctunemployed
#> 6 names_d pctlths
#> 7 names_d pctunder5
#> 8 names_d pctover64
#> 9 names_d pctmin
x = varinfo("pctlowinc")
x = varinfo("pcthisp")
# see the different names for the same variable, and see it is not in the csv tables on the FTP site
varinfo("pcthisp", c("csvname", "acsname", "apiname"))
#>         csvname  acsname apiname
#> pcthisp         PCT_HISP  P_HISP
# EJAM:::names_whichlist("RAW_D_INCOME")
fixcolnames(c("RAW_D_INCOME", "S_D_LIFEEXP"), 'api')
#> [1] "pctlowinc" "state.avg.lowlifex"
fixcolnames('LOWINCPCT', 'csv')
#> [1] "pctlowinc"
fixcolnames(c("PCT_HISP", "HISP"), 'acs')
#> [1] "pcthisp" "hisp"
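# oldtype is not given here, so its default applies; the API names are not found and come back unchanged (see NOTE in Details)
fixcolnames(c("RAW_D_INCOME", "S_D_LIFEEXP"), newtype = "longname")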
#> [1] "RAW_D_INCOME" "S_D_LIFEEXP"
addmargins(table(map_headernames$vartype, map_headernames$DEJ))
#>
#>               Demographic E other EJ Index Environmental geo Sum
#>                         2       0        0             0   0   2
#>   geo                   0       0        0             0  18  18
#>   raw                 135      18        2            14   7 176
#>   stateavg             41       4        0            13   0  58
#>   statepctile          42       4       26            13   0  85
#>   stateratio           35       0        0            13   0  48
#>   stateraw              0       0       28             0   0  28
#>   usavg                41       4        0            13   0  58
#>   uspctile             41       4       26            13   0  84
#>   usratio              35       0        0            13   0  48
#>   usraw                 0       0       28             0   0  28
#>   Sum                 372      34      110            92  25 633
# the columns "newsort" and "reportsort" provide useful sort orders
x <- map_headernames$rname[map_headernames$varlist == "names_d"]
# same values as names_d (printed above), without its attributes
print("original order"); print(x)
#> [1] "original order"
#> [1] "Demog.Index" "Demog.Index.Supp" "pctlowinc" "pctlingiso"
#> [5] "pctunemployed" "pctlths" "pctunder5" "pctover64"
#> [9] "pctmin"
x <- sample(x, length(x), replace = FALSE)
print("out of order"); print(x)
#> [1] "out of order"
#> [1] "pctunder5" "Demog.Index" "pctlths" "pctover64"
#> [5] "Demog.Index.Supp" "pctunemployed" "pctlingiso" "pctmin"
#> [9] "pctlowinc"
print("fixed order")
#> [1] "fixed order"
x[ order(fixcolnames(x, oldtype = "r", newtype = "newsort")) ]
#> [1] "Demog.Index" "Demog.Index.Supp" "pctlowinc" "pctlingiso"
#> [5] "pctunemployed" "pctlths" "pctunder5" "pctover64"
#> [9] "pctmin"