Answers questions like: How many sites have certain indicators more than 2x the state average?
Source: R/count_sites_with_n_high_scores.R
count_sites_with_n_high_scores.Rd
This function provides tables of summary statistics and also text
that explains those findings in plain English. It relies on colcounter_summary_all().
Usage
count_sites_with_n_high_scores(
scores,
thresholds = NULL,
indicator_type = c("percentile", "ratio", "other")[2],
xwide = c("", "statewide", "nationwide")[1],
site_stat_type = c("cum_pct", "cum_count", "pct", "count")[1],
text_suffix = c("th percentile", " times the average", "")[match(indicator_type,
c("percentile", "ratio", "other"))],
text_indicatortype = "indicators",
quiet = !interactive()
)
Arguments
- scores
scores in a table with one row per place and one column per indicator
- thresholds
Numeric vector of thresholds to use as benchmarks. If the indicators in the scores table are ratios to the average, the thresholds could be, for example, 1.5, 2, etc., which would represent ratios that are 1.5x or 2x the average.
- indicator_type
Whether the scores and benchmarks are percentiles, ratios, or anything else. One of these: c("percentile", "ratio", "other"). The default is "ratio".
- xwide
One of "", "statewide", or "nationwide" (the default is ""); used only in the text output that describes the findings.
- site_stat_type
Count or share of sites, exactly or cumulatively. One of these: c("cum_pct", "cum_count", "pct", "count")
- text_suffix
If using ratios, use the default, which explains these thresholds as X times the average. If using percentiles as thresholds, set text_suffix = "." or text_suffix = "th percentile in the state." for example
- text_indicatortype
can be "Summary Indexes" etc.
- quiet
Logical. Set to TRUE to suppress printing findings to the console. By default, findings are printed only in interactive sessions.
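The default for text_suffix shown in Usage is chosen by position: match() finds where indicator_type falls among the three allowed values, and that index picks the suffix. A minimal base R sketch of that lookup follows. The helper name pick_suffix is hypothetical and is not part of the package.

```r
# Hypothetical helper (not part of the package) that mirrors the default
# expression for text_suffix shown in Usage.
pick_suffix <- function(indicator_type) {
  types    <- c("percentile", "ratio", "other")
  suffixes <- c("th percentile", " times the average", "")
  # match() returns the position of indicator_type within types
  suffixes[match(indicator_type, types)]
}

pick_suffix("ratio")       # " times the average"
pick_suffix("percentile")  # "th percentile"
pick_suffix("other")       # ""
```

Because the lookup is positional, a custom text_suffix is needed only when none of the three defaults reads well, such as "th percentile in the state."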
Value
Returns a list with two named elements, "stats" and "text" where stats is a 3-dimensional array of numbers. See dimnames(output$stats).
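To make the shape of the stats element concrete, here is a toy 3-dimensional array indexed by name. It is only a sketch: the numbers are made up, the object name toy_stats is hypothetical, and the dimension names (count, cut, stat) are assumed from the comments in the Examples, with "cum" read as "at least this many indicators".

```r
# Toy stand-in for output$stats; all numbers are made up for illustration.
# Values fill the array with the first dimension (count) varying fastest.
toy_stats <- array(
  c(6, 3,  2, 1,    # stat = "count": cut "2" (count 1, 2), then cut "3"
    9, 3,  3, 1),   # stat = "cum":   cut "2" (count 1, 2), then cut "3"
  dim = c(2, 2, 2),
  dimnames = list(
    count = c("1", "2"),      # number of indicators at or above the threshold
    cut   = c("2", "3"),      # threshold value
    stat  = c("count", "cum") # exact count vs. cumulative ("at least") count
  )
)

dimnames(toy_stats)           # inspect the labels along each dimension
toy_stats["1", "2", "cum"]    # sites with at least 1 indicator at 2x or more
toy_stats[, , "count"]        # 2 x 2 matrix: count by cut, exact counts only
```

Indexing by name rather than by position keeps the code readable when the set of thresholds changes.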
Details
Helps provide summary statistics such as:
(x%) of these (sites) have
at least (N) of these (YTYPE) indicators
at least (R) times the (State/National average)
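The kind of statistic described above can be sketched in base R on a toy table. This is not the package's implementation. The data are made up, the object names are hypothetical, and treating "at least R times" as >= is an assumption.

```r
# Toy scores: 5 sites (rows) by 3 indicators (columns), expressed as ratios
# to the state average. All values are made up.
scores <- data.frame(
  ind_a = c(0.8, 2.5, 3.1, 1.0, 2.2),
  ind_b = c(1.1, 2.0, 0.9, 0.7, 4.0),
  ind_c = c(0.5, 1.9, 2.4, 1.2, 2.6)
)
threshold <- 2  # "at least 2 times the average" (assumes >=, not >)

# For each site, count the indicators at or above the threshold
n_high <- rowSums(scores >= threshold, na.rm = TRUE)

# Share of sites with at least N such indicators, for N = 1, 2, 3
at_least <- sapply(seq_len(ncol(scores)), function(n) mean(n_high >= n))
names(at_least) <- seq_len(ncol(scores))

round(100 * at_least)  # percent of sites, by minimum number of indicators
```

With these toy values, 60% of sites have at least 1 indicator, 60% have at least 2, and 20% have all 3 at or above 2 times the average.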
Examples
# out <- ejamit(testpoints_100, radius = 1)
out <- testoutput_ejamit_1000pts_1miles
x <- out$results_bysite
# results_bysite is a data.table; convert a copy to a plain data.frame
# so that columns can be selected with a character vector of names
x <- data.table::setDF(data.table::copy(x))
ratio_data <- x[, names_d_ratio_to_state_avg]
ratio_benchmarks <- c(2, 3, 5, 10)
findings <- count_sites_with_n_high_scores(ratio_data, ratio_benchmarks)
names(findings)
dim(findings$text)
# see most striking stat only
tail(findings$text[findings$text != ""], 1) # the most extreme finding
findings$text[findings$text != ""]
dimnames(findings$stats) # count, cut, stat
## stat can be count, cum, pct, or cum_pct
findings$stats[1,,] # any of the indicators (at least one indicator)
findings$stats[,,"count"]
findings$stats[ , , 1]
findings$stats[ , 1, ]
pctile_data <- testoutput_ejamit_1000pts_1miles$results_bysite
pctile_data <- pctile_data[, ..names_ej_state_pctile]
pctile_benchmarks <- 90
y <- count_sites_with_n_high_scores(pctile_data, pctile_benchmarks,
indicator_type = "percentile",
text_indicatortype = "Summary Indexes", xwide = "statewide"
)
# At how many sites is at least one of these indicators at least 90th pctile nationwide?
# base R check: does any indicator in the row reach 90 or more?
sum(apply(as.data.frame(pctile_data), 1, function(r) any(r >= 90, na.rm = TRUE)))
## or
count_sites_with_n_high_scores(pctile_data, 90, xwide = "nationwide", quiet = TRUE)$stats[count = "1", cut = "90", stat = "cum"]
#> [1] 268
# see most striking stat only
mx <- count_sites_with_n_high_scores(pctile_data,
thresholds = 1:100, quiet = TRUE,
text_indicatortype = "Summary Indexes",
text_suffix = "th percentile in the state.")
tail(mx$text[mx$text != ""], 1) # the most extreme finding
#> [1] "At at least 1% of these sites, 2 of the Summary Indexes are 99th percentile in the state. "