Answers questions such as: how many sites have certain indicators above 2x the state average?

This function provides tables of summary statistics as well as text that explains those findings in plain English. It relies on colcounter_summary_all().

Usage

count_sites_with_n_high_scores(
  scores,
  thresholds = NULL,
  indicator_type = c("percentile", "ratio", "other")[2],
  xwide = c("", "statewide", "nationwide")[1],
  site_stat_type = c("cum_pct", "cum_count", "pct", "count")[1],
  text_suffix = c("th percentile", " times the average", "")[match(indicator_type,
    c("percentile", "ratio", "other"))],
  text_indicatortype = "indicators",
  quiet = !interactive()
)

Arguments

scores: scores in a table with one row per place and one column per indicator
thresholds: vector of numbers to use as benchmarks. If the indicators in the scores table are ratios to the average, the thresholds could be, for example, 1.5, 2, etc., which would represent ratios of 1.5x, 2x, etc. the average.
indicator_type: Scores and benchmarks are percentiles, ratios, or anything else. One of these: c("percentile", "ratio", "other")
xwide: "statewide" or "nationwide" (the default is ""). Used only in the text output that describes the findings.
site_stat_type: Count or share of sites, exactly or cumulatively. One of these: c("cum_pct", "cum_count", "pct", "count")
text_suffix: If using ratios, use the default, which explains these thresholds as X times the average. If using percentiles as thresholds, set text_suffix = "." or text_suffix = "th percentile in the state." for example
text_indicatortype: can be "Summary Indexes" etc.
quiet: if TRUE, findings are not printed to the console. The default, !interactive(), prints only in interactive sessions.
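
The default for text_suffix shown in the Usage is chosen by position, using match() on indicator_type. A minimal base R sketch of that lookup, using only the values shown in the Usage (the helper name suffix_for is hypothetical and is not part of the package):

```r
# Hypothetical helper that mirrors the default expression for text_suffix
suffix_for <- function(indicator_type) {
  types    <- c("percentile", "ratio", "other")
  suffixes <- c("th percentile", " times the average", "")
  # match() returns the position of indicator_type within types
  suffixes[match(indicator_type, types)]
}

suffix_for("ratio")       # " times the average"
suffix_for("percentile")  # "th percentile"
suffix_for("other")       # ""
```

Because the lookup is positional, an indicator_type outside those three values would give NA here, so a custom text_suffix is the safer choice in that case.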

Value

Returns a list with two named elements, "stats" and "text", where stats is a 3-dimensional array of numbers. See dimnames(output$stats).
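
As a rough illustration of how a 3-dimensional array with named dimensions can be indexed, here is a toy array built by hand. The dimension names (count, cut, stat) and the stat labels are taken from the Examples on this page; the values and the number of rows are made up, and the real output may differ in size and labels.

```r
# Toy array shaped like the stats element described above (values are made up)
stats <- array(
  0,
  dim = c(3, 2, 4),
  dimnames = list(
    count = c("1", "2", "3"),                   # number of indicators
    cut   = c("2", "3"),                        # thresholds
    stat  = c("count", "cum", "pct", "cum_pct") # type of statistic
  )
)
stats["1", "2", "cum"] <- 7  # illustrative value only

dimnames(stats)$stat     # labels of the third dimension
stats["1", "2", "cum"]   # a single number, selected by label
dim(stats[, , "cum"])    # a count-by-cut matrix for one statistic
```

Indexing by label rather than by position keeps the code readable when the set of thresholds changes.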

Details

Helps provide summary statistics such as:

(x%) of these (sites) have

at least (N) of these (YTYPE) indicators

at least (R) times the (State/National average)
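
The kind of statistic described above can be sketched in base R on a toy table. This is only an illustration of the idea, not the package's implementation (which relies on colcounter_summary_all()); the data are made up, and treating "at least" as >= is an assumption.

```r
# Toy scores: 4 sites (rows) x 3 indicators (columns), as ratios to the average
scores <- data.frame(
  ind_a = c(2.5, 1.0, 3.1, 0.8),
  ind_b = c(2.1, 0.9, 1.2, 0.7),
  ind_c = c(0.5, 1.1, 2.0, 2.2)
)
threshold <- 2  # R: at least 2 times the average (assumes >=)

# For each site, how many indicators are at or above the threshold?
n_high <- rowSums(scores >= threshold, na.rm = TRUE)

# For each N, the percent of sites with at least N high indicators
n_values <- seq_len(ncol(scores))
cum_pct <- sapply(n_values, function(n) 100 * mean(n_high >= n))
names(cum_pct) <- n_values
cum_pct
```

With these toy values, 75% of sites have at least 1 indicator at 2 or more times the average, 50% have at least 2, and none have all 3.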

Examples

# out <- ejamit(testpoints_100, radius = 1)
out <- testoutput_ejamit_1000pts_1miles
x <- out$results_bysite
# copy first, then convert the data.table to a plain data.frame
x <- data.table::setDF(data.table::copy(x))
ratio_data <- x[, names_d_ratio_to_state_avg]
ratio_benchmarks <- c(2, 3, 5, 10)

findings <- count_sites_with_n_high_scores(ratio_data, ratio_benchmarks)

names(findings)
dim(findings$text)
# see most striking stat only
tail(findings$text[findings$text != ""], 1) # the most extreme finding
findings$text[findings$text != ""]

dimnames(findings$stats) # count, cut, stat
## stat can be count, cum, pct, or cum_pct

findings$stats[1,,] # any of the indicators (at least one indicator)
findings$stats[,,"count"]
findings$stats[ , , 1]
findings$stats[ , 1, ]

pctile_data <- testoutput_ejamit_1000pts_1miles$results_bysite
pctile_data <- pctile_data[, ..names_ej_state_pctile]
pctile_benchmarks <- 90
y <- count_sites_with_n_high_scores(pctile_data, pctile_benchmarks,
  indicator_type = "percentile",
  text_indicatortype = "Summary Indexes", xwide = "statewide"
)

# At how many sites is at least one of these indicators at least 90th pctile statewide?
# (pctile_data holds state percentiles; xwide affects only the text output)
sum(rowMaxs2(data.frame(pctile_data)) >= 90, na.rm = TRUE)
## or
count_sites_with_n_high_scores(pctile_data, 90,
  xwide = "statewide", quiet = TRUE)$stats[count = "1", cut = "90", stat = "cum"]

# see most striking stat only
mx <- count_sites_with_n_high_scores(pctile_data, 
  thresholds = 1:100, quiet = TRUE,
  text_indicatortype = "Summary Indexes",
  text_suffix = "th percentile in the state.")
tail(mx$text[mx$text != ""], 1) # the most extreme finding