EJAM/EJSCREEN comparisons - see summary stats after using ejscreen_vs_ejam()

Usage

ejscreen_vs_ejam_summary(
  vs = NULL,
  myvars = colnames(vs$EJAM),
  tol = 0.05,
  prob = 0.95,
  na.rm = TRUE
)

Arguments

vs

output of ejscreen_vs_ejam()

myvars

optional; use to check just a subset of the colnames found in vs$EJAM and vs$EJSCREEN. Possible values include:

myvars = "all" # all the indicators in the output tables, i.e., colnames(vs$EJAM)

myvars = "inboth" # just the ones in both (excluding indicators reported as NA because EJAM or EJSCREEN did not report them)

myvars = "bad" # just the ones in both where EJAM_shown and EJSCREEN_shown disagree

myvars = c(names_d, names_d_subgroups)

myvars = grep("pctile", colnames(vs$EJAM), value = TRUE)

tol

optional; set this so that results can be said to agree within this tolerance if they differ by less than tol * 100 percent, where tol is expressed as a fraction between 0 and 1 (e.g., tol = 0.05 means within 5 percent).
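A minimal sketch of what agreeing within tol means, using hypothetical values rather than actual EJAM or EJSCREEN output:

```r
# Hypothetical EJAM and EJSCREEN estimates for 3 sites (illustration only):
ejam_value     <- c(100, 200, 300)
ejscreen_value <- c(101, 220, 300)

# Fractional difference of the ratio EJAM / EJSCREEN from 1:
pct_diff <- abs(ejam_value / ejscreen_value - 1)

# With tol = 0.05, sites agree if they differ by less than 5 percent:
agrees_within_tol <- pct_diff < 0.05
agrees_within_tol  # TRUE FALSE TRUE
```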

prob

optional; a fraction between 0 and 1 representing the percentile p at which to check absolute percentage differences. See the within.x.pct.at.p.pct.of.sites value that is returned.

na.rm

optional; whether to remove NA values. Note this parameter still needs testing.

Value

A data.frame of summary stats showing counts and percents of analyzed sites (those with valid data found in both the EJAM and EJSCREEN outputs), indicating how many of the sites agree between EJSCREEN and EJAM estimates, either exactly as reported or within some tolerance. Columns include:

"indicator" (variable name)

"sites.with.data.ejam" How many of the sites had a value from EJAM for the given indicator?

"sites.with.data.neither" How many sites had NA from both EJAM and EJSCREEN?

"sites.with.data.both" How many sites had a value from both EJAM and EJSCREEN for the given indicator?

"sites.agree.rounded" How many sites agree (EJSCREEN vs. EJAM) in the value shown on reports, i.e., the reported, rounded value?

"sites.agree.within.tol" How many sites agree within tol (i.e., within tol * 100 percent)?

"pct.of.sites.agree.rounded" as a percent 0-100% of sites with data

"pct.of.sites.agree.within.tol" as a percent 0-100% of sites with data

"median.abs.diff" Median, over sites with data, of the absolute differences |EJAM - EJSCREEN|

"max.abs.diff" Maximum, over sites with data, of the absolute differences

"mean.pct.diff" Mean percent difference, 0-100%, where percent difference is the absolute value of 100 * (ratio - 1), and ratio is EJAM / EJSCREEN

"median.pct.diff" 0-100%

"max.pct.diff" 0-100%

"within.x.pct.at.p.pct.of.sites" X, where EJAM and EJSCREEN agree within X percent (0-100%) or better at a prob share of sites. That is, a prob (e.g., 0.95) share of sites have an absolute percentage difference in estimated indicator values that is less than or equal to X, where X is one of the actual values of abspctdiff found, times 100. It uses 100 * quantile(y, probs = prob, type = 1).
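This last statistic can be sketched in base R as below; the abspctdiff values here are hypothetical, not actual EJAM output:

```r
# Hypothetical absolute fractional differences for 10 sites (illustration only):
abspctdiff <- c(0.001, 0.002, 0.004, 0.010, 0.030, 0.050, 0.080, 0.120, 0.200, 0.900)

prob <- 0.95
# type = 1 is the inverse of the empirical CDF, so the result is always one of
# the observed abspctdiff values; multiplying by 100 expresses it as a percent:
x <- 100 * quantile(abspctdiff, probs = prob, type = 1, names = FALSE)
x  # 90: a 0.95 share of these sites agree within 90 percent or better
```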

Examples

  radius = 3
  if (FALSE) { # \dontrun{
  pts <- testpoints_n(100, weighting = 'frs')
  
  # This step can take a long time, almost 1 minute per 20 points, as it uses the EJScreen API:
  vs100 <- ejscreen_vs_ejam(pts, radius = radius, include_ejindexes = TRUE)
  
  ejscreen_vs_ejam_see1(vs100, mysite = 1)
  
  # see site with largest % disagreement:
  ejscreen_vs_ejam_see1(vs100, 'pop', mysite = which.max(vs100$abspctdiff$pop))
  
  vs100$diff$blockcount_near_site
  sum100 <- ejscreen_vs_ejam_summary(vs100, tol = 0.01)
  s100 <- sum100[ , c(1, 6:12)]
  
  s100[s100$indicator %in% names_e, ]
  s100[s100$indicator %in% names_d, ]
  s100[s100$indicator %in% names_these, ]
  s100[s100$indicator %in% c(names_ej_pctile, names_ej_state_pctile, names_ej_supp_pctile, names_ej_supp_state_pctile), ]
  
  sum100_within5pct <- ejscreen_vs_ejam_summary(vs100, tol = 0.05)
  sum100_within5pct[sum100_within5pct$indicator %in% names_these, ][ , c(1, 6:12)]
  
  ## longer analysis (45 minutes perhaps)
  # This step can take a long time, almost 1 minute per 20 points, as it uses the EJScreen API:
  # pts <- testpoints_n(1000, weighting = 'frs')
  # vs1000pts3miles <- ejscreen_vs_ejam(pts, radius = 3, include_ejindexes = TRUE)
  # sum_vs1000pts3miles <- ejscreen_vs_ejam_summary(vs1000pts3miles)
  
  } # }