Random points in USA - average resident, facility, BG, block, or square mile

Get a data.table of random points (lat, lon) for testing, benchmarking, or demos, weighted in various ways. The weighting can be specified so that each point represents the average EPA-regulated facility, block group, block, place on the map, or US resident.

Usage

testpoints_n(
  n = 10,
  weighting = c("frs", "pop", "area", "bg", "block"),
  region = NULL,
  ST = NULL,
  validonly = TRUE,
  dt = TRUE
)

Arguments

n

Number of points needed (sample size)

weighting

Word indicating how to weight the random points. The default is frs, but you may want to use pop even though it is slower.

The options are listed below (some synonyms are allowed in addition to those shown here):

pop or people (slow) = Average Person: random person among all US residents (block point of residence per the 2020 Census)
frs or facility = Average Facility: random EPA-regulated facility among the active facilities in the Facility Registry Service (FRS)
bg = Average Blockgroup: random US Census block group (internal point, like a centroid)
block = Average Block: random US Census block (internal point, like a centroid)
area or place = Average Place: random point on a map (internal point of a block group, with block groups weighted by their area in square meters)

region

Optional vector of EPA Regions (1-10). If provided, points are picked only from those regions.

ST

Optional character vector of 2-letter State abbreviations. If provided, points are picked only from those States.

validonly

Logical, whether to return only points with valid lat/lon coordinates. Defaults to TRUE.

dt

Logical, whether to return a data.table (the default) instead of a plain data.frame.

Value

A data.frame or data.table with columns lat and lon in decimal degrees, plus any other columns in the source table that was sampled (which depends on weighting).

Examples

mapfast(testpoints_n(300, ST = c('LA', 'MS')))
#> Including only these States:
#>   REGION ST   statename
#> 1      6 LA   Louisiana
#> 2      4 MS Mississippi
#> Warning: 1 fips had invalid number of characters (digits) or were NA values
#> Warning: NA returned for 1 values that failed to match
#> Warning: Some latitude / longitude were provided that are not found in any state

# \donttest{
n <- 2
# Try every weighting, first returning a data.table (TRUE), then a data.frame (FALSE)
for (d in c(TRUE, FALSE)) {
  for (w in c('frs', 'pop', 'area', 'bg', 'block')) {
    cat("n=", n, "  weighting=", w, "  dt=", d, "\n\n")
    x <- testpoints_n(n, weighting = w, dt = d)
    print(x)
    print(class(x))
    cat('\n')
  }
}
#> n= 2   weighting= frs   dt= TRUE 
#> 
#>         lat       lon  REGISTRY_ID             PRIMARY_NAME  NAICS    SIC
#>       <num>     <num>       <char>                   <char> <char> <char>
#> 1: 37.36612 -121.8283 110065404217 NEW HORIZON CHIROPRACTIC              
#> 2: 37.89846 -122.3109 110065308731                  LG MOTO 811111       
#>                                PGM_SYS_ACRNMS sitenumber
#>                                        <char>      <int>
#> 1:                        CA-ENVIROVIEW:51626          1
#> 2: CA-ENVIROVIEW:44336, RCRAINFO:CAL000399798          2
#> [1] "data.table" "data.frame"
#> 
#> n= 2   weighting= pop   dt= TRUE 
#> 
#>    blockid      lat       lon   bgid    blockwt block_radius_miles sitenumber
#>      <int>    <num>     <num>  <int>      <num>              <num>      <int>
#> 1: 5392390 39.34946  -84.3870 165668 0.60324617                  0          1
#> 2:  686238 33.93436 -118.3371  21033 0.07940781                  0          2
#> [1] "data.table" "data.frame"
#> 
#> n= 2   weighting= area   dt= TRUE 
#> 
#> Error in sample.int(blockgroupstats[, .N], size = n, replace = FALSE,     prob = blockgroupstats$area): length(n) == 1L is not TRUE
# }
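
To make the weighting options more concrete, here is a toy base-R sketch of what sampling "the average block", "the average resident", or "the average place" could look like. It is not the package's implementation: the `blocks` table, its values, and the `draw()` helper are hypothetical and exist only for this illustration. It also samples with replacement because the toy table has only four rows.

```r
# Toy illustration only (not EJAM code): a hypothetical table of four blocks
set.seed(1)
blocks <- data.frame(
  blockid = 1:4,
  pop     = c(0, 10, 90, 900),  # hypothetical residents per block
  area    = c(500, 50, 5, 1)    # hypothetical land area per block
)

# Hypothetical helper: draw n row indices, optionally with probability
# proportional to a weight (weights need not sum to 1)
draw <- function(n, weights = NULL) {
  sample.int(nrow(blocks), size = n, replace = TRUE, prob = weights)
}

# Count how often each block is drawn, keeping blocks that are never drawn
tally <- function(idx) table(factor(idx, levels = blocks$blockid))

n <- 10000
by_block <- tally(draw(n))               # every block equally likely
by_pop   <- tally(draw(n, blocks$pop))   # proportional to residents
by_area  <- tally(draw(n, blocks$area))  # proportional to area

# Share of draws landing in each block under each weighting
print(rbind(by_block, by_pop, by_area) / n)
```

In this sketch the unpopulated but large block 1 is never drawn under population weighting yet dominates under area weighting, which is why the choice of weighting matters when test points are meant to represent typical residents rather than typical places.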