Note: This article is a work in progress

EXAMPLES OF FILES & TEST DATA EJAM CAN IMPORT OR OUTPUT

Sample spreadsheets & shapefiles for trying the web app

EJAM installs sample .xlsx files and shapefiles locally. You can use them as inputs to try out EJAM functions or the web app, or as templates that show what an input file should look like.

Files and Datasets Installed with EJAM

For just one topic you can see all files and data objects like this:


topic = "fips"  # or "shape" or "latlon" or "naics" or "address" etc.


# datasets / R objects
cbind(data.in.package = sort(grep(topic, EJAM:::datapack()$Item, value = TRUE)))
#> Get more info with datapack(simple = FALSE)
#> 
#> ignoring sortbysize because simple=TRUE
#>      data.in.package             
#> [1,] "testinput_fips_blockgroups"
#> [2,] "testinput_fips_cities"     
#> [3,] "testinput_fips_counties"   
#> [4,] "testinput_fips_states"     
#> [5,] "testinput_fips_tracts"

# files
cbind(files.in.package = sort(basename(testdata(topic, quiet = TRUE))))
#>       files.in.package                   
#>  [1,] "cities_2.xlsx"                    
#>  [2,] "counties_in_AL_detailed.xlsx"     
#>  [3,] "counties_in_Alabama.xlsx"         
#>  [4,] "counties_in_Delaware.xlsx"        
#>  [5,] "counties_in_Delaware_invalid.xlsx"
#>  [6,] "county_10.xlsx"                   
#>  [7,] "county_100.xlsx"                  
#>  [8,] "county_1000.xlsx"                 
#>  [9,] "county_state_300.xlsx"            
#> [10,] "fips"                             
#> [11,] "state_10.xlsx"                    
#> [12,] "state_50.xlsx"                    
#> [13,] "state_county_tract_10.xlsx"       
#> [14,] "tract_10.csv"                     
#> [15,] "tract_100.csv"                    
#> [16,] "tract_1000.csv"                   
#> [17,] "tract_state_285.xlsx"

Local folders with sample files

The simplest way to see all these files is the testdata() function:


testdata()

# just shapefile examples:
testdata('shape', quiet = TRUE)

To try uploading these kinds of files in the web app, find them in the local folders where the package is installed, for example:

  • /EJAM/testdata/latlon/testpoints_100.xlsx
  • /EJAM/testdata/shapes/portland_shp.zip
  • etc.
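
As a minimal sketch of how to turn one of the relative paths above into a full local path, base R's system.file() can be used, the same way the later example in this article locates the "testdata/latlon" folder. The variable name sample_path is only illustrative, and the file name is taken from the list above; it may differ in your installed version.

```r
# Build the full local path to one of the sample files listed above.
# system.file() returns "" if the package or the file is not found.
sample_path <- system.file("testdata/latlon/testpoints_100.xlsx", package = "EJAM")

if (nzchar(sample_path)) {
  print(sample_path)
} else {
  message("File not found; check that EJAM is installed and that the path matches your version.")
}
```

The printed path can then be pasted into the file upload dialog of the web app.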

To open the locally installed “testdata” folders (in Windows File Explorer, or MacOS Finder)
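
One possible way to do this from R, sketched here with base R only, is to locate the installed folder with system.file() and hand it to utils::browseURL(), which on most desktop systems opens a folder path in the file manager. This is an assumption about typical platform behavior, not a description of what EJAM itself does, and testdata_dir is an illustrative name.

```r
# Locate the installed "testdata" folder ("" if EJAM is not installed).
testdata_dir <- system.file("testdata", package = "EJAM")

# Only try to open a file manager window in an interactive session.
if (interactive() && nzchar(testdata_dir)) {
  utils::browseURL(testdata_dir)
}
```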

Example of using a file in EJAM

testpoint_files <- list.files(
  system.file("testdata/latlon", package = "EJAM"), 
  full.names = TRUE
  )
testpoint_files

latlon_from_anything(testpoint_files[2]) 

Sample R data objects: Examples of inputs & outputs of EJAM functions

EJAM and related packages install a number of data objects that are examples of inputs, intermediate results, or outputs. You can use them to try out EJAM functions, to see what the inputs and outputs look like, or for testing.

For documentation on each input or output item (R object), see https://usepa.github.io/EJAM/reference/index.html#test-data

This code snippet lists the test/sample data objects in EJAM and related packages:

POINT DATA (LAT/LON COORDINATES) for testing ejamit(), mapfast(), ejscreenit(), getblocksnearby(), etc.

x <- EJAM:::datapack(simple = FALSE)
x <- x[order(x$Package, x$Item), !grepl("size", names(x))]
x[grepl("^testp", x$Item), ]
#>     Package                Item
#> 12     EJAM       testpoints_10
#> 131    EJAM      testpoints_100
#> 132    EJAM   testpoints_100_dt
#> 152    EJAM     testpoints_1000
#> 166    EJAM    testpoints_10000
#> 120    EJAM        testpoints_5
#> 129    EJAM       testpoints_50
#> 147    EJAM      testpoints_500
#> 121    EJAM      testpoints_bad
#> 113    EJAM testpoints_overlap3
#>                                                        Title
#> 12  test points data.frame with columns sitenumber, lat, lon
#> 131 test points data.frame with columns sitenumber, lat, lon
#> 132 test points data.frame with columns sitenumber, lat, lon
#> 152 test points data.frame with columns sitenumber, lat, lon
#> 166 test points data.frame with columns sitenumber, lat, lon
#> 120 test points data.frame with columns sitenumber, lat, lon
#> 129 test points data.frame with columns sitenumber, lat, lon
#> 147 test points data.frame with columns sitenumber, lat, lon
#> 121       test points data.frame with columns note, lat, lon
#> 113       test points data.frame with columns note, lat, lon
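
A minimal sketch of using one of these point datasets follows. It assumes EJAM is installed and its data can be loaded, and it uses a 1-mile radius only because that matches the packaged testoutput_ejamit_10pts_1miles example; have_ejam and radius_miles are illustrative names. The analysis step can take some time to run.

```r
radius_miles <- 1  # assumed radius, chosen to match the packaged 1-mile test outputs

have_ejam <- requireNamespace("EJAM", quietly = TRUE)
if (have_ejam) {
  library(EJAM)

  # Look at the first rows of a small sample of points (lat/lon columns)
  print(head(testpoints_10, 3))

  # Run the analysis around each point
  out <- ejamit(testpoints_10, radius = radius_miles)
  print(names(out))
}
```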

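STREET ADDRESSES for testing geocoding in latlon_from_address() etc.

No object names in this listing start with "test_", so the query below returns zero rows. The address examples are named testinput_address_* and appear in the next listing.

x[grepl("^test_", x$Item), ]
#> [1] Package Item    Title  
#> <0 rows> (or 0-length row.names)
cat("\n\n")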

FACILITY REGISTRY IDs AND OTHER INPUTS (street addresses, FIPS codes, NAICS, SIC, MACT, EPA program names, shapes) for testing latlon_from_regid() etc.

x[grepl("^test[^op_]", x$Item), ]
#>     Package                              Item
#> 44     EJAM               testinput_address_2
#> 109    EJAM               testinput_address_9
#> 110    EJAM           testinput_address_parts
#> 117    EJAM           testinput_address_table
#> 130    EJAM         testinput_address_table_9
#> 118    EJAM testinput_address_table_goodnames
#> 119    EJAM  testinput_address_table_withfull
#> 111    EJAM        testinput_fips_blockgroups
#> 45     EJAM             testinput_fips_cities
#> 46     EJAM           testinput_fips_counties
#> 47     EJAM             testinput_fips_states
#> 112    EJAM             testinput_fips_tracts
#> 48     EJAM                    testinput_mact
#> 49     EJAM                   testinput_naics
#> 50     EJAM            testinput_program_name
#> 11     EJAM          testinput_program_sys_id
#> 3      EJAM                   testinput_regid
#> 4      EJAM             testinput_registry_id
#> 51     EJAM                     testinput_sic
#> 5      EJAM                   testinput_xtrac
#> 135    EJAM                      testshapes_2
#>                                                                       Title
#> 44                            datasets for trying address-related functions
#> 109                           datasets for trying address-related functions
#> 110                           datasets for trying address-related functions
#> 117                           datasets for trying address-related functions
#> 130                           datasets for trying address-related functions
#> 118                           datasets for trying address-related functions
#> 119                           datasets for trying address-related functions
#> 111                                      testinput_fips_blockgroups dataset
#> 45                                            testinput_fips_cities dataset
#> 46                                          testinput_fips_counties dataset
#> 47                                            testinput_fips_states dataset
#> 112                                           testinput_fips_tracts dataset
#> 48                                                   testinput_mact dataset
#> 49                                                  testinput_naics dataset
#> 50                                           testinput_program_name dataset
#> 11  test data, EPA program names and program system ID numbers to try using
#> 3   testinput_regid (DATA) test data, vector of EPA FRS Registry ID numbers
#> 4                  test data, EPA Facility Registry ID numbers to try using
#> 51                                                    testinput_sic dataset
#> 5                                                          for internal use
#> 135                                                    testshapes_2 dataset
cat("\n\n")

EXAMPLES OF OUTPUTS from ejamit(), ejscreenit(), getblocksnearby(), etc., you can use as inputs to ejam2report(), ejam2excel(), ejam2ratios(), ejam2barplot(), doaggregate(), etc.

x[grepl("^testout", x$Item), ]
#>     Package                                      Item
#> 175    EJAM     testoutput_doaggregate_1000pts_1miles
#> 167    EJAM      testoutput_doaggregate_100pts_1miles
#> 159    EJAM       testoutput_doaggregate_10pts_1miles
#> 176    EJAM          testoutput_ejamit_1000pts_1miles
#> 168    EJAM           testoutput_ejamit_100pts_1miles
#> 165    EJAM            testoutput_ejamit_10pts_1miles
#> 143    EJAM        testoutput_ejscreenapi_1pts_1miles
#> 150    EJAM             testoutput_ejscreenapi_plus_5
#> 155    EJAM                   testoutput_ejscreenit_5
#> 161    EJAM                  testoutput_ejscreenit_50
#> 172    EJAM                 testoutput_ejscreenit_500
#> 137    EJAM testoutput_ejscreenRESTbroker_1pts_1miles
#> 171    EJAM testoutput_getblocksnearby_1000pts_1miles
#> 158    EJAM  testoutput_getblocksnearby_100pts_1miles
#> 149    EJAM   testoutput_getblocksnearby_10pts_1miles
#>                                                                                     Title
#> 175                                                          test output of doaggregate()
#> 167                                                          test output of doaggregate()
#> 159                                                          test output of doaggregate()
#> 176                                                               test output of ejamit()
#> 168                                                               test output of ejamit()
#> 165                                                               test output of ejamit()
#> 143                                                  test data, output from this function
#> 150 test data examples of output from 'ejscreenapi_plus()' using testpoints_5, radius = 1
#> 155       test data examples of output from 'ejscreenit()' using testpoints_5, radius = 1
#> 161      test data examples of output from 'ejscreenit()' using testpoints_50, radius = 1
#> 172     test data examples of output from 'ejscreenit()' using testpoints_500, radius = 1
#> 137                                                  test data, output from this function
#> 171                    test output of getblocksnearby(), and is an input to doaggregate()
#> 158                    test output of getblocksnearby(), and is an input to doaggregate()
#> 149                    test output of getblocksnearby(), and is an input to doaggregate()
cat("\n\n")
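
As a hedged sketch of how these saved outputs can stand in for a fresh analysis, the example below passes them to two of the functions named above. It assumes EJAM is installed and that the default arguments of ejam2barplot() and doaggregate() are acceptable for these inputs; have_ejam, out, and agg are illustrative names.

```r
have_ejam <- requireNamespace("EJAM", quietly = TRUE)
if (have_ejam) {
  library(EJAM)

  # A saved ejamit() result can be passed to the ejam2...() functions
  out <- testoutput_ejamit_10pts_1miles
  ejam2barplot(out)

  # A saved getblocksnearby() result is an input to doaggregate()
  agg <- doaggregate(testoutput_getblocksnearby_10pts_1miles)
  print(names(agg))
}
```

Using the saved objects this way avoids rerunning ejamit() or getblocksnearby() each time you want to try a reporting or plotting function.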

LARGE DATASETS USED BY THE PACKAGE

Note that the largest files used by the package are mostly the block-related datasets with info about population size and location of US blocks, the facility datasets with info about EPA-regulated sites, and the blockgroup-related datasets with EJScreen indicators.

Some datasets are downloaded by the package at installation, at launch, or as needed. Examples include [blockwts], [blockpoints], [quaddata], [blockid2fips], [frs], [frs_by_programid], [frs_by_naics], [frs_by_sic], and [frs_by_mact].
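
To see which of these objects are currently visible in your R session, a simple check with base R's exists() can help. This sketch only tests whether an object of that name can be found from the global environment (including attached packages); it does not download or load anything, and it is not a description of how EJAM manages these datasets. The names big_objects and in_memory are illustrative.

```r
# Names of the large datasets mentioned above
big_objects <- c("blockwts", "blockpoints", "quaddata", "blockid2fips",
                 "frs", "frs_by_programid", "frs_by_naics",
                 "frs_by_sic", "frs_by_mact")

# TRUE where an object of that name is visible from the global environment
in_memory <- vapply(
  big_objects,
  function(nm) exists(nm, envir = globalenv()),
  logical(1)
)
print(in_memory)
```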

Blockgroup-related datasets include [blockgroupstats], [bgpts], [bgej], [usastats], [statestats], [bgid2fips], and [bg_cenpop2020]. For more info see https://usepa.github.io/EJAM/reference/index.html#datasets-with-indicators-etc-