Find EPA-regulated facilities in FRS by NAICS code (industrial category)
Source: R/latlon_from_naics.R
latlon_from_naics.Rd
Get latitude, longitude, and Registry ID for the given NAICS industry code(s). Finds all EPA Facility Registry Service (FRS) sites listed with the given NAICS code(s).
Arguments
- naics
a vector of NAICS codes, a query matching NAICS titles, or a data.table with a column named code, such as the output of
naics_from_any()
- children
optional logical. Set to FALSE to get only exact matches, rather than all facilities whose NAICS code starts with the provided naics (or with a code found from the provided title). Many facilities are listed in the FRS under only a longer, more specific NAICS code, such as a 6-digit code, so querying a broader category (e.g., a 4-digit code) without children = TRUE would miss some of the sites in that category.
- id_only
optional logical. Set to TRUE to get only a vector of registry IDs (regid) instead of a table.
- ...
passed to
naics_from_any()
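The effect of children can be illustrated with a minimal base-R sketch. This is not the package's implementation: the toy vector `listed` and the helper `match_naics()` are hypothetical, and the sketch only assumes that child matching means "the listed code starts with the queried code," as described above.

```r
# Toy NAICS codes as they might be listed for six facilities (hypothetical data)
listed <- c("3366", "33661", "336611", "336612", "336111", "311513")

# Hypothetical helper: exact match vs. prefix ("children") match
match_naics <- function(listed, query, children = TRUE) {
  query <- as.character(query)
  if (children) {
    startsWith(listed, query)   # the queried code plus every longer code under it
  } else {
    listed == query             # only facilities listed with exactly this code
  }
}

exact_hits <- listed[match_naics(listed, 3366, children = FALSE)]
child_hits <- listed[match_naics(listed, 3366, children = TRUE)]

exact_hits
child_hits
```

In this toy data the exact query finds only the facility listed under the 4-digit code, while the prefix query also picks up the facilities listed under its 5- and 6-digit subcategories.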
Value
A data.table (not just a data.frame) with columns lat, lon, REGISTRY_ID, NAICS, naics_found, and naics_query (unless id_only is set to TRUE). naics_query is the value that was provided to this function as naics. naics_found and NAICS are identical (redundant): both hold the code listed in the frs_by_naics table, which may be a subcategory (child) of the naics_query term. For example, naics_query might be 33611 (5 digits), while NAICS and naics_found are 336111 (a 6-digit code) for one facility and 336112 for another.
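The relationship between naics_query and naics_found can be sketched with a small toy table. The values below are made up for illustration (the REGISTRY_ID strings are placeholders, not real FRS IDs), and a plain data.frame stands in for the data.table that the function returns.

```r
# Toy result shaped like the documented output (hypothetical values)
found <- data.frame(
  REGISTRY_ID = c("A1", "A2", "A3", "A4"),
  naics_found = c(336111, 336111, 336112, 33611),
  naics_query = c(33611, 33611, 33611, 33611)
)

# Facilities per query term (the category that was asked for)
by_query <- table(found$naics_query)

# Facilities per code actually listed (may be a child of the query)
by_found <- table(found$naics_found)

# Rows where the listed code is more specific than the query
is_child <- found$naics_found != found$naics_query

by_query
by_found
sum(is_child)
```

Summarizing by naics_query groups facilities under the category that was requested, whereas summarizing by naics_found splits them across the more specific codes under which they are listed.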
Details
Important notes:
Finding the right NAICS codes, and then finding all the right sites by NAICS, is complicated. It requires understanding the NAICS code system, the FRS data, and the EJAM functions. See the discussion in the "Advanced" or other vignettes/articles.
Many FRS sites lack a NAICS code!
Note the difference between children = TRUE and children = FALSE.
The NAICS in the returned table may be a child of the NAICS used in the query, not the queried code itself! This can cause confusion if you query multiple parent NAICS codes and then analyze the results by NAICS.
Functions like regid_from_naics(), latlon_from_naics(), and frs_from_naics() try to find EPA FRS sites based on NAICS codes or titles.
EPA also provides an FRS Facility Industrial Classification Search tool where you can find facilities based on NAICS or SIC codes.
See more about NAICS industry codes at https://www.naics.com/search
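When results from several parent queries need to be compared at one common level, one possible approach is to truncate each listed code to a fixed number of digits. The helper `naics_parent()` below is hypothetical and not part of EJAM; it is a base-R sketch that assumes a parent code is simply the leading digits of its child codes.

```r
# Hypothetical helper: truncate NAICS codes to their first `digits` digits
naics_parent <- function(code, digits = 3) {
  # sprintf() avoids scientific notation when converting numbers to text
  substr(sprintf("%d", as.integer(code)), 1, digits)
}

# A few codes of different lengths, as they might appear in naics_found
codes <- c(211130, 211120, 331110, 331221, 33142)

parents <- naics_parent(codes, digits = 3)
table(parents)
```

Codes that are already shorter than the requested number of digits are returned unchanged, so the chosen level should be no more specific than the shortest code being summarized.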
Examples
# \donttest{
regid_from_naics(321114)
#> [1] "110000307837" "110000324505" "110000324809" "110000333826" "110000340667"
#> [6] "110000341540" "110000343913" "110000344137" "110000345001" "110000348286"
#> [11] "110000351869" "110000352038" "110000356203" "110000357499" "110000357961"
#> [16] "110000357989" "110000358675" "110000360724" "110000362081" "110000362330"
#> [21] "110000362624" "110000367371" "110000367807" "110000369271" "110000372971"
#> [26] "110000373060" "110000373257" "110000379028" "110000379108" "110000382185"
#> [31] "110000391683" "110000405142" "110000406720" "110000407863" "110000409601"
#> [36] "110000420857" "110000422418" "110000422720" "110000422819" "110000437750"
#> [41] "110000449426" "110000450351" "110000450431" "110000450690" "110000450716"
#> [46] "110000451136" "110000455837" "110000461438" "110000479321" "110000489999"
#> [51] "110000490031" "110000490371" "110000490380" "110000490488" "110000491058"
#> [56] "110000491085" "110000492565" "110000496687" "110000497043" "110000497463"
#> [61] "110000498159" "110000513579" "110000580522" "110000586269" "110000587008"
#> [66] "110000587767" "110000588668" "110000588711" "110000588935" "110000589168"
#> [71] "110000589346" "110000589667" "110000589676" "110000589774" "110000590833"
#> [76] "110000595222" "110000599059" "110000600387" "110000603482" "110000604793"
#> [81] "110000606041" "110000606050" "110000608478" "110000608566" "110000610198"
#> [86] "110000617896" "110000617903" "110000619082" "110000619368" "110000619563"
#> [91] "110000620098" "110000700144" "110000702393" "110000726689" "110000738729"
#> [96] "110000741172" "110000764913" "110000772290" "110000773388" "110000789399"
#> [101] "110000799431" "110000862905" "110000872011" "110000873804" "110000877953"
#> [106] "110000900400" "110000913575" "110001089296" "110001141327" "110001147116"
#> [111] "110001147170" "110001147232" "110001283753" "110001448504" "110001496140"
#> [116] "110001506889" "110001916954" "110002020599" "110002083156" "110002102206"
#> [121] "110002102947" "110002108530" "110002113953" "110002118798" "110002120124"
#> [126] "110002121908" "110002135939" "110002136297" "110002141003" "110002147891"
#> [131] "110002148658" "110002152802" "110002178777" "110002228170" "110002319214"
#> [136] "110002332707" "110002343982" "110002369222" "110002461586" "110002463904"
#> [141] "110002469490" "110002469971" "110002528854" "110002528863" "110002556662"
#> [146] "110003025832" "110003266340" "110003266634" "110003398788" "110003512351"
#> [151] "110003541793" "110003619166" "110003679038" "110003811073" "110003995053"
#> [156] "110004037612" "110004793518" "110004961391" "110004988924" "110005112831"
#> [161] "110005326520" "110005539266" "110005669428" "110005790812" "110005993979"
#> [166] "110006015784" "110006028743" "110006067683" "110006464940" "110006493801"
#> [171] "110006525894" "110006533135" "110006537970" "110007032610" "110007229506"
#> [176] "110007331100" "110007355539" "110007397771" "110007434829" "110007489618"
#> [181] "110007489921" "110007657892" "110007722955" "110007821330" "110007840408"
#> [186] "110007916185" "110007916443" "110008144392" "110008157573" "110008183713"
#> [191] "110008242357" "110008269505" "110008445717" "110008568086" "110008805551"
#> [196] "110008805837" "110009071119" "110009071388" "110009072494" "110009074358"
#> [201] "110009312975" "110009355134" "110009496562" "110009846541" "110010070421"
#> [206] "110010858919" "110011636989" "110012275377" "110012503941" "110012551531"
#> [211] "110012602246" "110012704029" "110014139280" "110014146003" "110014155243"
#> [216] "110014157651" "110014160335" "110014166589" "110014170066" "110014170725"
#> [221] "110014171056" "110014171485" "110014193265" "110014193425" "110014205831"
#> [226] "110014212636" "110014214402" "110014221939" "110014271974" "110014273231"
#> [231] "110014279066" "110014374658" "110015596835" "110015596942" "110015644702"
#> [236] "110015670530" "110015762619" "110015823206" "110018241133" "110018495305"
#> [241] "110018873405" "110019879470" "110020684927" "110021122612" "110021271764"
#> [246] "110021299272" "110022448109" "110022522705" "110022810546" "110022883076"
#> [251] "110022904794" "110024285619" "110024420296" "110024423952" "110024448169"
#> [256] "110024559333" "110024886925" "110025077814" "110025099300" "110025140666"
#> [261] "110025158567" "110025252401" "110025331246" "110027214798" "110028001025"
#> [266] "110029068069" "110029527590" "110030314292" "110030451623" "110030490412"
#> [271] "110030530717" "110031002457" "110031389110" "110031416215" "110031433465"
#> [276] "110031469382" "110032489279" "110032745661" "110032968331" "110032995864"
#> [281] "110033012255" "110033135792" "110033141375" "110033721006" "110034052183"
#> [286] "110034421249" "110034480167" "110035485766" "110035663396" "110035765972"
#> [291] "110035847073" "110037257287" "110037268676" "110037424392" "110037752385"
#> [296] "110037795124" "110037795954" "110037832414" "110037902099" "110037914040"
#> [301] "110038018962" "110038090936" "110038877212" "110038954735" "110039383315"
#> [306] "110039494464" "110040370915" "110040432680" "110040495363" "110040513334"
#> [311] "110040528113" "110040917960" "110041063675" "110041564367" "110041591774"
#> [316] "110041969313" "110041992939" "110042000866" "110042000937" "110042002383"
#> [321] "110042072342" "110042123671" "110042267677" "110042294530" "110042554483"
#> [326] "110042622114" "110043086175" "110043192899" "110043429117" "110043508291"
#> [331] "110043546311" "110043631077" "110043744240" "110043855316" "110043894131"
#> [336] "110044905127" "110045551121" "110046314038" "110046433631" "110046541042"
#> [341] "110050813938" "110054260205" "110054410534" "110054819942" "110054832437"
#> [346] "110054849321" "110055947034" "110056267864" "110057193781" "110057218069"
#> [351] "110057350940" "110057618886" "110058284707" "110058353464" "110058969460"
#> [356] "110059705831" "110059789634" "110060096756" "110060158449" "110060280342"
#> [361] "110060375311" "110061077016" "110061966074" "110063653096" "110064111679"
#> [366] "110064122408" "110064122612" "110064123899" "110064125673" "110064125682"
#> [371] "110064148818" "110064174708" "110064178688" "110064178786" "110064184788"
#> [376] "110064185402" "110064208450" "110064214014" "110064215745" "110064264754"
#> [381] "110064275430" "110064291047" "110064297666" "110064341859" "110064350820"
#> [386] "110064559240" "110064696752" "110064836771" "110064860743" "110065946114"
#> [391] "110066854808" "110066940289" "110067041383" "110067112699" "110067188563"
#> [396] "110067211065" "110067572488" "110067884873" "110067960326" "110069446479"
#> [401] "110069518604" "110069633196" "110070003982" "110070032473" "110070121063"
#> [406] "110070123292" "110070126274" "110070132827" "110070161727" "110070204337"
#> [411] "110070214167" "110070225388" "110070244570" "110070246139" "110070265713"
#> [416] "110070274724" "110070275096" "110070275670" "110070279898" "110070280352"
#> [421] "110070282492" "110070283632" "110070283990" "110070287663" "110070291235"
#> [426] "110070291314" "110070293044" "110070293074" "110070295689" "110070295876"
#> [431] "110070298013" "110070298886" "110070298980" "110070299527" "110070305765"
#> [436] "110070308071" "110070309105" "110070309891" "110070310556" "110070311083"
#> [441] "110070311085" "110070311555" "110070312600" "110070318182" "110070319746"
#> [446] "110070321547" "110070321662" "110070322568" "110070322569" "110070322884"
#> [451] "110070323012" "110070325038" "110070325290" "110070325853" "110070326067"
#> [456] "110070329424" "110070329425" "110070329711" "110070330580" "110070331105"
#> [461] "110070333486" "110070334116" "110070340184" "110070341489" "110070344007"
#> [466] "110070347833" "110070347918" "110070362634" "110070427401" "110070454038"
#> [471] "110070458460" "110070477841" "110070536232" "110070537353" "110070550793"
#> [476] "110070557799" "110070559555" "110070591096" "110070630951" "110070635089"
#> [481] "110070638862" "110070648380" "110070716203" "110070830340" "110070834914"
#> [486] "110070877448" "110071084747" "110071138432" "110071307905" "110071334308"
#> [491] "110071380069" "110071414306" "110071434078" "110071435478" "110071453611"
#> [496] "110071541172" "110071542524" "110071658376" "110071671502" "110071711098"
#> [501] "110071711268" "110071719037" "110071720368" "110071720642" "110071722397"
#> [506] "110071722611" "110071777258" "110071778088" "110071778853" "110071786185"
#> [511] "110071786565" "110071793196" "110071803235" "110071870928" "110071870933"
#> [516] "110071871479" "110071871691" "110071872027" "110071872028" "110071872489"
latlon_from_naics(321114)
#> lat lon REGISTRY_ID NAICS naics_found naics_query
#> <num> <num> <char> <num> <num> <num>
#> 1: 18.41565 -66.20266 110000307837 321114 321114 321114
#> 2: 42.69898 -73.82309 110000324505 321114 321114 321114
#> 3: 41.71228 -73.93705 110000324809 321114 321114 321114
#> 4: 40.08626 -76.18172 110000333826 321114 321114 321114
#> 5: 38.29444 -75.62500 110000340667 321114 321114 321114
#> ---
#> 516: 45.60380 -121.13840 110071871479 321114 321114 321114
#> 517: 47.24861 -122.24388 110071871691 321114 321114 321114
#> 518: 48.77027 -122.51444 110071872027 321114 321114 321114
#> 519: 31.65100 -95.07278 110071872028 321114 321114 321114
#> 520: 33.03596 -86.97750 110071872489 321114 321114 321114
# latlon_from_naics(naics_from_any("cheese")[,code] )
latlon_from_naics("cheese")
#> lat lon REGISTRY_ID NAICS naics_found naics_query
#> <num> <num> <char> <num> <num> <char>
#> 1: 44.19644 -75.95834 110000326102 311513 311513 cheese
#> 2: 42.87593 -78.86949 110000326825 311513 311513 cheese
#> 3: 42.83404 -78.82388 110000327183 311513 311513 cheese
#> 4: 41.47924 -81.07851 110000385798 311513 311513 cheese
#> 5: 43.59796 -85.14557 110000410207 311513 311513 cheese
#> ---
#> 423: 37.31066 -121.01631 110071801475 311513 311513 cheese
#> 424: 37.49247 -120.84806 110071801697 311513 311513 cheese
#> 425: 37.49251 -120.87657 110071807026 311513 311513 cheese
#> 426: 44.34500 -87.82780 110071872375 311513 311513 cheese
#> 427: 45.33148 -92.17117 110071882462 311513 311513 cheese
head(latlon_from_naics(c(3366, 33661, 336611), id_only=TRUE))
#> [1] "110003914916" "110000897290" "110000360564" "110000369869" "110000377333"
#> [6] "110000448962"
head(regid_from_naics(c(3366, 33661, 336611)))
#> [1] "110003914916" "110000897290" "110000360564" "110000369869" "110000377333"
#> [6] "110000448962"
head(regid_from_naics(3366, children = TRUE))
#> [1] "110003914916" "110000897290" "110000360564" "110000369869" "110000377333"
#> [6] "110000448962"
# mapfast(frs_from_naics(336611)) # simple map
# get name from one code
naicstable[code == 336, name]
# get the name from each code
mycode = c(33611, 336111, 336112)
naicstable[code %in% mycode, name]
# see counts of facilities by code (parent) and subcategories (children)
naics_counts[NAICS %in% mycode, ]
#> # A tibble: 3 × 6
#> name NAICS count_w_subs count_no_subs label_w_subs label_no_subs
#> <chr> <dbl> <int> <int> <chr> <chr>
#> 1 33611 - Automobi… 33611 326 1 33611 - Aut… 33611 - Auto…
#> 2 336111 - Automob… 336111 271 271 336111 - Au… 336111 - Aut…
#> 3 336112 - Light T… 336112 72 72 336112 - Li… 336112 - Lig…
# see parent codes that contain each code
naicstable[code %in% mycode, ]
#> code n2 n3 n4 n5 n6
#> <num> <char> <char> <char> <char> <char>
#> 1: 33611 33 336 3361 33611 33611
#> 2: 336111 33 336 3361 33611 336111
#> 3: 336112 33 336 3361 33611 336112
#> name
#> <char>
#> 1: Automobile and Light Duty Motor Vehicle Manufacturing
#> 2: Automobile Manufacturing
#> 3: Light Truck and Utility Vehicle Manufacturing
#> num_name
#> <char>
#> 1: 33611 - Automobile and Light Duty Motor Vehicle Manufacturing
#> 2: 336111 - Automobile Manufacturing
#> 3: 336112 - Light Truck and Utility Vehicle Manufacturing
# how many were found via each naics code?
found = latlon_from_naics(c(211, 331))
x = table(found$naics_found, found$naics_query)
x = x[order(x[, 1], decreasing = TRUE), ]
x
#>
#> 211 331
#> 211130 14513 0
#> 211120 10344 0
#> 211 2 0
#> 331 0 7
#> 3312 0 1
#> 3314 0 2
#> 3315 0 1
#> 33121 0 5
#> 33122 0 1
#> 33141 0 2
#> 33142 0 5
#> 33151 0 1
#> 33152 0 1
#> 331110 0 335
#> 331210 0 513
#> 331221 0 478
#> 331222 0 221
#> 331313 0 70
#> 331314 0 239
#> 331315 0 100
#> 331318 0 169
#> 331410 0 104
#> 331420 0 146
#> 331491 0 215
#> 331492 0 370
#> 331511 0 498
#> 331512 0 124
#> 331513 0 271
#> 331523 0 232
#> 331524 0 381
#> 331529 0 192
# }