
Column names are listed in the order in which they appear in each output dataset.

Macroinvertebrates and Fish

Information shared across datasets

Agency

Description: Agency that oversaw the collection of the sample.

Categorical. Levels:

  • “EPA” - United States Environmental Protection Agency
  • “USGS” - United States Geological Survey

SampleID

Description: Sample identification number, a unique sample identifier. For USGS data, this is the ‘SIDNO’ variable; the BDB prefix indicates that the BioData system is the source of the identifier. For EPA samples, this is the ‘UID’ variable for NRSA 08/09, 13/14, and 18/19. For 2000-2004 EPA data (EMAP-West and WSA), it is instead built by joining the ‘SITE_ID’ and ‘VISIT_NO’ variables (as 200004-‘SITE_ID’-‘VISIT_NO’).

Categorical.

ProjectLabel

Description: Project short name assigned by the project owner. For USGS data, ProjectLabel is explicitly provided by the USGS. For EPA, ProjectLabel is the sampling round/type (e.g., NRSA0809).

Categorical.

SiteNumber

Description: Unique site identifiers. For USGS data, SiteNumber is the USGS National Water Information System (NWIS) 8 to 15-digit identifier for the place where the sample was collected (preceded by “USGS-”). For EPA data, SiteNumber is the NARS “SITE_ID” variable, with the preferred form of NRS_MM-xxxxx, where MM is the two-letter state code, and xxxxx is between 10001 and 99999. Where available, the EPA unique site ID (UNIQUE_ID) is used in place of SITE_ID to link sites temporally. SITE_IDs from individual rounds of NRSA sampling (WSA/EMAP included) were linked to a site-level UNIQUE_ID, so that sites that were resampled through time could be linked. If a site could not be linked to a UNIQUE_ID, the survey-specific SITE_ID is used instead.

Categorical. EPA preferred form is NRS_MM-xxxxx, where MM is two letter state code, and xxxxx is between 10001 and 99999. USGS form is USGS-xxxxxxxx, where xxxxxxxx is the 8 to 15-digit USGS NWIS identifier.
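
As a rough illustration of the two identifier forms described above, the sketch below classifies SiteNumber values by pattern. The function name classify_site_number and the example IDs are hypothetical and not part of finsyncR. EPA sites that do not follow the preferred form (for example, survey-specific SITE_IDs) are labeled "other" here.

```r
# Hypothetical helper: label a SiteNumber by the form it follows.
classify_site_number <- function(x) {
  # USGS form: "USGS-" followed by an 8 to 15-digit NWIS identifier
  is_usgs <- grepl("^USGS-[0-9]{8,15}$", x)
  # EPA preferred form: NRS_MM-xxxxx, with xxxxx between 10001 and 99999
  has_epa_shape <- grepl("^NRS_[A-Z]{2}-[0-9]{5}$", x)
  id_number <- suppressWarnings(as.integer(sub(".*-", "", x)))
  is_epa <- has_epa_shape & !is.na(id_number) & id_number >= 10001
  ifelse(is_usgs, "USGS", ifelse(is_epa, "EPA", "other"))
}

# Made-up identifiers, for illustration only
ids <- c("USGS-01234567", "NRS_OR-10001", "FW08OR001")
site_form <- classify_site_number(ids)
```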

Note: To convert EPA unique site IDs back to survey-specific site IDs (e.g., to join the data to other NARS-collected data), users can use the following code:

dataset$SurveySpecificSiteID <-
  ifelse(dataset$SiteNumber %in% finsyncR:::.NRSA_siteIDs$UNIQUE_ID,
         finsyncR:::.NRSA_siteIDs$SITE_ID[
           match(dataset$SiteNumber,
                 finsyncR:::.NRSA_siteIDs$UNIQUE_ID)],
         ifelse(dataset$SiteNumber %in% finsyncR:::.NRSA_siteIDs$MASTER_SITEID,
                finsyncR:::.NRSA_siteIDs$SITE_ID[
                  match(dataset$SiteNumber,
                        finsyncR:::.NRSA_siteIDs$MASTER_SITEID)],
                dataset$SiteNumber))

StudyReachName

Description: Name given to identify the longitudinal section of a stream that was selected for sampling.

Categorical.

CollectionDate

Description: Date sample was collected (YYYY-MM-DD)

Date.

CollectionYear

Description: Year in which sample was collected (YYYY)

Integer.

CollectionMonth

Description: Month in which sample was collected (MM)

Integer.

CollectionDayOfYear

Description: Day of year on which the sample was collected, i.e., the number of days since the beginning of the year.

Integer.
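
The three date-derived columns above can be reproduced from CollectionDate with base R, as in the minimal sketch below. The dates are made up for illustration. Note that "%j" counts 1 January as day 1; whether the package uses exactly this convention for CollectionDayOfYear is an assumption.

```r
# Made-up collection dates in YYYY-MM-DD form
collection_date <- as.Date(c("2009-01-01", "2014-07-15"))

# Derive the year, month, and day-of-year as integers
collection_year  <- as.integer(format(collection_date, "%Y"))
collection_month <- as.integer(format(collection_date, "%m"))
collection_doy   <- as.integer(format(collection_date, "%j"))
```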

Latitude_dd

Description: Latitude of the site location, in decimal degrees

Numeric.

Longitude_dd

Description: Longitude of the site location, in decimal degrees

Numeric.

CoordinateDatum

Description: Datum of site location coordinates

Categorical. Levels:

  • NAD83 - North American Datum 1983, EPSG: 4269
  • NAD27 - North American Datum 1927, EPSG: 4267
  • OLDHI - Old Hawaiian, EPSG: 4135
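
Because coordinates are reported in more than one datum, it can help to attach the EPSG code to each row before any reprojection. The sketch below uses a named lookup vector built from the levels listed above; the object names and example values are hypothetical.

```r
# EPSG codes for each CoordinateDatum level, as listed above
datum_epsg <- c(NAD83 = 4269L, NAD27 = 4267L, OLDHI = 4135L)

# Made-up datum values for three sites
coordinate_datum <- c("NAD83", "NAD27", "NAD83")

# Look up the EPSG code for each site
site_epsg <- unname(datum_epsg[coordinate_datum])
```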

COMID

Description: Waterbody unique identifier from medium resolution US National Hydrography Dataset Plus (NHDPlus) version 2.1.

Integer.

StreamOrder

Description: Strahler stream order. Moving waters of the first order are the outermost tributaries. If two streams of the same order merge, the resulting stream is given a number that is one higher. If two rivers with different stream orders merge, the resulting stream is given the higher of the two numbers.

Integer.
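
The merging rule in the description can be written as a small function. This is only an illustration of the Strahler rule at a single confluence; the function name is hypothetical, and StreamOrder values in the dataset are provided ready-made rather than computed this way.

```r
# Strahler order of the stream formed where two streams merge
strahler_confluence <- function(order_a, order_b) {
  if (order_a == order_b) {
    order_a + 1L            # equal orders: increase by one
  } else {
    max(order_a, order_b)   # unequal orders: keep the higher one
  }
}

two_first_order <- strahler_confluence(1L, 1L)
mixed_order     <- strahler_confluence(3L, 1L)
```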

WettedWidth

Description: Provided for EPA samples only. Measured distance from left to right bank of the stream channel filled with water. Meters.

Numeric.

PredictedWettedWidth

Description: Predicted distance from left to right bank of the stream channel filled with water. Provided for most sites. Meters. Gathered from StreamCat and Doyle, Jessie M., Hill, Ryan A., Leibowitz, Scott G. and Ebersole, Joseph L. 2023. "Random forest models to estimate bankfull and low flow channel widths and depths across the conterminous United States." JAWRA Journal of the American Water Resources Association 59(5): 1099–1114. https://doi.org/10.1111/1752-1688.13116.

Numeric.

NARS_Ecoregion

Description: Ecoregions are regions that contain similar environmental characteristics, including climate, vegetation, and geology. The nine NARS ecoregions are aggregations of Omernik Level III ecoregions.

Categorical. Levels:

  • CPL: Coastal Plains
  • NAP: Northern Appalachians
  • NPL: Northern Plains
  • SAP: Southern Appalachians
  • SPL: Southern Plains
  • TPL: Temperate Plains
  • UMW: Upper Midwest
  • WMT: Western Mountains
  • XER: Xeric

SampleTypeCode

Description: Code indicating the sample collection method.

Categorical. Levels:

Inverts:

Fish:

Data included in invertebrate data only

AreaSampTot_m2

Description: Estimated total area sampled, in square meters.

Numeric.

FieldSplitRatio

Description: For USGS samples only, the proportion of the field-collected sample that was sent to the lab for identification. See “Getting Started” vignette for more information.

Numeric. 0-1

LabSubsamplingRatio

Description: Proportion of the lab sample that was actually identified (proportion of grids that were used to reach target identification count). See “Getting Started” vignette for more information.

Numeric. 0-1

PropID

Description: Proportion of the sample that was identified. For USGS samples, FieldSplitRatio * LabSubsamplingRatio. For EPA samples, LabSubsamplingRatio. See “Getting Started” vignette for more information.

Numeric. 0-1
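
The PropID rule above can be checked with a few lines of R. This is a minimal sketch on a made-up two-row data frame; the column PropID_check is hypothetical, and it is an assumption that FieldSplitRatio is NA for EPA samples.

```r
# Made-up example rows: one USGS sample and one EPA sample
samples <- data.frame(
  Agency = c("USGS", "EPA"),
  FieldSplitRatio = c(0.5, NA),        # assumed missing for EPA samples
  LabSubsamplingRatio = c(0.25, 0.4)
)

# USGS: field split times lab subsampling; EPA: lab subsampling only
samples$PropID_check <- ifelse(
  samples$Agency == "USGS",
  samples$FieldSplitRatio * samples$LabSubsamplingRatio,
  samples$LabSubsamplingRatio
)
```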

Gen_ID_Prop

Description: Proportion of the individuals identified in the lab that were successfully identified to at least the Genus taxonomic level.

Numeric. 0-1

Data included in fish data only

FishCollection

Description: Indicates whether any fish were collected when fish sampling was conducted at a site. “No Fish Collected” identifies instances of true zero richness and abundance/density.

Categorical. Levels:

  • Fish Collected
  • No Fish Collected

ReachLengthFished_m

Description: Length of stream covered by fishing activities conducted for this sample. ReachLengthFished_m is equal to the curvilinear reach length of the assigned study reach.

Numeric.

SampleMethod

Description: Method of sample collection for fish. All electrofishing (boats, backpack, etc.) is classified as “Shocking”. All netting has been classified as “Seine”. Snorkel samples have been classified as “Snorkel”.

Categorical. Levels:

  • Shocking
  • Seine
  • Snorkel

MethodEffort

Description: The sampling effort for each sample (values summed across rounds if multiple rounds of a single sampling method occurred for a given sampling event).

Numeric.

MethodEffort_units

Description: The units for sampling effort for each sample. Units are minutes for electroshocking, number of hauls for seine netting, and number of transects for snorkel samples.

Categorical.

NLCD Data

SiteNumber

Description: Unique site identifiers. For USGS data, SiteNumber is the USGS National Water Information System (NWIS) 8 to 15-digit identifier for the place where the sample was collected (preceded by “USGS-”). For EPA data, SiteNumber is the NARS “SITE_ID” variable, with the preferred form of NRS_MM-xxxxx, where MM is the two-letter state code, and xxxxx is between 10001 and 99999. Where available, the EPA unique site ID (UNIQUE_ID) is used in place of SITE_ID to link sites temporally. SITE_IDs from individual rounds of NRSA sampling (WSA/EMAP included) were linked to a site-level UNIQUE_ID, so that sites that were resampled through time could be linked. If a site could not be linked to a UNIQUE_ID, the survey-specific SITE_ID is used instead.

Categorical. EPA preferred form is NRS_MM-xxxxx, where MM is two letter state code, and xxxxx is between 10001 and 99999. USGS form is USGS-xxxxxxxx, where xxxxxxxx is the 8 to 15-digit USGS NWIS identifier.

CollectionYear

Description: Year in which sample was collected (YYYY)

Integer.

Ungrouped NLCD columns (group = FALSE)

PCTURBMD[_AOI]

Description: Percent of area of interest (AOI) classified as developed, medium-intensity land use (NLCD class 23).

Available area of interest (AOI): Cat, Ws (_Cat or _Ws follows column specifying user input area of interest).

Continuous. Percent.

PCTMXFST[_AOI]

Description: Percent of area of interest (AOI) classified as mixed deciduous/evergreen forest land cover (NLCD class 43).

Available area of interest (AOI): Cat, Ws (_Cat or _Ws follows column specifying user input area of interest)

Continuous. Percent.

PCTHAY[_AOI]

Description: Percent of area of interest (AOI) classified as hay land use (NLCD class 81).

Available area of interest (AOI): Cat, Ws (_Cat or _Ws follows column specifying user input area of interest)

Continuous. Percent.

PCTURBLO[_AOI]

Description: Percent of area of interest (AOI) classified as developed, low-intensity land use (NLCD class 22).

Available area of interest (AOI): Cat, Ws (_Cat or _Ws follows column specifying user input area of interest)

Continuous. Percent.

PCTSHRB[_AOI]

Description: Percent of area of interest (AOI) classified as shrub/scrub land cover (NLCD class 52).

Available area of interest (AOI): Cat, Ws (_Cat or _Ws follows column specifying user input area of interest)

Continuous. Percent.

PCTURBHI[_AOI]

Description: Percent of area of interest (AOI) classified as developed, high-intensity land use (NLCD class 24).

Available area of interest (AOI): Cat, Ws (_Cat or _Ws follows column specifying user input area of interest)

Continuous. Percent.

PCTHBWET[_AOI]

Description: Percent of area of interest (AOI) classified as herbaceous wetland land cover (NLCD class 95).

Available area of interest (AOI): Cat, Ws (_Cat or _Ws follows column specifying user input area of interest)

Continuous. Percent.

PCTCROP[_AOI]

Description: Percent of area of interest (AOI) classified as crop land use (NLCD class 82).

Available area of interest (AOI): Cat, Ws (_Cat or _Ws follows column specifying user input area of interest)

Continuous. Percent.

PCTGRS[_AOI]

Description: Percent of area of interest (AOI) classified as grassland/herbaceous land cover (NLCD class 71).

Available area of interest (AOI): Cat, Ws (_Cat or _Ws follows column specifying user input area of interest)

Continuous. Percent.

PCTOW[_AOI]

Description: Percent of area of interest (AOI) classified as open water land cover (NLCD class 11).

Available area of interest (AOI): Cat, Ws (_Cat or _Ws follows column specifying user input area of interest)

Continuous. Percent.

PCTBL[_AOI]

Description: Percent of area of interest (AOI) classified as barren land cover (NLCD class 31).

Available area of interest (AOI): Cat, Ws (_Cat or _Ws follows column specifying user input area of interest)

Continuous. Percent.

PCTCONIF[_AOI]

Description: Percent of area of interest (AOI) classified as evergreen forest land cover (NLCD class 42).

Available area of interest (AOI): Cat, Ws (_Cat or _Ws follows column specifying user input area of interest)

Continuous. Percent.

PCTDECID[_AOI]

Description: Percent of area of interest (AOI) classified as deciduous forest land cover (NLCD class 41).

Available area of interest (AOI): Cat, Ws (_Cat or _Ws follows column specifying user input area of interest)

Continuous. Percent.

PCTURBOP[_AOI]

Description: Percent of area of interest (AOI) classified as developed, open space land use (NLCD class 21).

Available area of interest (AOI): Cat, Ws (_Cat or _Ws follows column specifying user input area of interest)

Continuous. Percent.

PCTWDWET[_AOI]

Description: Percent of area of interest (AOI) classified as woody wetland land cover (NLCD class 90).

Available area of interest (AOI): Cat, Ws (_Cat or _Ws follows column specifying user input area of interest)

Continuous. Percent.

Grouped NLCD Columns (group = TRUE)

PctCrop[_AOI]

Description: Percent of area of interest (AOI) classified as crop land use (NLCD class 82).

Available area of interest (AOI): Cat, Ws (_Cat or _Ws follows column specifying user input area of interest)

Continuous. Percent.

PctFst[_AOI]

Description: Percent of area of interest (AOI) classified as deciduous forest, evergreen forest, or mixed deciduous/evergreen forest land covers (NLCD classes 41, 42, 43). Values are summed percentages across these land cover classes.

Available area of interest (AOI): Cat, Ws (_Cat or _Ws follows column specifying user input area of interest)

Continuous. Percent.

PctOpn[_AOI]

Description: Percent of area of interest (AOI) classified as shrub/scrub, barren, grassland/herbaceous, or hay land covers/uses (NLCD classes 31, 52, 71, 81). Values are summed percentages across these land cover/use classes.

Available area of interest (AOI): Cat, Ws (_Cat or _Ws follows column specifying user input area of interest)

Continuous. Percent.

PctUrb[_AOI]

Description: Percent of area of interest (AOI) classified as developed, low-intensity; developed, medium-intensity; developed, high-intensity; or developed, open space land uses (NLCD classes 21, 22, 23, 24). Values are summed percentages across these land use classes.

Available area of interest (AOI): Cat, Ws (_Cat or _Ws follows column specifying user input area of interest)

Continuous. Percent.

PctWater[_AOI]

Description: Percent of area of interest (AOI) classified as open water, woody wetland, or herbaceous wetland land covers (NLCD classes 11, 90, 95). Values are summed percentages across these land cover classes.

Available area of interest (AOI): Cat, Ws (_Cat or _Ws follows column specifying user input area of interest)

Continuous. Percent.
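
To make the relationship between the grouped and ungrouped columns concrete, the sketch below rebuilds the grouped values from the ungrouped ones, following the class groupings listed above. The data frame is made up, the _Ws suffix is used only as an example AOI, and this is not necessarily how the package computes these columns internally.

```r
# Made-up ungrouped NLCD percentages for one site (watershed AOI)
nlcd <- data.frame(
  PCTDECID_Ws = 30, PCTCONIF_Ws = 10, PCTMXFST_Ws = 5,
  PCTSHRB_Ws = 4, PCTBL_Ws = 1, PCTGRS_Ws = 6, PCTHAY_Ws = 9,
  PCTURBOP_Ws = 8, PCTURBLO_Ws = 5, PCTURBMD_Ws = 3, PCTURBHI_Ws = 1,
  PCTOW_Ws = 2, PCTWDWET_Ws = 4, PCTHBWET_Ws = 2,
  PCTCROP_Ws = 10
)

# Sum the ungrouped classes into the five grouped columns
grouped <- with(nlcd, data.frame(
  PctCrop_Ws  = PCTCROP_Ws,                                        # class 82
  PctFst_Ws   = PCTDECID_Ws + PCTCONIF_Ws + PCTMXFST_Ws,           # 41, 42, 43
  PctOpn_Ws   = PCTBL_Ws + PCTSHRB_Ws + PCTGRS_Ws + PCTHAY_Ws,     # 31, 52, 71, 81
  PctUrb_Ws   = PCTURBOP_Ws + PCTURBLO_Ws + PCTURBMD_Ws + PCTURBHI_Ws, # 21-24
  PctWater_Ws = PCTOW_Ws + PCTWDWET_Ws + PCTHBWET_Ws               # 11, 90, 95
))
```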