Augment data with information from fitted model objects

Augment accepts a fitted model object and a data set and adds information about each observation in the data set. New columns always begin with a . prefix to avoid overwriting columns in the original data set.

Augment behaves differently depending on whether the original data or new data is being augmented. To augment the original data, specify only the fitted model object; to augment new data, specify both the fitted model object and newdata. When augmenting the original data, diagnostic statistics are added to each row in the data set. When augmenting new data, predictions and, optionally, standard errors or intervals (confidence or prediction) are added to each row in the new data set.

Usage

# S3 method for ssn_lm
augment(
  x,
  drop = TRUE,
  newdata = NULL,
  se_fit = FALSE,
  interval = c("none", "confidence", "prediction"),
  level = 0.95,
  ...
)

# S3 method for ssn_glm
augment(
  x,
  drop = TRUE,
  newdata = NULL,
  type = c("link", "response"),
  se_fit = FALSE,
  interval = c("none", "confidence", "prediction"),
  newdata_size,
  level = 0.95,
  var_correct = TRUE,
  ...
)

Arguments

x: A fitted model object from ssn_lm() or ssn_glm().
drop: A logical indicating whether to drop extra variables in the fitted model object x when augmenting. The default for drop is TRUE. drop is ignored if augmenting newdata.
newdata: A vector that contains the names of the prediction sf objects from the original ssn.object requiring prediction. All of the original explanatory variables used to create the fitted model object x must be present in each prediction sf object represented by newdata. Defaults to NULL, which indicates that nothing has been passed to newdata and augmenting occurs for the original data. The value "ssn" is shorthand for specifying all prediction sf objects.
se_fit: Logical indicating whether or not a .se.fit column should be added to augmented output. Passed to predict() and defaults to FALSE.
interval: Character indicating the type of interval columns ("confidence" or "prediction") to add to the augmented newdata output. Passed to predict() and defaults to "none".
level: Tolerance/confidence level. The default is 0.95.
...: Additional arguments to predict() when augmenting newdata.
type: The scale ("link" or "response") of predictions obtained when augmenting newdata using ssn_glm objects. Defaults to "link".
newdata_size: The size value for each observation in newdata used when predicting for the binomial family.
var_correct: A logical indicating whether to return the corrected prediction variances when predicting via models fit using ssn_glm. The default is TRUE.

Value

When augmenting the original data set, a tibble with additional columns:

.fitted: Fitted value
.resid: Response residual (the difference between observed and fitted values)
.hat: Leverage (diagonal of the hat matrix)
.cooksd: Cook's distance
.std.resid: Standardized residuals
.se.fit: Standard error of the fitted value
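
The effect of drop on the original-data output can be sketched as follows. This is a minimal illustration, not a definitive recipe: it assumes the functions on this page come from the SSN2 package and reuses the bundled MiddleFork04 data and the model from the Examples section.

```r
library(SSN2) # assumption: the package that provides ssn_lm() and augment()

# Copy the example .ssn data to a temporary directory and import it
copy_lsn_to_temp()
temp_path <- paste0(tempdir(), "/MiddleFork04.ssn")
mf04p <- ssn_import(temp_path, predpts = "CapeHorn", overwrite = TRUE)

ssn_mod <- ssn_lm(
  formula = Summer_mn ~ ELEV_DEM,
  ssn.object = mf04p,
  tailup_type = "exponential",
  additive = "afvArea"
)

# drop = TRUE (the default) keeps the model variables plus the diagnostics
aug_default <- augment(ssn_mod)

# drop = FALSE also keeps the other variables stored with the observed data
aug_full <- augment(ssn_mod, drop = FALSE)

# Columns that appear only when drop = FALSE
setdiff(names(aug_full), names(aug_default))
```

Both calls return one row per observation; only the set of non-diagnostic columns differs.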

When augmenting a new data set, a tibble with additional columns:

.fitted: Predicted (or fitted) value
.lower: Lower bound on interval
.upper: Upper bound on interval
.se.fit: Standard error of the predicted (or fitted) value
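
The new-data columns listed above can be requested as in the following sketch. It assumes the SSN2 package and the bundled MiddleFork04 data used in the Examples section; the 0.90 level is an arbitrary choice for illustration.

```r
library(SSN2) # assumption: the package that provides ssn_lm() and augment()

copy_lsn_to_temp()
temp_path <- paste0(tempdir(), "/MiddleFork04.ssn")
mf04p <- ssn_import(temp_path, predpts = "CapeHorn", overwrite = TRUE)

ssn_mod <- ssn_lm(
  formula = Summer_mn ~ ELEV_DEM,
  ssn.object = mf04p,
  tailup_type = "exponential",
  additive = "afvArea"
)

# Request 90% prediction intervals: adds .lower and .upper next to .fitted
aug_pi <- augment(
  ssn_mod,
  newdata = "CapeHorn",
  interval = "prediction",
  level = 0.90 # illustrative value; the default is 0.95
)

# Request standard errors instead: adds .se.fit next to .fitted
aug_se <- augment(ssn_mod, newdata = "CapeHorn", se_fit = TRUE)

# Inspect the added columns (the sf geometry column is retained automatically)
aug_pi[, c(".fitted", ".lower", ".upper")]
aug_se[, c(".fitted", ".se.fit")]
```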

When predictions for all prediction objects are requested, the output is a list with one element per prediction object. Each element is named after its prediction object and contains the augmented predictions for that object.

Details

augment() returns a tibble as an sf object.

Missing response values from the original data can be augmented as if they were a newdata object by providing ".missing" to the newdata argument.
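
A minimal sketch of the ".missing" workflow follows. It assumes the SSN2 package and the bundled MiddleFork04 data. Setting three responses to NA is a hypothetical step added purely for illustration; it is not part of the original examples.

```r
library(SSN2) # assumption: the package that provides ssn_lm() and augment()

copy_lsn_to_temp()
temp_path <- paste0(tempdir(), "/MiddleFork04.ssn")
mf04p <- ssn_import(temp_path, predpts = "CapeHorn", overwrite = TRUE)

# Hypothetical step: blank out three observed responses so that the
# fitted model has missing response values to augment
mf04p$obs$Summer_mn[1:3] <- NA

ssn_mod_na <- ssn_lm(
  formula = Summer_mn ~ ELEV_DEM,
  ssn.object = mf04p,
  tailup_type = "exponential",
  additive = "afvArea"
)

# Treat the rows with missing responses as if they were a newdata object
aug_missing <- augment(ssn_mod_na, newdata = ".missing")
aug_missing$.fitted
```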

Examples

# Copy the mf04p .ssn data to a local directory and read it into R
# When modeling with your .ssn object, you will load it using the relevant
# path to the .ssn data on your machine
copy_lsn_to_temp()
temp_path <- paste0(tempdir(), "/MiddleFork04.ssn")
mf04p <- ssn_import(temp_path, predpts = "CapeHorn", overwrite = TRUE)

ssn_mod <- ssn_lm(
  formula = Summer_mn ~ ELEV_DEM,
  ssn.object = mf04p,
  tailup_type = "exponential",
  additive = "afvArea"
)
augment(ssn_mod)
#> Simple feature collection with 45 features and 8 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -1530805 ymin: 920324.3 xmax: -1503079 ymax: 931036.6
#> Projected CRS: USA_Contiguous_Albers_Equal_Area_Conic
#> # A tibble: 45 × 9
#>    Summer_mn ELEV_DEM .fitted  .resid    .hat   .cooksd .std.resid pid  
#>  *     <dbl>    <int>   <dbl>   <dbl>   <dbl>     <dbl>      <dbl> <chr>
#>  1      14.9     1947    14.4  0.503  0.111   0.0165         0.545 1    
#>  2      14.7     1952    14.2  0.473  0.0557  0.00173        0.249 2    
#>  3      14.6     1958    14.0  0.568  0.0337  0.00658        0.625 3    
#>  4      15.2     1923    15.2 -0.0164 0.0744  0.00893        0.490 4    
#>  5      14.5     1932    14.9 -0.439  0.0202  0.0158        -1.25  5    
#>  6      15.3     1940    14.7  0.634  0.00569 0.000970       0.584 6    
#>  7      15.1     1940    14.7  0.414  0.00162 0.0000507     -0.250 7    
#>  8      14.9     1945    14.5  0.454  0.0574  0.0143        -0.706 8    
#>  9      15.0     1948    14.4  0.607  0.0739  0.00666        0.425 9    
#> 10      15.0     1950    14.3  0.705  0.0581  0.0196         0.821 10   
#> # ℹ 35 more rows
#> # ℹ 1 more variable: geometry <POINT [m]>
augment(ssn_mod, newdata = "CapeHorn")
#> Simple feature collection with 654 features and 19 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -1516634 ymin: 921030.2 xmax: -1512722 ymax: 924632.2
#> Projected CRS: USA_Contiguous_Albers_Equal_Area_Conic
#> # A tibble: 654 × 20
#>      rid   pid    COMID AREAWTMAP   SLOPE ELEV_DEM FlowCMS AirMEANc AirMWMTc
#>  * <int> <int>    <int>     <dbl>   <dbl>    <int>   <dbl>    <dbl>    <dbl>
#>  1    34  1494 23519461     1087. 0.00843     2011    34.8     21.5     35.5
#>  2    34  1495 23519461     1087. 0.00843     2011    34.8     21.5     35.5
#>  3    34  1496 23519461     1087. 0.00843     2011    34.8     21.5     35.5
#>  4    34  1497 23519461     1087. 0.00843     2011    34.8     21.5     35.5
#>  5    34  1498 23519461     1087. 0.00843     2011    34.8     21.5     35.5
#>  6    34  1499 23519461     1087. 0.00843     2011    34.8     21.5     35.5
#>  7    34  1500 23519461     1087. 0.00843     2013    34.8     21.5     35.5
#>  8    34  1501 23519461     1087. 0.00843     2013    34.8     21.5     35.5
#>  9    34  1502 23519461     1087. 0.00843     2013    34.8     21.5     35.5
#> 10    34  1503 23519461     1087. 0.00843     2011    34.8     21.5     35.5
#> # ℹ 644 more rows
#> # ℹ 11 more variables: rcaAreaKm2 <dbl>, h2oAreaKm2 <dbl>, ratio <dbl>,
#> #   snapdist <dbl>, upDist <dbl>, afvArea <dbl>, locID <int>, netID <dbl>,
#> #   netgeom <chr>, .fitted <dbl>, geometry <POINT [m]>
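
The type argument for ssn_glm objects can be illustrated with the sketch below. It assumes the SSN2 package, that ssn_glm() accepts family = "Gamma", and that this family uses a log link; none of these details are stated on this page, so treat them as assumptions.

```r
library(SSN2) # assumption: the package that provides ssn_glm() and augment()

copy_lsn_to_temp()
temp_path <- paste0(tempdir(), "/MiddleFork04.ssn")
mf04p <- ssn_import(temp_path, predpts = "CapeHorn", overwrite = TRUE)

# Assumption: a Gamma model is a reasonable illustration for a positive response
ssn_gmod <- ssn_glm(
  formula = Summer_mn ~ ELEV_DEM,
  ssn.object = mf04p,
  family = "Gamma",
  tailup_type = "exponential",
  additive = "afvArea"
)

# Predictions on the link scale (the first choice in the type argument)
aug_link <- augment(ssn_gmod, newdata = "CapeHorn", type = "link")

# Predictions on the response scale
aug_resp <- augment(ssn_gmod, newdata = "CapeHorn", type = "response")

# Compare the first few predictions on each scale
head(data.frame(link = aug_link$.fitted, response = aug_resp$.fitted))
```

Link-scale predictions are convenient for building intervals, while response-scale predictions are in the units of the observed data.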
