Augment data with information from fitted model objects
Source: R/augment.R, R/augment_glm.R
augment.spmodel.Rd
Augment accepts a fitted model object and a data set and adds
information about each observation in the data set. New columns always
begin with a . prefix to avoid overwriting columns in the original
data set.
Augment behaves differently depending on whether the original data or new data
requires augmenting. Typically, when augmenting the original data, only the fitted
model object is specified, and when augmenting new data, both the fitted model
object and newdata are specified. When augmenting the original data, diagnostic
statistics are added to each row in the data set. When augmenting new data,
predictions and optional intervals or standard errors are added to each
row in the new data set.
Usage
# S3 method for splm
augment(
x,
drop = TRUE,
newdata = NULL,
se_fit = FALSE,
interval = c("none", "confidence", "prediction"),
level = 0.95,
local,
...
)
# S3 method for spautor
augment(
x,
drop = TRUE,
newdata = NULL,
se_fit = FALSE,
interval = c("none", "confidence", "prediction"),
level = 0.95,
local,
...
)
# S3 method for spglm
augment(
x,
drop = TRUE,
newdata = NULL,
type = c("link", "response"),
se_fit = FALSE,
interval = c("none", "confidence", "prediction"),
newdata_size,
level = 0.95,
local = local,
var_correct = TRUE,
...
)
# S3 method for spgautor
augment(
x,
drop = TRUE,
newdata = NULL,
type = c("link", "response"),
se_fit = FALSE,
interval = c("none", "confidence", "prediction"),
newdata_size,
level = 0.95,
local,
var_correct = TRUE,
...
)
Arguments
- x
A fitted model object.
- drop
A logical indicating whether to drop extra variables in the fitted model object x when augmenting. The default for drop is TRUE. drop is ignored if augmenting newdata.
- newdata
A data frame or tibble containing observations requiring prediction. All of the original explanatory variables used to create the fitted model object x must be present in newdata. Defaults to NULL, which indicates that nothing has been passed to newdata.
- se_fit
Logical indicating whether or not a .se.fit column should be added to augmented output. Passed to predict() and defaults to FALSE.
- interval
Character indicating the type of interval columns ("none", "confidence", or "prediction") to add to the augmented newdata output. Passed to predict() and defaults to "none".
- level
Tolerance/confidence level. The default is 0.95.
- local
A list or logical. If a list, specific list elements described in predict.spmodel() control the big data approximation behavior. If a logical, TRUE chooses default list elements for the list version of local as specified in predict.spmodel(). Defaults to FALSE, which performs exact computations.
- ...
Other arguments. Not used (needed for generic consistency).
- type
The scale (response or link) of predictions obtained using spglm() or spgautor() objects.
- newdata_size
The size value for each observation in newdata used when predicting for the binomial family.
- var_correct
A logical indicating whether to return the corrected prediction variances when predicting via models fit using spglm() or spgautor(). The default is TRUE.
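The se_fit, interval, and level arguments can be combined when augmenting newdata. The following is a minimal sketch, assuming the sulfate and sulfate_preds data sets that ship with spmodel (the same ones used in the Examples section); the 90% level is an arbitrary choice for illustration.

```r
library(spmodel)

# Fit the intercept-only model from the Examples section
spmod_sulf <- splm(sulfate ~ 1, data = sulfate, spcov_type = "exponential")

# Request standard errors and 90% prediction intervals at the new locations
aug_preds <- augment(
  spmod_sulf,
  newdata = sulfate_preds,
  se_fit = TRUE,
  interval = "prediction",
  level = 0.90
)

# Inspect the column names; every column added by augment() starts with "."
names(aug_preds)
```

Leaving interval at its default of "none" and se_fit at FALSE returns only the .fitted column alongside the columns already in newdata.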
Value
When augmenting the original data set, a tibble with additional columns
- .fitted
Fitted value
- .resid
Response residual (the difference between observed and fitted values)
- .hat
Leverage (diagonal of the hat matrix)
- .cooksd
Cook's distance
- .std.resid
Standardized residuals
- .se.fit
Standard error of the fitted value
When augmenting a new data set, a tibble with additional columns
- .fitted
Predicted (or fitted) value
- .lower
Lower bound on interval
- .upper
Upper bound on interval
- .se.fit
Standard error of the predicted (or fitted) value
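For models fit with spglm() or spgautor(), the scale of the .fitted column depends on type. The sketch below is illustrative only: it assumes the moose and moose_preds data sets from spmodel and a binomial model for the presence variable, none of which appear elsewhere on this page.

```r
library(spmodel)

# Assumed example data: moose (observed) and moose_preds (prediction locations)
spgmod <- spglm(presence ~ elev,
  family = "binomial", data = moose, spcov_type = "exponential"
)

# Predictions on the link scale and on the response scale
aug_link <- augment(spgmod, newdata = moose_preds, type = "link")
aug_resp <- augment(spgmod, newdata = moose_preds, type = "response")

# For a binomial model, response-scale predictions are probabilities
range(aug_resp$.fitted)
```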
Details
augment() returns a tibble with the same class as data. That is, if data is
an sf object, then the augmented object (obtained via augment(x)) will be an
sf object as well. When augmenting newdata, the augmented object has the same
class as data.
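The class behavior described above can be checked directly. This is a small sketch using the caribou (a plain tibble) and sulfate (an sf object) data sets from the Examples section; inherits() is base R.

```r
library(spmodel)

# caribou is a plain tibble, so coordinates are given via xcoord and ycoord
spmod <- splm(z ~ water + tarp,
  data = caribou,
  spcov_type = "exponential", xcoord = x, ycoord = y
)

# sulfate is an sf object, so coordinates come from its geometry
spmod_sulf <- splm(sulfate ~ 1, data = sulfate, spcov_type = "exponential")

# Check whether each augmented object is an sf object
inherits(augment(spmod), "sf")
inherits(augment(spmod_sulf), "sf")
```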
Missing response values from the original data can be augmented as if
they were a newdata object by providing x$newdata to the newdata
argument (where x is the name of the fitted model object). This is the
only way to compute predictions for spautor() and spgautor() fitted
model objects.
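As a rough sketch of this workflow, the code below uses the seal data from the Examples section. It assumes that x$newdata holds exactly the rows of the original data whose response is NA, and it compares the two counts rather than stating them.

```r
library(spmodel)

# seal contains polygons whose log_trend response is missing
spmod_seal <- spautor(log_trend ~ 1, data = seal, spcov_type = "car")

# Rows with a missing response are kept in the newdata element of the fit
aug_missing <- augment(spmod_seal, newdata = spmod_seal$newdata)

# Compare the number of predictions with the number of missing responses
n_missing <- sum(is.na(seal$log_trend))
c(predicted = nrow(aug_missing), missing = n_missing)
```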
Examples
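library(spmodel)
# fit a spatial linear model with an exponential covariance
spmod <- splm(z ~ water + tarp,
  data = caribou,
  spcov_type = "exponential", xcoord = x, ycoord = y
)
# augment the original data with diagnostic statistics
augment(spmod)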
#> # A tibble: 30 × 10
#> z water tarp x y .fitted .resid .hat .cooksd .std.resid
#> <dbl> <fct> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 2.42 Y clear 1 6 1.97 0.454 0.116 0.209 2.53
#> 2 2.44 Y shade 2 6 2.25 0.190 0.137 0.0468 1.09
#> 3 1.81 Y none 3 6 2.05 -0.237 0.137 0.0752 -1.38
#> 4 1.97 N clear 4 6 2.05 -0.0838 0.174 0.00211 -0.200
#> 5 2.38 N shade 5 6 2.34 0.0407 0.153 0.0159 0.594
#> 6 2.22 Y none 1 5 2.05 0.177 0.147 0.0434 1.00
#> 7 2.10 N clear 2 5 2.05 0.0512 0.156 0.00936 0.450
#> 8 1.80 Y clear 3 5 1.97 -0.163 0.122 0.0135 -0.624
#> 9 1.96 Y shade 4 5 2.25 -0.290 0.119 0.0642 -1.38
#> 10 2.10 Y none 5 5 2.05 0.0522 0.131 0.0264 0.837
#> # ℹ 20 more rows
# sulfate is an sf object, so the augmented output is an sf object as well
spmod_sulf <- splm(sulfate ~ 1, data = sulfate, spcov_type = "exponential")
augment(spmod_sulf)
#> Simple feature collection with 197 features and 6 fields
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: -2292550 ymin: 386181.1 xmax: 2173345 ymax: 3090370
#> Projected CRS: NAD83 / Conus Albers
#> # A tibble: 197 × 7
#> sulfate .fitted .resid .hat .cooksd .std.resid geometry
#> * <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <POINT [m]>
#> 1 12.9 5.92 7.00 0.00334 0.00161 -0.694 (817738.8 1080571)
#> 2 20.2 5.92 14.2 0.00256 0.00192 0.865 (914593.6 1295545)
#> 3 16.8 5.92 10.9 0.00259 0.000395 0.390 (359574.1 1178228)
#> 4 16.2 5.92 10.3 0.00239 0.000363 0.390 (265331.9 1239089)
#> 5 7.86 5.92 1.93 0.00202 0.00871 -2.07 (304528.8 1453636)
#> 6 15.4 5.92 9.43 0.00201 0.000240 0.345 (162932.8 1451625)
#> 7 0.986 5.92 -4.94 0.00380 0.000966 -0.503 (-1437776 1568022)
#> 8 0.425 5.92 -5.50 0.0138 0.00584 -0.646 (-1572878 1125529)
#> 9 3.58 5.92 -2.34 0.00673 0.0000148 -0.0467 (-1282009 1204889)
#> 10 2.38 5.92 -3.54 0.0123 0.0000139 -0.0335 (-1972775 1464991)
#> # ℹ 187 more rows
# augment new data with predictions
augment(spmod_sulf, newdata = sulfate_preds)
#> Simple feature collection with 100 features and 1 field
#> Geometry type: POINT
#> Dimension: XY
#> Bounding box: xmin: -2283774 ymin: 582930.5 xmax: 1985906 ymax: 3037173
#> Projected CRS: NAD83 / Conus Albers
#> # A tibble: 100 × 2
#> .fitted geometry
#> * <dbl> <POINT [m]>
#> 1 1.62 (-1771413 1752976)
#> 2 24.4 (1018112 1867127)
#> 3 8.95 (-291256.8 1553212)
#> 4 16.5 (1274293 1267835)
#> 5 4.93 (-547437.6 1638825)
#> 6 26.8 (1445080 1981278)
#> 7 2.87 (-1629090 3037173)
#> 8 14.3 (1302757 1039534)
#> 9 1.53 (-1429838 2523494)
#> 10 14.3 (1131970 1096609)
#> # ℹ 90 more rows
# missingness in original data
spmod_seal <- spautor(log_trend ~ 1, data = seal, spcov_type = "car")
augment(spmod_seal)
#> Simple feature collection with 34 features and 6 fields
#> Geometry type: POLYGON
#> Dimension: XY
#> Bounding box: xmin: 980001.5 ymin: 1010815 xmax: 1116002 ymax: 1145054
#> Projected CRS: NAD83 / Alaska Albers
#> # A tibble: 34 × 7
#> log_trend .fitted .resid .hat .cooksd .std.resid geometry
#> * <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <POLYGON [m]>
#> 1 -0.282 -0.0709 -0.211 0.0161 0.0209 -1.13 ((1037002 1039492, 10370…
#> 2 -0.00121 -0.0709 0.0697 0.0473 0.0256 0.718 ((1070158 1030216, 10701…
#> 3 0.0354 -0.0709 0.106 0.0290 0.0183 0.782 ((1054906 1034826, 10549…
#> 4 -0.0160 -0.0709 0.0549 0.0228 0.00157 0.260 ((1025142 1056940, 10251…
#> 5 0.0872 -0.0709 0.158 0.0276 0.0383 1.16 ((1026035 1044623, 10260…
#> 6 -0.266 -0.0709 -0.195 0.0287 0.0530 -1.34 ((1100345 1060709, 11002…
#> 7 0.0743 -0.0709 0.145 0.0491 0.0901 1.32 ((1030247 1029637, 10302…
#> 8 -0.00961 -0.0709 0.0613 0.0122 0.00242 0.442 ((1116002 1024542, 11160…
#> 9 -0.182 -0.0709 -0.111 0.0225 0.0224 -0.986 ((1079864 1025088, 10798…
#> 10 0.00351 -0.0709 0.0744 0.0316 0.0100 0.555 ((1110363 1037056, 11103…
#> # ℹ 24 more rows
# predict the missing responses stored in the fitted model object
augment(spmod_seal, newdata = spmod_seal$newdata)
#> Simple feature collection with 28 features and 2 fields
#> Geometry type: POLYGON
#> Dimension: XY
#> Bounding box: xmin: 913618.8 ymin: 1007542 xmax: 1115097 ymax: 1132682
#> Projected CRS: NAD83 / Alaska Albers
#> # A tibble: 28 × 3
#> log_trend .fitted geometry
#> * <dbl> <dbl> <POLYGON [m]>
#> 1 NA -0.115 ((1035002 1054710, 1035002 1054542, 1035002 1053542, 1035…
#> 2 NA -0.00908 ((1043093 1020553, 1043097 1020550, 1043101 1020550, 1043…
#> 3 NA -0.0602 ((1099737 1054310, 1099752 1054262, 1099788 1054278, 1099…
#> 4 NA -0.0359 ((1099002 1036542, 1099134 1036462, 1099139 1036431, 1099…
#> 5 NA -0.0723 ((1076902 1053189, 1076912 1053179, 1076931 1053179, 1076…
#> 6 NA -0.0548 ((1070501 1046969, 1070317 1046598, 1070308 1046542, 1070…
#> 7 NA -0.0976 ((1072995 1054942, 1072996 1054910, 1072997 1054878, 1072…
#> 8 NA -0.0714 ((960001.5 1127667, 960110.8 1127542, 960144.1 1127495, 9…
#> 9 NA -0.0825 ((1031308 1079817, 1031293 1079754, 1031289 1079741, 1031…
#> 10 NA -0.0592 ((998923.7 1053647, 998922.5 1053609, 998950 1053631, 999…
#> # ℹ 18 more rows