Fit random forest residual spatial linear models for areal data (i.e., spatial autoregressive models) using random forest to fit the mean and a spatial linear model to fit the residuals. The spatial linear model fit to the residuals can incorporate a variety of estimation methods, allowing for random effects, partition factors, and row standardization.
spautorRF(formula, data, ...)
A two-sided linear formula describing the fixed effect structure of the model, with the response to the left of the ~ operator and the terms on the right, separated by + operators.
A data frame or sf object that contains the variables in formula, random, and partition_factor, as well as geographical information. If an sf object is provided with POINT geometries, the x-coordinates and y-coordinates are used directly. If an sf object is provided with POLYGON geometries, the x-coordinates and y-coordinates are taken as the centroids of each polygon.
Additional named arguments to ranger::ranger() or spautor().
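As a rough illustration of supplying additional arguments in a single call, the sketch below passes num.trees (an argument of ranger::ranger()) alongside spcov_type (an argument of spautor()). The value of 200 trees and the object name sprf_200 are illustrative choices only, and the sketch assumes the spmodel and ranger packages are installed and that seal is the data set used in the example on this page.

```r
library(spmodel)  # provides spautorRF(), spautor(), and the seal data

set.seed(1)
seal$var <- rnorm(NROW(seal))  # noise covariate, mirroring the page's example

# num.trees is intended for ranger::ranger(); spcov_type is intended for spautor()
sprf_200 <- spautorRF(
  log_trend ~ var,
  data = seal,
  spcov_type = "car",
  num.trees = 200
)

# number of trees recorded on the fitted random forest object
sprf_200$ranger$num.trees
```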
A list with several elements to be used with predict(). These elements include the function call (named call), the random forest object fit to the mean (named ranger), the spatial linear model object fit to the residuals (named spautor or spautor_list), and an object that can contain data for locations at which to predict (named newdata). The newdata object contains the set of observations in data whose response variable is NA.
If spcov_type or spcov_initial (which are passed to spautor()) have length one, the list has class spautorRF and the spatial linear model object fit to the residuals is called spautor, which has class spautor. If spcov_type or spcov_initial have length greater than one, the list has class spautorRF_list and the spatial linear model object fit to the residuals is called spautor_list, which has class spautor_list and contains several objects, each with class spautor.
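The two return shapes described above can be compared with a short sketch. It assumes the spmodel and ranger packages are installed and uses the seal data from the example on this page. The object names sprf_one and sprf_many are hypothetical, and "sar" is used only as a second covariance type for illustration.

```r
library(spmodel)  # provides spautorRF() and the seal data

set.seed(1)
seal$var <- rnorm(NROW(seal))  # noise covariate, mirroring the page's example

# one covariance type: a single residual model stored as `spautor`
sprf_one <- spautorRF(log_trend ~ var, data = seal, spcov_type = "car")
class(sprf_one)
class(sprf_one$spautor)

# two covariance types: residual models stored together as `spautor_list`
sprf_many <- spautorRF(log_trend ~ var, data = seal, spcov_type = c("car", "sar"))
class(sprf_many)
length(sprf_many$spautor_list)  # one residual model per covariance type
```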
The random forest residual spatial linear model is described by Fox et al. (2020). A random forest model is fit to the mean portion of the model specified by formula using ranger::ranger(). Residuals are computed and used as the response variable in an intercept-only spatial linear model fit using spautor(). This model object is intended for use with predict() to perform prediction, also called random forest regression Kriging.
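The steps described above can be sketched by hand to show how the two pieces fit together. This is not the internal code of spautorRF(), only a minimal approximation under stated assumptions: residuals are computed from ranger's out-of-bag predictions (this page does not say which predictions the function uses), the random forest is fit only to rows with an observed response, and the names seal_df, rf, rf_resid, resid_mod, and rfrk_pred are hypothetical.

```r
library(spmodel)  # provides spautor() and the seal data
library(ranger)   # random forest implementation
library(sf)       # st_drop_geometry()

set.seed(1)
seal$var <- rnorm(NROW(seal))          # noise covariate, mirroring the page's example
obs <- !is.na(seal$log_trend)          # rows with an observed response
seal_df <- sf::st_drop_geometry(seal)  # plain data frame for ranger

# Step 1: random forest for the mean, fit to the observed rows
rf <- ranger::ranger(log_trend ~ var, data = seal_df[obs, c("log_trend", "var")])

# Step 2: residuals from out-of-bag predictions (an assumption of this sketch);
# rows with a missing response keep an NA residual
seal$rf_resid <- NA_real_
seal$rf_resid[obs] <- seal_df$log_trend[obs] - rf$predictions

# Step 3: intercept-only spatial autoregressive model fit to the residuals
resid_mod <- spautor(rf_resid ~ 1, data = seal, spcov_type = "car")

# Step 4: prediction at rows with a missing response is the random forest
# prediction plus the predicted residual
rf_pred <- predict(rf, data = seal_df[!obs, "var", drop = FALSE])$predictions
resid_pred <- predict(resid_mod)  # predicts at rows where rf_resid is NA
rfrk_pred <- rf_pred + resid_pred
```

Because random forests are fit with random resampling, results from a sketch like this will differ from run to run unless a seed is set, and they need not match the output of predict() on a spautorRF object exactly.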
Fox, E. W., Ver Hoef, J. M., & Olsen, A. R. (2020). Comparing spatial regression to random forests for large environmental data sets. PLoS ONE, 15(3), e0229509.
# \donttest{
seal$var <- rnorm(NROW(seal)) # add noise variable
sprfmod <- spautorRF(log_trend ~ var, data = seal, spcov_type = "car")
predict(sprfmod)
#> 1 9 13 15 18 19
#> 0.191289243 0.284725478 0.098474510 0.213170388 0.061184057 0.048046860
#> 27 32 36 40 42 43
#> -0.216770970 -0.002182298 0.015682916 -0.055002079 -0.003323380 -0.005652809
#> 44 46 47 48 49 50
#> -0.179721107 -0.082635754 -0.004155727 0.047340625 -0.035348073 -0.032214608
#> 51 52 53 54 55 56
#> -0.258981418 -0.026285442 -0.026205602 -0.002136687 0.228756727 -0.014099922
#> 57 58 61 62 76 77
#> 0.011307237 0.017018646 0.022255174 0.035646318 -0.039739885 -0.002533666
#> 79 87 91 92 101 105
#> -0.205493862 -0.099810849 0.058526491 -0.031632368 0.068037245 -0.083526612
#> 108 130 131 132 133 134
#> 0.106377285 -0.171354511 0.001556021 0.053753664 -0.136405269 0.069645654
#> 135 136 137 138 139 140
#> 0.386933047 -0.103309942 0.118536432 0.080583114 -0.014424908 0.089068032
#> 141 142 143 144 145 146
#> 0.035438158 -0.042671779 -0.019594251 -0.099229070 0.020954444 -0.025260675
#> 147
#> 0.088093531
# }