Fit spatial linear models for areal data (i.e., spatial autoregressive models) using a variety of estimation methods, allowing for random effects, partition factors, and row standardization.

```
spautor(
formula,
data,
spcov_type,
spcov_initial,
estmethod = "reml",
random,
randcov_initial,
partition_factor,
W,
row_st = TRUE,
M,
range_positive = TRUE,
cutoff,
...
)
```

- formula
A two-sided linear formula describing the fixed effect structure of the model, with the response to the left of the

`~`

operator and the terms, separated by`+`

operators, on the right.- data
A data frame or

`sf`

object that contains the variables in`fixed`

,`random`

, and`partition_factor`

, as well as potentially geographical information.- spcov_type
The spatial covariance type. Available options include

`"car"`

and`"sar"`

. Parameterizations of each spatial covariance type are available in Details. When`spcov_type`

is specified, relevant spatial covariance parameters are assumed unknown, requiring estimation.`spcov_type`

is not required (and is ignored) if`spcov_initial`

is provided. Multiple values can be provided in a character vector. Then`spautor()`

is called iteratively for each element and a list is returned for each model fit. The default for`spcov_type`

is`"car"`

.- spcov_initial
An object from

`spcov_initial()`

specifying initial and/or known values for the spatial covariance parameters. Not required if`spcov_type`

is provided. Multiple`spcov_initial()`

objects can be provided in a list. Then`spautor()`

is called iteratively for each element and a list is returned for each model fit.- estmethod
The estimation method. Available options include

`"reml"`

for restricted maximum likelihood and`"ml"`

for maximum likelihood The default is`"reml"`

.- random
A one-sided linear formula describing the random effect structure of the model. Terms are specified to the right of the

`~ operator`

. Each term has the structure`x1 + ... + xn | g1/.../gm`

, where`x1 + ... + xn`

specifies the model for the random effects and`g1/.../gm`

is the grouping structure. Separate terms are separated by`+`

and must generally be wrapped in parentheses. Random intercepts are added to each model implicitly when at least one other variable is defined. If a random intercept is not desired, this must be explicitly defined (e.g.,`x1 + ... + xn - 1 | g1/.../gm`

). If only a random intercept is desired for a grouping structure, the random intercept must be specified as`1 | g1/.../gm`

. Note that`g1/.../gm`

is shorthand for`(1 | g1/.../gm)`

. If only random intercepts are desired and the shorthand notation is used, parentheses can be omitted.- randcov_initial
An optional object specifying initial and/or known values for the random effect variances.

- partition_factor
A one-sided linear formula with a single term specifying the partition factor. The partition factor assumes observations from different levels of the partition factor are uncorrelated.

- W
Weight matrix specifying the neighboring structure used. Not required if

`data`

is an`sf`

object wtih`POLYGON`

geometry, as`W`

is calculated internally using queen contiguity. If calculated internally,`W`

is computed using`sf::st_intersects()`

. Also not required if`data`

is an`sf`

object with`POINT`

geometry as long as`cutoff`

is specified.- row_st
A logical indicating whether row standardization be performed on

`W`

. The default is`TRUE`

.- M
`M`

matrix satisfying the car symmetry condition. The car symmetry condition states that \((I - range * W)^{-1}M\) is symmetric, where \(I\) is an identity matrix, \(range\) is a constant that controls the spatial dependence,`W`

is the weights matrix, and \(^{-1}\) represents the inverse operator.`M`

is required for car models when`W`

is provided and`row_st`

is`FALSE`

. When`M`

, is required, the default is the identity matrix.`M`

must be diagonal or given as a vector or one-column matrix assumed to be the diagonal.- range_positive
Whether the range should be constrained to be positive. The default is

`TRUE`

.- cutoff
The numeric, distance-based cutoff used to determine

`W`

when`W`

is not specified. For an`sf`

object with`POINT`

geometry, two locations are considered neighbors if the distance between them is less than or equal to`cutoff`

.- ...
Other arguments to

`stats::optim()`

.

A list with many elements that store information about
the fitted model object. If `spcov_type`

or `spcov_initial`

are
length one, the list has class `spautor`

. Many generic functions that
summarize model fit are available for `spautor`

objects, including
`AIC`

, `AICc`

, `anova`

, `augment`

, `BIC`

, `coef`

,
`cooks.distance`

, `covmatrix`

, `deviance`

, `fitted`

, `formula`

,
`glance`

, `glances`

, `hatvalues`

, `influence`

,
`labels`

, `logLik`

, `loocv`

, `model.frame`

, `model.matrix`

,
`plot`

, `predict`

, `print`

, `pseudoR2`

, `summary`

,
`terms`

, `tidy`

, `update`

, `varcomp`

, and `vcov`

. If
`spcov_type`

or `spcov_initial`

are length greater than one, the
list has class `spautor_list`

and each element in the list has class
`spautor`

. `glances`

can be used to summarize `spautor_list`

objects, and the aforementioned `spautor`

generics can be used on each
individual list element (model fit).

The spatial linear model for areal data (i.e., spatial autoregressive model)
can be written as
\(y = X \beta + \tau + \epsilon\), where \(X\) is the fixed effects design
matrix, \(\beta\) are the fixed effects, \(\tau\) is random error that is
spatially dependent, and \(\epsilon\) is random error that is spatially
independent. Together, \(\tau\) and \(\epsilon\) are modeled using
a spatial covariance function, expressed as
\(de * R + ie * I\), where \(de\) is the dependent error variance, \(R\)
is a matrix that controls the spatial dependence structure among observations,
\(ie\) is the independent error variance, and \(I\) is
an identity matrix. Note that \(de\) and \(ie\) must be non-negative while \(range\)
must be between the reciprocal of the maximum
eigenvalue of `W`

and the reciprocal of the minimum eigenvalue of
`W`

.

`spcov_type`

Details: Parametric forms for \(R\) are given below:

car: \((I - range * W)^{-1}M\), weights matrix \(W\), symmetry condition matrix \(M\)

sar: \([(I - range * W)(I - range * W)^T]^{-1}\), weights matrix \(W\), \(^T\) indicates matrix transpose

If there are observations with no neighbors, they are given a unique variance
parameter called `extra`

, which must be non-negative.

`estmethod`

Details: The various estimation methods are

`reml`

: Maximize the restricted log-likelihood.`ml`

: Maximize the log-likelihood.

By default, all spatial covariance parameters except `ie`

as well as all random effect variance parameters
are assumed unknown, requiring estimation. `ie`

is assumed zero and known by default
(in contrast to models fit using `splm()`

, where `ie`

is assumed
unknown by default). To change this default behavior, specify `spcov_initial`

(an `NA`

value for `ie`

in `spcov_initial`

to assume
`ie`

is unknown, requiring estimation).

`random`

Details: If random effects are used, the model
can be written as \(y = X \beta + Z1u1 + ... Zjuj + \tau + \epsilon\),
where each Z is a random effects design matrix and each u is a random effect.

`partition_factor`

Details: The partition factor can be represented in matrix form as \(P\), where
elements of \(P\) equal one for observations in the same level of the partition
factor and zero otherwise. The covariance matrix involving only the
spatial and random effects components is then multiplied element-wise
(Hadmard product) by \(P\), yielding the final covariance matrix.

Observations with `NA`

response values are removed for model
fitting, but their values can be predicted afterwards by running
`predict(object)`

. This is the only way to perform prediction for
`spautor()`

models (i.e., the prediction locations must be known prior
to estimation).

This function does not perform any internal scaling. If optimization is not stable due to large extremely large variances, scale relevant variables so they have variance 1 before optimization.

```
spmod <- spautor(log_trend ~ 1, data = seal, spcov_type = "car")
summary(spmod)
#>
#> Call:
#> spautor(formula = log_trend ~ 1, data = seal, spcov_type = "car")
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -0.34455 -0.10417 0.04410 0.07338 0.20475
#>
#> Coefficients (fixed):
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) -0.07090 0.02497 -2.839 0.00452 **
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Coefficients (car spatial covariance):
#> de range extra
#> 0.03252 0.42037 0.02177
```