Compute the empirical semivariogram for varying bin sizes and cutoff values.
Arguments
- formula
A formula describing the fixed effect structure.
- data
A data frame or sf object containing the variables in formula and geographic information.
- xcoord
Name of the variable in data representing the x-coordinate. Can be quoted or unquoted. Not required if data is an sf object.
- ycoord
Name of the variable in data representing the y-coordinate. Can be quoted or unquoted. Not required if data is an sf object.
- dist_matrix
A distance matrix to be used instead of providing coordinate names.
- bins
The number of equally spaced bins. The default is 15.
- cutoff
The maximum distance considered. The default is half the diagonal of the bounding box from the coordinates.
- partition_factor
An optional formula specifying the partition factor. If specified, semivariances are only computed for observations sharing the same level of the partition factor.
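
The default cutoff and the equally spaced bins can be illustrated with a small base R sketch. The coordinates below are hypothetical, and the code is only an illustration of "half the diagonal of the bounding box", not spmodel's internal implementation.

```r
# Hypothetical coordinates (not taken from any spmodel data set)
coords <- data.frame(
  x = c(0, 120, 300, 210),
  y = c(0, 400, 90, 250)
)

# Side lengths of the bounding box of the coordinates
x_range <- diff(range(coords$x)) # 300
y_range <- diff(range(coords$y)) # 400

# Half the diagonal of the bounding box
default_cutoff <- sqrt(x_range^2 + y_range^2) / 2 # 250

# Break points for 15 equally spaced bins between 0 and the cutoff
bins <- 15
breaks <- seq(0, default_cutoff, length.out = bins + 1)
```

With 15 bins, this yields 16 break points, so each bin has width default_cutoff / 15.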
Value
A data frame with distance bins (bins), the average distance (dist),
the semivariance (gamma), and the number of (unique) pairs (np).
Details
The empirical semivariogram is a tool used to visualize and model
spatial dependence by estimating the semivariance of a process at varying distances.
For a constant-mean process, the semivariance at distance \(h\) is denoted
\(\gamma(h)\) and defined as \(0.5 * Var(z1 - z2)\), where z1 and z2 are
observations separated by distance \(h\). Under second-order stationarity,
\(\gamma(h) = Cov(0) - Cov(h)\), where \(Cov(h)\) is the covariance function
at distance \(h\). Typically the residuals from an ordinary least squares fit
defined by formula are second-order stationary with mean zero. These residuals
are used to compute the empirical semivariogram.
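
The identity relating the semivariance to the covariance function follows from expanding the variance of a difference. A short derivation, writing z1 and z2 for two observations separated by distance \(h\) and assuming second-order stationarity so that \(Var(z1) = Var(z2) = Cov(0)\) and \(Cov(z1, z2) = Cov(h)\):

```latex
\begin{aligned}
\gamma(h) &= 0.5 \, Var(z1 - z2) \\
          &= 0.5 \, [Var(z1) + Var(z2) - 2 \, Cov(z1, z2)] \\
          &= 0.5 \, [2 \, Cov(0) - 2 \, Cov(h)] \\
          &= Cov(0) - Cov(h)
\end{aligned}
```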
At a distance \(h\), the empirical semivariance is
\(1/(2N(h)) \sum (r1 - r2)^2\), where \(N(h)\) is the number of (unique)
pairs in the set of observations whose distance separation is \(h\), and
r1 and r2 are residuals corresponding to observations whose distance
separation is \(h\). The factor of 2 in the denominator matches the
definition \(0.5 * Var(z1 - z2)\). In spmodel, each distance bin actually
contains pairs of observations whose distance separation is \(h \pm c\),
where c is a constant determined implicitly by bins. Typically, only
observations whose distance separation is below some cutoff are used
to compute the empirical semivariogram (this cutoff is determined by cutoff).
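
The binning steps described above can be sketched in base R. This is a minimal illustration on simulated data, not spmodel's implementation: the data frame dat, its columns x, y, and z, and the object esv_sketch are all hypothetical names. The semivariance is taken as half the average squared residual difference in each bin, consistent with the definition \(0.5 * Var(z1 - z2)\).

```r
# Simulated data; the column names x, y, and z are hypothetical
set.seed(42)
n <- 100
dat <- data.frame(x = runif(n), y = runif(n))
dat$z <- 2 + dat$x + rnorm(n)

# Residuals from an ordinary least squares fit (intercept-only mean)
r <- residuals(lm(z ~ 1, data = dat))

# Pairwise distances and squared residual differences for unique pairs;
# both vectors follow the same pair ordering used by dist()
d <- as.vector(dist(dat[, c("x", "y")]))
sq_diff <- as.vector(dist(r))^2

# Keep pairs within the cutoff and assign them to equally spaced bins
bins <- 15
cutoff <- sqrt(diff(range(dat$x))^2 + diff(range(dat$y))^2) / 2
keep <- d > 0 & d <= cutoff
dist_bin <- cut(d[keep], breaks = seq(0, cutoff, length.out = bins + 1))

# Per-bin average distance, semivariance, and number of unique pairs
esv_sketch <- data.frame(
  bins = levels(dist_bin),
  dist = as.vector(tapply(d[keep], dist_bin, mean)),
  gamma = as.vector(tapply(sq_diff[keep], dist_bin, function(s) mean(s) / 2)),
  np = as.vector(tapply(d[keep], dist_bin, length))
)
esv_sketch
```

Each row of esv_sketch corresponds to one distance bin, mirroring the bins, dist, gamma, and np columns described under Value.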
When using splm() with estmethod as "sv-wls", the empirical
semivariogram is calculated internally and used to estimate spatial
covariance parameters.
Examples
esv(sulfate ~ 1, sulfate)
#>                   bins      dist     gamma   np
#>  1         (0,1.5e+05]  103340.3  18.04594  149
#>  2  (1.5e+05,3.01e+05]  232013.8  20.28099  456
#>  3 (3.01e+05,4.51e+05]  379254.7  27.63260  749
#>  4 (4.51e+05,6.02e+05]  529542.7  31.65651  887
#>  5 (6.02e+05,7.52e+05]  677949.1  43.28972  918
#>  6 (7.52e+05,9.03e+05]  826916.7  41.26845 1113
#>  7 (9.03e+05,1.05e+06]  978773.3  46.58159 1161
#>  8  (1.05e+06,1.2e+06] 1127232.1  51.05177 1230
#>  9  (1.2e+06,1.35e+06] 1275414.7  58.81009 1239
#> 10  (1.35e+06,1.5e+06] 1429183.9  71.88921 1236
#> 11  (1.5e+06,1.65e+06] 1577636.1  79.03967 1139
#> 12 (1.65e+06,1.81e+06] 1729098.3  94.49986 1047
#> 13 (1.81e+06,1.96e+06] 1879678.7  99.49936  934
#> 14 (1.96e+06,2.11e+06] 2029566.3 113.57088  842
#> 15 (2.11e+06,2.26e+06] 2181336.7 125.05567  788