Compute the empirical semivariogram for varying bin sizes and cutoff values.

Usage

esv(
  formula,
  data,
  xcoord,
  ycoord,
  dist_matrix,
  bins = 15,
  cutoff,
  partition_factor
)

Arguments

formula

A formula describing the fixed effect structure.

data

A data frame or sf object containing the variables in formula and geographic information.

xcoord

Name of the variable in data representing the x-coordinate. Can be quoted or unquoted. Not required if data is an sf object.

ycoord

Name of the variable in data representing the y-coordinate. Can be quoted or unquoted. Not required if data is an sf object.

dist_matrix

A distance matrix to be used instead of providing coordinate names.

bins

The number of equally spaced bins. The default is 15.

cutoff

The maximum distance considered. The default is half the diagonal of the bounding box from the coordinates.

partition_factor

An optional formula specifying the partition factor. If specified, semivariances are only computed for observations sharing the same level of the partition factor.

Value

A data frame with one row per distance bin, containing the bin interval (bins), the average distance between pairs in the bin (dist), the empirical semivariance (gamma), and the number of (unique) pairs (np).

Details

The empirical semivariogram is a tool for visualizing and modeling spatial dependence by estimating the semivariance of a process at varying distances. For a constant-mean process, the semivariance at distance \(h\) is denoted \(\gamma(h)\) and defined as \(0.5 * Var(z1 - z2)\), where \(z1\) and \(z2\) are observations separated by distance \(h\). Under second-order stationarity, \(\gamma(h) = Cov(0) - Cov(h)\), where \(Cov(h)\) is the covariance function at distance \(h\).

The residuals from an ordinary least squares fit defined by formula are typically assumed to be second-order stationary with mean zero, and these residuals are used to compute the empirical semivariogram. At a distance \(h\), the empirical semivariance is \(1/(2N(h)) \sum (r1 - r2)^2\), where \(N(h)\) is the number of (unique) pairs of observations whose distance separation is \(h\), and \(r1\) and \(r2\) are the residuals of the two observations in each pair. The factor of one half matches the \(0.5\) in the definition of \(\gamma(h)\).

In spmodel, each distance bin actually contains the pairs whose distance separation is \(h \pm c\), where \(c\) is a constant determined implicitly by bins. Typically, only pairs whose distance separation is below some cutoff are used to compute the empirical semivariogram (this cutoff is determined by cutoff).
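The binning described above can be sketched in base R. This is only an illustration under stated assumptions, not spmodel's internal code: esv_sketch() is a hypothetical helper, it expects a plain data frame with quoted coordinate names, it uses the classical estimator with the one-half factor from the definition of \(\gamma(h)\), and it drops empty bins and pairs at distance zero. Whether esv() handles those edge cases the same way is not confirmed here.

```r
# Hypothetical helper (not part of spmodel): a base-R sketch of a binned
# empirical semivariogram computed from OLS residuals.
esv_sketch <- function(formula, data, xcoord, ycoord, bins = 15, cutoff = NULL) {
  # Residuals from the ordinary least squares fit defined by `formula`
  r <- residuals(lm(formula, data = data))
  x <- data[[xcoord]]
  y <- data[[ycoord]]

  # Default cutoff: half the diagonal of the coordinates' bounding box
  if (is.null(cutoff)) {
    cutoff <- 0.5 * sqrt(diff(range(x))^2 + diff(range(y))^2)
  }

  # Pairwise distances and squared residual differences
  h <- as.matrix(dist(cbind(x, y)))
  sqdiff <- outer(r, r, "-")^2

  # Keep each unique pair once (upper triangle), within the cutoff
  keep <- upper.tri(h) & h > 0 & h <= cutoff
  h <- h[keep]
  sqdiff <- sqdiff[keep]

  # Equally spaced, right-closed bins on (0, cutoff]
  bin <- cut(h, breaks = seq(0, cutoff, length.out = bins + 1))

  out <- data.frame(
    bins = levels(bin),
    dist = as.vector(tapply(h, bin, mean)),
    # Semivariance per bin: sum of squared differences / (2 * number of pairs)
    gamma = as.vector(tapply(sqdiff, bin, function(d) sum(d) / (2 * length(d)))),
    np = as.vector(table(bin))
  )
  out[out$np > 0, ]
}

# Toy data (simulated, for illustration only)
set.seed(1)
toy <- data.frame(x = runif(50), y = runif(50), z = rnorm(50))
esv_sketch(z ~ 1, toy, "x", "y", bins = 5)
```

Increasing bins narrows each distance class (a smaller \(c\)) at the cost of fewer pairs per bin, so the np column is a useful check on how stable each gamma estimate is likely to be.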

When using splm() with estmethod as "sv-wls", the empirical semivariogram is calculated internally and used to estimate spatial covariance parameters.

Examples

esv(sulfate ~ 1, sulfate)
#>                   bins      dist     gamma   np
#> 1          (0,1.5e+05]  103340.3  18.04594  149
#> 2   (1.5e+05,3.01e+05]  232013.8  20.28099  456
#> 3  (3.01e+05,4.51e+05]  379254.7  27.63260  749
#> 4  (4.51e+05,6.02e+05]  529542.7  31.65651  887
#> 5  (6.02e+05,7.52e+05]  677949.1  43.28972  918
#> 6  (7.52e+05,9.03e+05]  826916.7  41.26845 1113
#> 7  (9.03e+05,1.05e+06]  978773.3  46.58159 1161
#> 8   (1.05e+06,1.2e+06] 1127232.1  51.05177 1230
#> 9   (1.2e+06,1.35e+06] 1275414.7  58.81009 1239
#> 10  (1.35e+06,1.5e+06] 1429183.9  71.88921 1236
#> 11  (1.5e+06,1.65e+06] 1577636.1  79.03967 1139
#> 12 (1.65e+06,1.81e+06] 1729098.3  94.49986 1047
#> 13 (1.81e+06,1.96e+06] 1879678.7  99.49936  934
#> 14 (1.96e+06,2.11e+06] 2029566.3 113.57088  842
#> 15 (2.11e+06,2.26e+06] 2181336.7 125.05567  788