Gaussian Copulas for Large Spatial Fields

Modeling Data-Level Spatial Dependence in Multivariate Generalized Extreme Value Distributions

Brynjólfur Gauti Guðrúnar Jónsson

University of Iceland

Introduction

UKCP Local Projections on a 5km grid over the UK (1980-2080) [1]
Challenge: Modeling maximum daily precipitation in yearly blocks
- 43,920 spatial locations on a 180 x 244 grid
- Four parameters per location as in [2]
  - Location, Trend, Scale, Shape
Two aspects of spatial dependence:
1. GEV parameters (ICAR models)
2. Data-level dependence (Copulas)

Calculating Multivariate Normal Densities

\[ \log f(\mathbf{x}) \propto \frac{1}{2}\left(\log |\mathbf{Q}| - \mathbf{x}^T\mathbf{Q}\mathbf{x}\right) \]

Computational challenges

Log Determinant: \(\log |\mathbf{Q}|\)
- Constant for a given precision matrix
Quadratic Form: \(\mathbf{x}^T\mathbf{Q}\mathbf{x}\)
- Needs calculation for each density evaluation

Spatial Model Considerations

Some models (e.g., ICAR) avoid log determinant calculation
Efficient computation crucial for large-scale applications
Fast algorithms when \(\mathbf{Q}\) is sparse [3–4]

Spatial Models

Conditional Autoregression (CAR) [5]

\(\mathbf{D}\) is a diagonal matrix with \(D_{ii} = n_i\), the number of neighbours of \(i\)
\(\mathbf{A}\) is the adjacency matrix with \(A_{ij} = A_{ji} = 1\) if \(i \sim j\)

\[ \begin{aligned} \mathbf{x} &\sim N(\mathbf{0}, \tau \mathbf{Q}) \\ \mathbf{Q} &= \mathbf{D}\left(\mathbf{I} - \alpha \mathbf{A} \right) \end{aligned} \]

Intrinsic Conditional Autoregression (ICAR) [6]

\(\alpha = 1\), so \(\mathbf Q\) is singular, but constant
Don’t have to calculate \(\log |\mathbf{Q}|\)

\[ \begin{aligned} \mathbf{x} &\sim N(\mathbf{0}, \tau \mathbf{Q}) \\ \mathbf{Q} &= \mathbf{D} - \mathbf{A} \end{aligned} \]

BYM (Besag-York-Mollié) Model [6]

\(\mathbf{u}\) is the structured spatial component (Besag model)
\(\mathbf{v}\) is the unstructured component (i.i.d. normal)

\[ \begin{aligned} \mathbf{x} &= \mathbf{u} + \mathbf{v} \\ \mathbf{u} &\sim \mathrm{ICAR}(\tau_u) \\ \mathbf{v} &\sim N(\mathbf{0}, \tau_v^{-1}) \end{aligned} \]

BYM2 Model [7–8]

\(\rho\) models how much of variance is spatial
\(s\) is a scaling factor chosen to make \(\mathrm{Var}(\mathbf u_i) \approx 1\)

\[ \begin{aligned} \mathbf{x} &= \left(\left(\sqrt{\rho/s}\right)\mathbf{u} + \left(\sqrt{1 - \rho}\right) \mathbf{v} \right)\sigma \\ \mathbf{u} &\sim \mathrm{ICAR}(1) \\ \mathbf{v} &\sim N(\mathbf{0}, n) \end{aligned} \]

Spatial Modeling on Parameter-level [9]

\(\mu\): location parameter
- \(\mu = \mu_0 \left(1 + \Delta \left(t - t_0\right)\right)\)
\(\sigma\): scale parameter
\(\xi\): shape parameter \[ \begin{aligned} \log(\mu_0) = \psi &\sim \mathrm{BYM2}(\mu_\psi, \rho_\psi, \sigma_\psi) \\ \log(\mu_0) - \log(\sigma) = \tau &\sim \mathrm{BYM2}(\mu_\tau, \rho_\tau, \sigma_\tau) \\ f_\xi(\xi) = \phi &\sim \mathrm{BYM2}(\mu_\phi, \rho_\phi, \sigma_\phi) \\ f_\Delta(\Delta) = \gamma &\sim \mathrm{BYM2}(\mu_\gamma, \rho_\gamma, \sigma_\gamma) \end{aligned} \]

From Parameter-level to Data-level Dependence

Parameter-level Dependence

Assumes conditional independence
Biased joint probability estimates
Underestimates parameter variance

Copula

Improves joint probabilities
Enhances spatial risk assessment
Better variance estimates

Sklar’s Theorem: For any multivariate distribution \(H\), there exists a unique copula \(C\) such that:

\[ H(\mathbf x) = C(F_1(x_1), \dots, F_d(x_d)) \]

where \(F_i\) are marginal distributions. We can also write this as a density

\[ h(x) = c(F_1(x_1), \dots, F_d(x_d)) \prod_{i=1}^d f_i(x_i) \]

Our Approach: Matérn-like Gaussian Copula

\[ \begin{gathered} \log h(\mathbf x) = \log c\left(F_1(x_1), \dots, F_d(x_d)\right) + \sum_{i=1}^d \log f_i(x_i) \end{gathered} \]

Marginal CDFs

\(F_i(x_i)\) is \(\mathrm{GEV}(\mu_i, \sigma_i, \xi_i)\)
Can model parameter dependence with BYM2

\[ \begin{aligned} \log h(\mathbf x) &= \log c(u_1, \dots, u_d) \\ &+ \sum_{i=1}^d \log f_{\mathrm{GEV}}(x_i \vert \mu_i, \sigma_i, \xi_i) \\ u_i &= F_{\mathrm{GEV}}(x_i \vert \mu_i, \sigma_i, \xi_i) \end{aligned} \]

Gaussian Copula

Matérn-like precision matrix \(\mathbf{Q}\) [10]
If \(\mathbf{Q} = \mathbf{I}\) simplifies to independent margins
Scaled so \(\boldsymbol{\Sigma} = \mathbf{Q}^{-1}\) is correlation matrix
Need to calculate marginal variances [11–13]
How to generate, scale and compute with \(\mathbf{Q}\) quickly (for MCMC)?

\[ \begin{aligned} \log c(\mathbf u) &\propto \frac{1}{2}\left(\log |\mathbf{Q}| - \mathbf{z}^T\mathbf{Q}\mathbf{z} + \mathbf{z}^T\mathbf{z}\right) \\ \mathbf{z} &= \Phi^{-1}(\mathbf u) \end{aligned} \]

The Precision Matrix

\(\mathbf Q\) defined as Kronecker sum of two AR(1) precision matrices, similar to [10]

\[ \mathbf{Q} = \left( \mathbf{Q}_{\rho_1} \otimes \mathbf{I_{n_2}} + \mathbf{I_{n_1}} \otimes \mathbf{Q}_{\rho_2} \right)^{\nu + 1}, \quad \nu \in \{0, 1, 2\} \]

\[ \mathbf{Q}_{\rho_{1}} = \frac{1}{1-\rho_{1}^2} \begin{bmatrix} 1 & -\rho_{1} & 0 & \cdots & 0 \\ -\rho_{1} & 1+\rho_{1}^2 & -\rho_{1} & \cdots & 0 \\ 0 & -\rho_{1} & 1+\rho_{1}^2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix} \]

\[ \mathbf{Q}_{\rho_{2}} = \frac{1}{1-\rho_{2}^2} \begin{bmatrix} 1 & -\rho_{2} & 0 & \cdots & 0 \\ -\rho_{2} & 1+\rho_{2}^2 & -\rho_{2} & \cdots & 0 \\ 0 & -\rho_{2} & 1+\rho_{2}^2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix} \]

\[ \mathbf Q = \begin{bmatrix} \frac{1}{(1-\rho_1^2)}\mathbf{I_{n_2}} + \mathbf{Q_{\rho_2}} & \frac{-\rho_1}{(1-\rho_1^2)}\mathbf{I_{n_2}} & \dots & \cdots & \dots \\ \frac{-\rho_1}{(1-\rho_1^2)}\mathbf{I_{n_2}} & \frac{(1+\rho_1^2)}{(1-\rho_1^2)}\mathbf{I_{n_2}} + \mathbf{Q_{\rho_2}} & \frac{-\rho_1}{(1-\rho_1^2)} \mathbf{I_{n_2}} & \cdots & \vdots \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ \dots & \dots & \cdots & \frac{-\rho_1}{(1-\rho_1^2)} \mathbf{I_{n_2}} & \frac{1}{(1-\rho_1^2)}\mathbf{I_{n_2}} + \mathbf{Q_{\rho_2}} \end{bmatrix}^{\nu + 1} \]

Eigendecomposition

Because of how \(\mathbf{Q}\) is defined [14], we know that

\[ \begin{aligned} \mathbf{Q} &= \mathbf{V}\boldsymbol{\Lambda}\mathbf{V} \\ &= (\mathbf{V_{\rho_1}} \otimes \mathbf{V_{\rho_2}})(\boldsymbol \Lambda_{\rho_1} \otimes \mathbf{I} + \mathbf{I} \otimes \boldsymbol \Lambda_{\rho_2})^{\nu + 1}(\mathbf{V_{\rho_1}} \otimes \mathbf{V_{\rho_2}})^T \end{aligned} \]

where

\[ \begin{aligned} \mathbf{Q}_{\rho_1} = \mathbf{V_{\rho_1}}\boldsymbol \Lambda_{\rho_1}\mathbf{V_{\rho_1}}^T \qquad \& \qquad \mathbf{Q}_{\rho_2} = \mathbf{V_{\rho_2}}\boldsymbol \Lambda_{\rho_2}\mathbf{V_{\rho_2}}^T \end{aligned} \]

Spectral decomposition defined by value/vector pairs of smaller matrices

\[ \left\{\lambda_{\rho_1}\right\}_i + \left\{\lambda_{\rho_2}\right\}_j \]

\[ \left\{\mathbf{v}_{\rho_1}\right\}_i \otimes \left\{\mathbf{v}_{\rho_2}\right\}_j \]

Problem: \(\boldsymbol \Sigma_{ii} = \left(\mathbf Q^{-1} \right)_{ii} \neq 1\)
Solution: \(\mathbf{\widetilde Q} = \mathbf{D}\mathbf{Q}\mathbf{D}\), where \(\mathbf D_{ii} = \sqrt{\boldsymbol \Sigma_{ii}}\)

Marginal Standard Deviations

\[ \boldsymbol \Sigma = \mathbf Q^{-1} = (\mathbf{V}\boldsymbol\Lambda\mathbf{V}^T)^{-1} = \mathbf{V}\boldsymbol \Lambda^{-1}\mathbf{V} \]

We know that if \(A = BC\) then \(A_{ii} = B_{i, .} C_{., i}\), so

\[ \boldsymbol \Sigma_{ii} = \sum_{k=1}^{n} v_{ik} \frac{1}{\lambda_k} (v^T)_{ki} = \sum_{k=1}^{n} v_{ik} \frac{1}{\lambda_k} v_{ik} = \sum_{k=1}^{n} v_{ik}^2 \frac{1}{\lambda_k} \]

Compute vector \(\boldsymbol \sigma^2\) containing all marginal variances

\[ \boldsymbol \sigma^2 = \sum_{i = 1}^{n_1} \sum_{j=1}^{n_2} \frac{\left(\left\{\mathbf{v}_{\rho_1}\right\}_i \otimes \left\{\mathbf{v}_{\rho_2}\right\}_j\right)^{2}}{\quad\left(\left\{\lambda_{\rho_1}\right\}_i + \left\{\lambda_{\rho_2}\right\}_j\right)^{\nu+1}} \]

Marginal Standard Deviations

dim1 <- 50; dim2 <- 50
rho1 <- 0.5; rho2 <- 0.3
nu <- 2

Q1 <- make_AR_prec_matrix(dim1, rho1)
Q2 <- make_AR_prec_matrix(dim2, rho2)

I1 <- Matrix::Diagonal(dim1)
I2 <- Matrix::Diagonal(dim2)

Q <- temp <- kronecker(Q1, I2) + kronecker(I1, Q2)
for (i in seq_len(nu)) Q <- Q %*% temp

msd <- function(Q1, Q2) {

  E1 <- eigen(Q1)
  E2 <- eigen(Q2)

  marginal_sd_eigen(
    E1$values, E1$vectors, dim1,
    E2$values, E2$vectors, dim2,
    nu
  ) |> 
  sort()
}

bench::mark(
  "solve" = solve(Q) |> diag() |> sqrt() |> sort(),
  "inla.qinv" = inla.qinv(Q) |> diag() |> sqrt() |> sort(),
  "marginal_sd_eigen" = msd(Q1, Q2),
  iterations = 10,
  filter_gc = FALSE 
)

# A tibble: 3 × 6
  expression             min   median `itr/sec` mem_alloc `gc/sec`
  <bch:expr>        <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
1 solve                1.27s     1.3s     0.755   78.17MB    0.755
2 inla.qinv         378.61ms  395.8ms     2.49     4.35MB    0    
3 marginal_sd_eigen   1.36ms   1.54ms   574.     649.35KB    0

Calculating the (non-copula) density

The Gaussian log pdf is \[ \log f(\mathbf{u} \vert \mathbf{Q}) \propto \frac{1}{2}\left(\log|\mathbf{Q}| - \mathbf{z}^T\mathbf{Q}\mathbf{z}\right) \]

Without scaling of \(\mathbf Q\) we get

\[ \log|\mathbf{Q}| = \sum_{k=1}^{n_1n_2}\log\lambda_k = \sum_{i=1}^{n_1}\sum_{j=2}^{n_2} \log\left[\left(\left\{\lambda_{\rho_1}\right\}_i + \left\{\lambda_{\rho_2}\right\}_j\right)^{\nu + 1}\right] \]

\[ \mathbf{z}^T\mathbf{Q}\mathbf{z} = \sum_{k=1}^{n_1n_2}\lambda_k \left(v_k^T\mathbf z\right)^2 = \sum_{i=1}^{n_1}\sum_{j=2}^{n_2} \left(\left\{\lambda_{\rho_1}\right\}_i + \left\{\lambda_{\rho_2}\right\}_j\right) \left[\left(\left\{\mathbf{v}_{\rho_1}\right\}_i \otimes \left\{\mathbf{v}_{\rho_2}\right\}_j\right)^T\mathbf z\right]^2 \]

Calculating the copula density

Let \(\mathbf v = \left\{\mathbf{v}_{\rho_1}\right\}_i \otimes \left\{\mathbf{v}_{\rho_2}\right\}_j\) and \(\lambda = \left(\left\{\lambda_{\rho_1}\right\}_i + \left\{\lambda_{\rho_2}\right\}_j\right)^{\nu + 1}\). Normalise \(\mathbf v\) and \(\lambda\) with

\[ \begin{gathered} \widetilde{\mathbf{v}} = \frac{\sigma \odot \mathbf{v}}{\vert\vert \sigma \odot\mathbf{v}\vert\vert_2}, \qquad \widetilde{\lambda} = \vert\vert \sigma \odot\mathbf{v}\vert\vert_2^2 \cdot \lambda \end{gathered} \]

Then \(\widetilde{\mathbf{v}}\) and \(\widetilde{\lambda}\) are an eigenvector/value pair of the scaled precision matrix \(\mathbf{\widetilde{Q}}\). Iterate over \(i\) and \(j\) to calculate

\[ \log c(\mathbf{u} \vert \mathbf{\widetilde{Q}}) = \frac{1}{2}\log|\mathbf{\widetilde Q}| - \frac{1}{2}\mathbf{z}^T\mathbf{\widetilde Q}\mathbf{z} + \frac{1}{2}\mathbf{z}^T\mathbf{z} \]

Folded Circulant Approximation

AR(1) precision

The exact form of \(Q_{\rho}\), the precision matrix of a one-dimensional AR(1) process with correlation \(\rho\)

\[ \mathbf{Q}_\rho = \frac{1}{1-\rho^2} \begin{bmatrix} 1 & -\rho & 0 & \cdots & 0 \\ -\rho & 1+\rho^2 & -\rho & \cdots & 0 \\ 0 & -\rho & 1+\rho^2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix} \]

Circulant Approximation

This approximation treats the first and last observations as neighbors, effectively wrapping the data around a circle. Very fast computation using FFT [4]

\[ \mathbf{Q}_\rho^{(circ)} = \frac{1}{1-\rho^2} \begin{bmatrix} 1+\rho^2 & -\rho & 0 & \cdots & 0 & -\rho \\ -\rho & 1+\rho^2 & -\rho & \cdots & 0 & 0 \\ 0 & -\rho & 1+\rho^2 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ -\rho & 0 & 0 & \cdots & -\rho & 1+\rho^2 \end{bmatrix} \]

Folded Circulant Approximation [15–16]

We double the data by reflecting it, giving us the data \(x_1, \dots, x_n, x_n, \dots, x_1\). We then model this doubled data with a \(2n \times 2n\) circulant matrix. Get fast computation like in circulant case, but better boundary conditions. Quadratic form written out as an \(n \times n\) matrix takes the form on the right.

\[ \mathbf{Q}_\rho^{(fold)} = \frac{1}{1-\rho^2} \begin{bmatrix} 1-\rho+\rho^2 & -\rho & 0 & \cdots & 0 & 0 \\ -\rho & 1+\rho^2 & -\rho & \cdots & 0 & 0 \\ 0 & -\rho & 1+\rho^2 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & -\rho & 1-\rho+\rho^2 \end{bmatrix} \]

Maximum Likelihood

Setup

library(stdmatern)
dim1 <- 50; dim2 <- 50
rho1 <- 0.9; rho2 <- 0.5
nu <- 1
n_obs <- 5
Z <- rmatern_copula_eigen(n_obs, dim1, dim2, rho1, rho2, nu)
U <- pnorm(Z)
Y <- qgev(U, loc = 6, scale = 2, shape = 0.1)

Log-likelihood

log_lik <- function(par, Y) {
  mu <- exp(par[1])
  sigma <- exp(par[2] + par[1])
  xi <- exp(par[3])
  rho1 <- plogis(par[4])
  rho2 <- plogis(par[5])
  u <- evd::pgev(Y, loc = mu, scale = sigma, shape = xi)
  z <- qnorm(u)
  ll_marg <- sum(evd::dgev(Y, loc = mu, scale = sigma, shape = xi, log = TRUE))
  ll_copula <- sum(dmatern_copula_eigen(z, dim1, dim2, rho1, rho2, nu))
  ll_copula + ll_marg
}

Optimize

tictoc::tic()
res <- optim(
  par = c(0, 0, 0, 0, 0),
  log_lik,
  control = list(fnscale = -1),
  Y = Y,
  hessian = TRUE,
  method = "L-BFGS-B"
)
tictoc::toc()

2.633 sec elapsed

Results

se <- sqrt(diag(solve(-res$hessian)))
ci <- res$par + c(-1.96, 1.96) * se

	Estimate	95% CI
	Estimate	Lower	Upper
μ	6.049	5.826	6.280
σ	2.027	1.866	2.201
ξ	0.111	0.092	0.135
ρ₁	0.901	0.896	0.907
ρ₂	0.497	0.473	0.521

Benchmark: Density Computations

Benchmarking how long it takes to evaluate the density of a Mátern(\(\nu\))-like field with correlation parameter \(\rho\), either unscaled or scaled to have unit marginal variance
	Unscaled			Scaled
Grid	Cholesky	Eigen		Eigen	Circulant		Folded
Grid	Cholesky	Time	Relative	Eigen	Time	Relative	Time	Relative
20x20	312.56µs	155.88µs	49.9%	235.59µs	36.2µs	15.4%	115.09µs	48.9%
40x40	1.77ms	543.76µs	30.7%	1.65ms	115.8µs	7.0%	300.9µs	18.3%
60x60	6.33ms	1.8ms	28.5%	7.1ms	188.48µs	2.7%	609.71µs	8.6%
80x80	17.98ms	5.17ms	28.8%	21.96ms	338.15µs	1.5%	1.26ms	5.7%
100x100	38.58ms	11.48ms	29.8%	48.44ms	445.14µs	0.9%	2.37ms	4.9%
120x120	81.1ms	22.74ms	28.0%	88.45ms	719.55µs	0.8%	2.82ms	3.2%
140x140	145.26ms	32.55ms	22.4%	168.38ms	965.71µs	0.6%	5.39ms	3.2%
160x160	233.03ms	54.51ms	23.4%	260.7ms	1.27ms	0.5%	5.33ms	2.0%
180x180	359.21ms	97.4ms	27.1%	482.93ms	1.61ms	0.3%	10.22ms	2.1%
200x200	567.01ms	147.51ms	26.0%	676.53ms	1.84ms	0.3%	8.62ms	1.3%
220x220	791.13ms	206.13ms	26.1%	994.11ms	2.59ms	0.3%	13.55ms	1.4%
240x240	1.07s	287ms	26.8%	1.34s	2.82ms	0.2%	14.77ms	1.1%
See https://bggj.is/materneigenpaper/ for a description of algorithms and https://github.com/bgautijonsson/stdmatern for implementations

Approximating the Correlation Matrix

Data Generation

tictoc::tic()
X <- rmatern_copula_folded_full(n = 100, dim1 = 400, dim2 = 180, rho1 = 0.8, rho2 = 0.9, nu = 2)
tictoc::toc()

1.317 sec elapsed

plot_matern(X[, 1], 400, 180)

plot_matern(X[, 2], 400, 180)

apply(X, 1, var) |> hist()

apply(X, 1, mean) |> hist()

Conclusion and Future Work

Key Results

Developed Matérn-like Gaussian copula for large spatial fields
Folded circulant approximation to the density
Achieved fast density computations
Viable for MCMC samplers

Future Work

Implement t-copulas
Apply to other environmental and climate datasets
Finish drafting paper
R package

PhD Committee

My thanks to my advisor and committee

Birgir Hrafnkelsson (PI)
Raphaël Huser
Stefan Siegert

References

Centre for Environmental Data Analysis, Met office hadley centre (2019): UKCP local projections on a 5km grid over the UK for 1980-2080. (n.d.).

Á. V. Johannesson, S. Siegert, R. Huser, H. Bakka, & B. Hrafnkelsson, Approximate bayesian inference for analysis of spatio-temporal flood frequency data. (2021). https://doi.org/10.48550/arXiv.1907.04763.

H. Rue, Fast sampling of Gaussian Markov random fields. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63 (2001) 325–338. https://doi.org/10.1111/1467-9868.00288.

H. Rue & L. Held, Gaussian Markov Random Fields: Theory and Applications (CRC Press, 2005).

J. Besag, Spatial interaction and the statistical analysis of lattice systems. Journal of the Royal Statistical Society. Series B (Methodological), 36 (1974) 192–236.

J. Besag, J. York, & A. Mollié, Bayesian image restoration, with two applications in spatial statistics. Annals of the Institute of Statistical Mathematics, 43 (1991) 1–20. https://doi.org/10.1007/BF00116466.

A. Riebler, S. H. Sørbye, D. Simpson, & H. Rue, An intuitive bayesian spatial model for disease mapping that accounts for scaling. (2016).

D. P. Simpson, H. Rue, T. G. Martins, A. Riebler, & S. H. Sørbye, Penalising model component complexity: A principled, practical approach to constructing priors. (2015).

M. Morris, K. Wheeler-Martin, D. Simpson, S. J. Mooney, A. Gelman, & C. DiMaggio, Bayesian hierarchical spatial models: Implementing the Besag York Mollié model in stan. Spatial and Spatio-temporal Epidemiology, 31 (2019) 100301. https://doi.org/10.1016/j.sste.2019.100301.

10.

F. Lindgren, H. Rue, & J. Lindström, An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73 (2011) 423–498. https://doi.org/10.1111/j.1467-9868.2011.00777.x.

11.

H. Rue, Marginal variances for gaussian markov random fields. (2005).

12.

Hå. Rue & S. Martino, Approximate bayesian inference for hierarchical gaussian markov random field models. Journal of Statistical Planning and Inference, 137 (2007) 3177–3192. https://doi.org/10.1016/j.jspi.2006.07.016.

13.

H. Rue, S. Martino, & N. Chopin, Approximate Bayesian inference for latent Gaussian models by using integrated nested Laplace approximations. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 71 (2009) 319–392. https://doi.org/10.1111/j.1467-9868.2008.00700.x.

14.

R. A. Horn & C. R. Johnson, Topics in matrix analysis (Cambridge: Cambridge University Press, 1991).

15.

J. T. Kent & K. V. Mardia, Spatial Analysis (John Wiley & Sons, 2022).

16.

D. Mondal, On edge correction of conditional and intrinsic autoregressions. Biometrika, 105 (2018) 447–454. https://doi.org/10.1093/biomet/asy014.