Modeling Data-Level Spatial Dependence in Multivariate Generalized Extreme Value Distributions
University of Iceland
\[ \log f(\mathbf{y} \vert \mu, \sigma, \xi) = - n\log\sigma - \left(1 + \frac{1}{\xi}\right) \sum_{i=1}^{n}{\log\left(1 + \xi\left[\frac{y_i - \mu}{\sigma} \right]\right)} - \sum_{i=1}^{n}{\left(1 + \xi \left[ \frac{y_i - \mu}{\sigma} \right]\right)^{-1/\xi}} \]
\[ \begin{aligned} \log c(\mathbf{u}) &\propto \frac{1}{2}\log|\mathbf{Q}| - \frac{1}{2}\mathbf{z}^T\mathbf{Q}\mathbf{z} + \frac{1}{2}\mathbf{z}^T\mathbf{z} \\ u_t &= F_{\mathrm{GEV}}(y_t \vert \mu, \sigma, \xi) \\ z_t &= \Phi^{-1}(u_t) \end{aligned} \]
Sklar’s Theorem: For any multivariate distribution \(H\), there exists a unique copula \(C\) such that:
\[ H(\mathbf x) = C(F_1(x_1), \dots, F_d(x_d)) \]
where \(F_i\) are marginal distributions.
We can also write this as a (log) density
\[ \begin{aligned} h(\mathbf x) &= c(F_1(x_1), \dots, F_d(x_d)) \prod_{i=1}^d f_i(x_i) \\ \log h(\mathbf x) &= \log c\left(F_1(x_1), \dots, F_d(x_d)\right) + \sum_{i=1}^d \log f_i(x_i) \end{aligned} \]
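As a quick sanity check of this decomposition, here is a minimal R sketch (using the mvtnorm package) with a bivariate Gaussian copula and standard normal margins, so that the joint density is simply a bivariate normal:

library(mvtnorm)
rho <- 0.7
Sigma <- matrix(c(1, rho, rho, 1), 2, 2)
x <- c(0.3, -1.2)
u <- pnorm(x)                                # u_i = F_i(x_i)
z <- qnorm(u)                                # equals x here, since the margins are N(0, 1)
log_c <- dmvnorm(z, sigma = Sigma, log = TRUE) - sum(dnorm(z, log = TRUE))
log_h <- log_c + sum(dnorm(x, log = TRUE))   # log copula + sum of log margins
all.equal(log_h, dmvnorm(x, sigma = Sigma, log = TRUE))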
data {
  int n_obs;
  vector[n_obs] y;
}
transformed data {
  real min_y = min(y);
}
parameters {
  real<lower = -1, upper = 1> rho;
  real<lower = 0> sigma;
  real<lower = 0> xi;
  // Upper bound keeps every observation inside the GEV support when xi > 0
  real<lower = 0, upper = min_y + sigma / xi> mu;
}
model {
  vector[n_obs] U;
  // GEV marginals (gev_lpdf and gev_cdf are user-defined functions)
  target += gev_lpdf(y | mu, sigma, xi);
  for (i in 1:n_obs) {
    U[i] = gev_cdf(y[i] | mu, sigma, xi);
  }
  // Gaussian AR(1) copula on the probability-integral transforms
  target += normal_copula_ar1_lpdf(U | rho);
  // Priors
  target += std_normal_lpdf(rho);
  target += exponential_lpdf(sigma | 1);
  target += exponential_lpdf(xi | 1);
}
real normal_ar1_lpdf(vector x, real rho) {
  int N = num_elements(x);
  real out;
  // 0.5 * log|Q| for the AR(1) precision matrix
  real log_det = - (N - 1) * (log(1 + rho) + log(1 - rho)) / 2;
  vector[N] q;
  real scl = sqrt(1 / (1 - rho^2));
  // Whitening transform: dot_self(q) equals the quadratic form x' * Q * x
  q[1:(N - 1)] = scl * (x[1:(N - 1)] - rho * x[2:N]);
  q[N] = x[N];
  out = log_det - dot_self(q) / 2;
  return out;
}

real normal_copula_ar1_lpdf(vector U, real rho) {
  int N = rows(U);
  vector[N] Z = inv_Phi(U);
  // Copula density: joint Gaussian divided by its standard normal margins
  return normal_ar1_lpdf(Z | rho) + dot_self(Z) / 2;
}
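As a sketch (not part of the poster code), the same trick can be checked in R against a dense multivariate normal with AR(1) correlation, using the mvtnorm package:

library(mvtnorm)
n <- 5; rho <- 0.6
Sigma <- rho^abs(outer(1:n, 1:n, "-"))              # AR(1) correlation matrix
x <- as.numeric(rmvnorm(1, sigma = Sigma))
scl <- 1 / sqrt(1 - rho^2)
q <- c(scl * (x[1:(n - 1)] - rho * x[2:n]), x[n])   # whitened residuals, as in normal_ar1_lpdf
log_det <- -(n - 1) * log(1 - rho^2) / 2            # 0.5 * log|Q|
ll_sparse <- log_det - sum(q^2) / 2
ll_dense <- dmvnorm(x, sigma = Sigma, log = TRUE) + n * log(2 * pi) / 2
all.equal(ll_sparse, ll_dense)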
\[ \log f(\mathbf{x}) \propto \frac{1}{2}\left(\log |\mathbf{Q}| - \mathbf{x}^T\mathbf{Q}\mathbf{x}\right) \]
\[ \begin{aligned} \mathbf{x} &\sim N(\mathbf{0}, \tau \mathbf{Q}) \\ \mathbf{Q} &= \mathbf{D}\left(\mathbf{I} - \alpha \mathbf{A} \right) \end{aligned} \]
\[ \begin{aligned} \mathbf{x} &\sim N(\mathbf{0}, \tau \mathbf{Q}) \\ \mathbf{Q} &= \mathbf{D} - \mathbf{A} \end{aligned} \]
\[ \begin{aligned} \mathbf{x} &= \mathbf{u} + \mathbf{v} \\ \mathbf{u} &\sim \mathrm{ICAR}(\tau_u) \\ \mathbf{v} &\sim N(\mathbf{0}, \tau_v^{-1}) \end{aligned} \]
\[ \begin{aligned} \mathbf{x} &= \left(\left(\sqrt{\rho/s}\right)\mathbf{u} + \left(\sqrt{1 - \rho}\right) \mathbf{v} \right)\sigma \\ \mathbf{u} &\sim \mathrm{ICAR}(1) \\ \mathbf{v} &\sim N(\mathbf{0}, \mathbf{I}_n) \end{aligned} \]
\[ \begin{gathered} \log h(\mathbf x) = \log c\left(F_1(x_1), \dots, F_d(x_d)\right) + \sum_{i=1}^d \log f_i(x_i) \end{gathered} \]
\[ \begin{aligned} \log h(\mathbf x) &= \log c(u_1, \dots, u_d) \\ &+ \sum_{i=1}^d \log f_{\mathrm{GEV}}(x_i \vert \mu_i, \sigma_i, \xi_i) \\ u_i &= F_{\mathrm{GEV}}(x_i \vert \mu_i, \sigma_i, \xi_i) \end{aligned} \]
\[ \begin{aligned} \log c(\mathbf u) &\propto \frac{1}{2}\left(\log |\mathbf{Q}| - \mathbf{z}^T\mathbf{Q}\mathbf{z} + \mathbf{z}^T\mathbf{z}\right) \\ \mathbf{z} &= \Phi^{-1}(\mathbf u) \end{aligned} \]
\(\mathbf Q\) is defined as the Kronecker sum of two AR(1) precision matrices, similar to [10]:
\[ \mathbf{Q} = \left( \mathbf{Q}_{\rho_1} \otimes \mathbf{I_{n_2}} + \mathbf{I_{n_1}} \otimes \mathbf{Q}_{\rho_2} \right)^{\nu + 1}, \quad \nu \in \{0, 1, 2\} \]
\[ \mathbf{Q}_{\rho_{1}} = \frac{1}{1-\rho_{1}^2} \begin{bmatrix} 1 & -\rho_{1} & 0 & \cdots & 0 \\ -\rho_{1} & 1+\rho_{1}^2 & -\rho_{1} & \cdots & 0 \\ 0 & -\rho_{1} & 1+\rho_{1}^2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix} \]
\[ \mathbf{Q}_{\rho_{2}} = \frac{1}{1-\rho_{2}^2} \begin{bmatrix} 1 & -\rho_{2} & 0 & \cdots & 0 \\ -\rho_{2} & 1+\rho_{2}^2 & -\rho_{2} & \cdots & 0 \\ 0 & -\rho_{2} & 1+\rho_{2}^2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix} \]
\[ \mathbf Q = \begin{bmatrix} \frac{1}{(1-\rho_1^2)}\mathbf{I_{n_2}} + \mathbf{Q_{\rho_2}} & \frac{-\rho_1}{(1-\rho_1^2)}\mathbf{I_{n_2}} & \dots & \cdots & \dots \\ \frac{-\rho_1}{(1-\rho_1^2)}\mathbf{I_{n_2}} & \frac{(1+\rho_1^2)}{(1-\rho_1^2)}\mathbf{I_{n_2}} + \mathbf{Q_{\rho_2}} & \frac{-\rho_1}{(1-\rho_1^2)} \mathbf{I_{n_2}} & \cdots & \vdots \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ \dots & \dots & \cdots & \frac{-\rho_1}{(1-\rho_1^2)} \mathbf{I_{n_2}} & \frac{1}{(1-\rho_1^2)}\mathbf{I_{n_2}} + \mathbf{Q_{\rho_2}} \end{bmatrix}^{\nu + 1} \]
Because of how \(\mathbf{Q}\) is defined [14], we know that
\[ \begin{aligned} \mathbf{Q} &= \mathbf{V}\boldsymbol{\Lambda}\mathbf{V}^T \\ &= (\mathbf{V_{\rho_1}} \otimes \mathbf{V_{\rho_2}})(\boldsymbol \Lambda_{\rho_1} \otimes \mathbf{I} + \mathbf{I} \otimes \boldsymbol \Lambda_{\rho_2})^{\nu + 1}(\mathbf{V_{\rho_1}} \otimes \mathbf{V_{\rho_2}})^T \end{aligned} \]
where
\[ \begin{aligned} \mathbf{Q}_{\rho_1} = \mathbf{V_{\rho_1}}\boldsymbol \Lambda_{\rho_1}\mathbf{V_{\rho_1}}^T \qquad \& \qquad \mathbf{Q}_{\rho_2} = \mathbf{V_{\rho_2}}\boldsymbol \Lambda_{\rho_2}\mathbf{V_{\rho_2}}^T \end{aligned} \]
The spectral decomposition of \(\mathbf Q\) is built from the eigenvalue/eigenvector pairs of the two smaller matrices: its eigenvalues are the pairwise sums, raised to the power \(\nu + 1\),
\[ \left(\left\{\lambda_{\rho_1}\right\}_i + \left\{\lambda_{\rho_2}\right\}_j\right)^{\nu + 1} \]
and its eigenvectors are the Kronecker products
\[ \left\{\mathbf{v}_{\rho_1}\right\}_i \otimes \left\{\mathbf{v}_{\rho_2}\right\}_j \]
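A small R sketch (with a hypothetical ar1_precision helper, not the stdmatern code) confirms that the eigenvalues of the Kronecker sum are the pairwise sums of the eigenvalues of the two small matrices:

ar1_precision <- function(n, rho) {
  Q <- diag(c(1, rep(1 + rho^2, n - 2), 1))
  Q[cbind(1:(n - 1), 2:n)] <- -rho
  Q[cbind(2:n, 1:(n - 1))] <- -rho
  Q / (1 - rho^2)
}
n1 <- 4; n2 <- 3
Q1 <- ar1_precision(n1, 0.5); Q2 <- ar1_precision(n2, 0.3)
Q <- kronecker(Q1, diag(n2)) + kronecker(diag(n1), Q2)
e1 <- eigen(Q1, symmetric = TRUE); e2 <- eigen(Q2, symmetric = TRUE)
lambda_sum <- sort(as.numeric(outer(e1$values, e2$values, "+")))
all.equal(lambda_sum, sort(eigen(Q, symmetric = TRUE)$values))
# corresponding eigenvectors: kronecker(e1$vectors[, i], e2$vectors[, j]);
# raising the summed eigenvalues to nu + 1 gives those of Q^(nu + 1), same eigenvectors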
\[ \boldsymbol \Sigma = \mathbf Q^{-1} = (\mathbf{V}\boldsymbol\Lambda\mathbf{V}^T)^{-1} = \mathbf{V}\boldsymbol \Lambda^{-1}\mathbf{V}^T \]
We know that if \(A = BC\) then \(A_{ii} = B_{i, .} C_{., i}\), so
\[ \boldsymbol \Sigma_{ii} = \sum_{k=1}^{n} v_{ik} \frac{1}{\lambda_k} (v^T)_{ki} = \sum_{k=1}^{n} v_{ik} \frac{1}{\lambda_k} v_{ik} = \sum_{k=1}^{n} v_{ik}^2 \frac{1}{\lambda_k} \]
We can thus compute the vector \(\boldsymbol \sigma^2\) of all marginal variances without forming \(\mathbf Q^{-1}\):
\[ \boldsymbol \sigma^2 = \sum_{i = 1}^{n_1} \sum_{j=1}^{n_2} \frac{\left(\left\{\mathbf{v}_{\rho_1}\right\}_i \otimes \left\{\mathbf{v}_{\rho_2}\right\}_j\right)^{2}}{\quad\left(\left\{\lambda_{\rho_1}\right\}_i + \left\{\lambda_{\rho_2}\right\}_j\right)^{\nu+1}} \]
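This sum can be evaluated directly from the two small eigendecompositions. Below is a minimal R sketch of the idea, a stand-in for the marginal_sd_eigen / msd function benchmarked next, not the actual stdmatern implementation:

marginal_sd_sketch <- function(Q1, Q2, nu = 0) {
  e1 <- eigen(Q1, symmetric = TRUE)
  e2 <- eigen(Q2, symmetric = TRUE)
  sigma2 <- 0
  for (i in seq_along(e1$values)) {
    for (j in seq_along(e2$values)) {
      v <- kronecker(e1$vectors[, i], e2$vectors[, j])
      sigma2 <- sigma2 + v^2 / (e1$values[i] + e2$values[j])^(nu + 1)
    }
  }
  sqrt(sigma2)   # marginal standard deviations
}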
bench::mark(
  "solve" = solve(Q) |> diag() |> sqrt() |> sort(),
  "inla.qinv" = inla.qinv(Q) |> diag() |> sqrt() |> sort(),
  "marginal_sd_eigen" = msd(Q1, Q2),
  iterations = 10,
  filter_gc = FALSE
)
# A tibble: 3 × 6
expression min median `itr/sec` mem_alloc `gc/sec`
<bch:expr> <bch:tm> <bch:tm> <dbl> <bch:byt> <dbl>
1 solve 1.16s 1.17s 0.838 78.15MB 0.670
2 inla.qinv 377.65ms 382.19ms 2.52 4.35MB 0.252
3 marginal_sd_eigen 1.35ms 1.46ms 597. 647.36KB 0
The Gaussian log pdf is \[ \log f(\mathbf{z} \vert \mathbf{Q}) \propto \frac{1}{2}\left(\log|\mathbf{Q}| - \mathbf{z}^T\mathbf{Q}\mathbf{z}\right) \]
Without scaling of \(\mathbf Q\) we get
\[ \log|\mathbf{Q}| = \sum_{k=1}^{n_1n_2}\log\lambda_k = \sum_{i=1}^{n_1}\sum_{j=1}^{n_2} \log\left[\left(\left\{\lambda_{\rho_1}\right\}_i + \left\{\lambda_{\rho_2}\right\}_j\right)^{\nu + 1}\right] \]
\[ \mathbf{z}^T\mathbf{Q}\mathbf{z} = \sum_{k=1}^{n_1n_2}\lambda_k \left(\mathbf{v}_k^T\mathbf z\right)^2 = \sum_{i=1}^{n_1}\sum_{j=1}^{n_2} \left(\left\{\lambda_{\rho_1}\right\}_i + \left\{\lambda_{\rho_2}\right\}_j\right)^{\nu + 1} \left[\left(\left\{\mathbf{v}_{\rho_1}\right\}_i \otimes \left\{\mathbf{v}_{\rho_2}\right\}_j\right)^T\mathbf z\right]^2 \]
Let \(\mathbf v = \left\{\mathbf{v}_{\rho_1}\right\}_i \otimes \left\{\mathbf{v}_{\rho_2}\right\}_j\) and \(\lambda = \left(\left\{\lambda_{\rho_1}\right\}_i + \left\{\lambda_{\rho_2}\right\}_j\right)^{\nu + 1}\). Normalise \(\mathbf v\) and \(\lambda\) with
\[ \begin{gathered} \widetilde{\mathbf{v}} = \frac{\sigma \odot \mathbf{v}}{\vert\vert \sigma \odot\mathbf{v}\vert\vert_2}, \qquad \widetilde{\lambda} = \vert\vert \sigma \odot\mathbf{v}\vert\vert_2^2 \cdot \lambda \end{gathered} \]
Then \(\widetilde{\mathbf{v}}\) and \(\widetilde{\lambda}\) are an eigenvector/value pair of the scaled precision matrix \(\mathbf{\widetilde{Q}}\). Iterate over \(i\) and \(j\) to calculate
\[ \log c(\mathbf{u} \vert \mathbf{\widetilde{Q}}) = \frac{1}{2}\log|\mathbf{\widetilde Q}| - \frac{1}{2}\mathbf{z}^T\mathbf{\widetilde Q}\mathbf{z} + \frac{1}{2}\mathbf{z}^T\mathbf{z} \]
The exact form of \(\mathbf{Q}_{\rho}\), the precision matrix of a one-dimensional AR(1) process with correlation \(\rho\), is
\[ \mathbf{Q}_\rho = \frac{1}{1-\rho^2} \begin{bmatrix} 1 & -\rho & 0 & \cdots & 0 \\ -\rho & 1+\rho^2 & -\rho & \cdots & 0 \\ 0 & -\rho & 1+\rho^2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix} \]
The circulant approximation treats the first and last observations as neighbors, effectively wrapping the data around a circle. This allows very fast computation using the FFT [4].
\[ \mathbf{Q}_\rho^{(circ)} = \frac{1}{1-\rho^2} \begin{bmatrix} 1+\rho^2 & -\rho & 0 & \cdots & 0 & -\rho \\ -\rho & 1+\rho^2 & -\rho & \cdots & 0 & 0 \\ 0 & -\rho & 1+\rho^2 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ -\rho & 0 & 0 & \cdots & -\rho & 1+\rho^2 \end{bmatrix} \]
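The speed comes from the fact that the eigenvalues of a circulant matrix are the discrete Fourier transform of its first column. A small R sketch (not part of the poster code):

n <- 8; rho <- 0.5
base <- c(1 + rho^2, -rho, rep(0, n - 3), -rho) / (1 - rho^2)
Q_circ <- sapply(1:n, function(j) base[((1:n - j) %% n) + 1])   # build the circulant explicitly
lambda_fft <- Re(fft(base))                                     # eigenvalues via the FFT
all.equal(sort(lambda_fft), sort(eigen(Q_circ, symmetric = TRUE)$values))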
We double the data by reflecting it, giving the sequence \(x_1, \dots, x_n, x_n, \dots, x_1\), and model the doubled data with a \(2n \times 2n\) circulant precision matrix. This retains the fast FFT computation of the circulant case, but with better boundary conditions. Written out as an \(n \times n\) matrix, the implied quadratic form is
\[ \mathbf{Q}_\rho^{(fold)} = \frac{1}{1-\rho^2} \begin{bmatrix} 1-\rho+\rho^2 & -\rho & 0 & \cdots & 0 & 0 \\ -\rho & 1+\rho^2 & -\rho & \cdots & 0 & 0 \\ 0 & -\rho & 1+\rho^2 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & -\rho & 1-\rho+\rho^2 \end{bmatrix} \]
real matern_copula_exact_lpdf(matrix Z, int dim1, real rho1, int dim2, real rho2, int nu) {
  int n_obs = cols(Z);
  int D = dim1 * dim2;
  // Eigendecompositions of the two small AR(1) precision matrices
  tuple(matrix[dim1, dim1], vector[dim1]) E1 = ar1_precision_eigen(dim1, rho1);
  tuple(matrix[dim2, dim2], vector[dim2]) E2 = ar1_precision_eigen(dim2, rho2);
  real log_det = 0;
  real quadform_sum = 0;
  vector[D] marginal_sds = marginal_sd(E1, E2, nu);
  for (i in 1:dim1) {
    for (j in 1:dim2) {
      vector[D] v = kronecker(E1.1[, i], E2.1[, j]);
      // Rescale each eigenpair so the implied covariance has unit marginal variances
      v = v .* marginal_sds;
      real norm_v = sqrt(sum(square(v)));
      v /= norm_v;
      real lambda = pow(E1.2[i] + E2.2[j], nu + 1) * square(norm_v);
      log_det += log(lambda);
      row_vector[n_obs] q = v' * Z;
      quadform_sum += dot_self(q) * lambda;
    }
  }
  real z_squared = sum(columns_dot_self(Z));
  return -0.5 * (quadform_sum - n_obs * log_det - z_squared);
}
vector matern_cond_loglik(vector y, int dim1, real rho1, int dim2, real rho2, int nu) {
  int D = dim1 * dim2;
  tuple(matrix[dim1, dim1], vector[dim1]) E1 = ar1_precision_eigen(dim1, rho1);
  tuple(matrix[dim2, dim2], vector[dim2]) E2 = ar1_precision_eigen(dim2, rho2);
  vector[D] marginal_sds = marginal_sd(E1, E2, nu);
  vector[D] g = rep_vector(0, D);          // g = Q * y
  vector[D] tau_tilde = rep_vector(0, D);  // tau_tilde = diag(Q)
  for (i in 1:dim1) {
    for (j in 1:dim2) {
      vector[D] v = kronecker(E1.1[, i], E2.1[, j]);
      v = v .* marginal_sds;
      real norm_v = sqrt(sum(square(v)));
      v /= norm_v;
      real lambda = pow(E1.2[i] + E2.2[j], nu + 1) * square(norm_v);
      g += lambda * dot_product(v, y) * v;
      tau_tilde += square(v) * lambda;
    }
  }
  // GMRF full conditionals: y_d | y_{-d} ~ N(y_d - g_d / tau_d, 1 / tau_d)
  vector[D] loglik;
  for (d in 1:D) {
    loglik[d] = normal_lpdf(y[d] | y[d] - g[d] / tau_tilde[d], inv_sqrt(tau_tilde[d]));
  }
  return loglik;
}
vector fold_data(vector x, int n1, int n2) {
  vector[4 * n1 * n2] folded;
  for (i in 1:n1) {
    for (j in 1:n2) {
      int idx = (i - 1) * n2 + j;
      // Original block and its three reflections on the doubled grid
      folded[(i - 1) * 2 * n2 + j] = x[idx];
      folded[(i - 1) * 2 * n2 + (2 * n2 - j + 1)] = x[idx];
      folded[(2 * n1 - i) * 2 * n2 + j] = x[idx];
      folded[(2 * n1 - i) * 2 * n2 + (2 * n2 - j + 1)] = x[idx];
    }
  }
  return folded;
}
complex_matrix create_base_matrix_and_rescale_eigenvalues(int dim1, int dim2, real rho1, real rho2, int nu) {
  matrix[dim2, dim1] c = make_base_matrix(dim1, dim2, rho1, rho2);
  // Compute the eigenvalues and marginal standard deviation
  complex_matrix[dim2, dim1] eigs = pow(fft2(c), (nu + 1.0));
  complex_matrix[dim2, dim1] inv_eigs = pow(eigs, -1);
  real mvar = get_real(inv_fft2(inv_eigs)[1, 1]);
  // Rescale so that the implied covariance has unit diagonal
  eigs *= mvar;
  return eigs;
}
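For intuition, the same rescaling can be written in a few lines of R. The base-matrix construction below is an assumption about what make_base_matrix produces (the first column of the block-circulant precision, reshaped to dim2 x dim1), not the actual stdmatern code:

dim1 <- 4; dim2 <- 3; rho1 <- 0.5; rho2 <- 0.3; nu <- 1
base <- matrix(0, dim2, dim1)
base[1, 1] <- (1 + rho1^2) / (1 - rho1^2) + (1 + rho2^2) / (1 - rho2^2)
base[2, 1] <- -rho2 / (1 - rho2^2); base[dim2, 1] <- -rho2 / (1 - rho2^2)
base[1, 2] <- -rho1 / (1 - rho1^2); base[1, dim1] <- -rho1 / (1 - rho1^2)
eigs <- Re(fft(base))^(nu + 1)                                  # eigenvalues of Q^(nu + 1) via 2-D FFT
mvar <- Re(fft(1 / eigs, inverse = TRUE))[1, 1] / length(eigs)  # common marginal variance
eigs_scaled <- eigs * mvar                                      # eigenvalues of the scaled precision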
real matern_folded_copula_lpdf(matrix Z, int dim1, int dim2, real rho1, real rho2, int nu) {
  int n_obs = cols(Z);
  complex_matrix[2 * dim2, 2 * dim1] eigs = create_base_matrix_and_rescale_eigenvalues(2 * dim1, 2 * dim2, rho1, rho2, nu);
  real quad_forms = 0;
  real log_det = sum(log(get_real(eigs)));
  for (i in 1:n_obs) {
    // Reflect each observation onto the doubled grid, then apply Q via the FFT
    vector[4 * dim1 * dim2] Z_fold = fold_data(Z[, i], dim1, dim2);
    vector[4 * dim1 * dim2] Qz = matvec_prod(eigs, Z_fold);
    quad_forms += dot_product(Z_fold, Qz) - dot_self(Z_fold);
  }
  return - 0.5 * (quad_forms - n_obs * log_det);
}
Setup
Log-likelihood
log_lik <- function(par, Y) {
  # Transform unconstrained parameters to their natural scales
  mu <- exp(par[1])
  sigma <- exp(par[2] + par[1])
  xi <- exp(par[3])
  rho1 <- plogis(par[4])
  rho2 <- plogis(par[5])
  # GEV marginals and their probability-integral transforms
  u <- evd::pgev(Y, loc = mu, scale = sigma, shape = xi)
  z <- qnorm(u)
  ll_marg <- sum(evd::dgev(Y, loc = mu, scale = sigma, shape = xi, log = TRUE))
  # Matern-like Gaussian copula on the transformed data
  ll_copula <- sum(dmatern_copula_eigen(z, dim1, dim2, rho1, rho2, nu))
  ll_copula + ll_marg
}
Optimize
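A hedged sketch of how log_lik could be maximised with base R's optim (starting values and method are assumptions, not the actual call used; Y, dim1, dim2 and nu are taken from the setup above):

init <- c(0, 0, 0, 0, 0)   # log(mu), log(sigma/mu), log(xi), logit(rho1), logit(rho2)
fit <- optim(
  par = init,
  fn = log_lik,
  Y = Y,
  method = "BFGS",
  control = list(fnscale = -1)   # maximise instead of minimise
)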
| Grid | Cholesky (unscaled) | Eigen (unscaled) | Rel. to Cholesky | Eigen (scaled) | Circulant (scaled) | Rel. to Eigen | Folded (scaled) | Rel. to Eigen |
|---|---|---|---|---|---|---|---|---|
| 20x20 | 312.56µs | 155.88µs | 49.9% | 235.59µs | 36.2µs | 15.4% | 115.09µs | 48.9% |
| 40x40 | 1.77ms | 543.76µs | 30.7% | 1.65ms | 115.8µs | 7.0% | 300.9µs | 18.3% |
| 60x60 | 6.33ms | 1.8ms | 28.5% | 7.1ms | 188.48µs | 2.7% | 609.71µs | 8.6% |
| 80x80 | 17.98ms | 5.17ms | 28.8% | 21.96ms | 338.15µs | 1.5% | 1.26ms | 5.7% |
| 100x100 | 38.58ms | 11.48ms | 29.8% | 48.44ms | 445.14µs | 0.9% | 2.37ms | 4.9% |
| 120x120 | 81.1ms | 22.74ms | 28.0% | 88.45ms | 719.55µs | 0.8% | 2.82ms | 3.2% |
| 140x140 | 145.26ms | 32.55ms | 22.4% | 168.38ms | 965.71µs | 0.6% | 5.39ms | 3.2% |
| 160x160 | 233.03ms | 54.51ms | 23.4% | 260.7ms | 1.27ms | 0.5% | 5.33ms | 2.0% |
| 180x180 | 359.21ms | 97.4ms | 27.1% | 482.93ms | 1.61ms | 0.3% | 10.22ms | 2.1% |
| 200x200 | 567.01ms | 147.51ms | 26.0% | 676.53ms | 1.84ms | 0.3% | 8.62ms | 1.3% |
| 220x220 | 791.13ms | 206.13ms | 26.1% | 994.11ms | 2.59ms | 0.3% | 13.55ms | 1.4% |
| 240x240 | 1.07s | 287ms | 26.8% | 1.34s | 2.82ms | 0.2% | 14.77ms | 1.1% |
See https://bggj.is/materneigenpaper/ for a description of the algorithms and https://github.com/bgautijonsson/stdmatern for implementations.
My thanks to my advisor and committee