Skip to contents

Introduction

In this vignette we will introduce how to fit Whittle–Matérn fields with general smoothness based on finite element and rational approximations. The theory for this approach is provided in Bolin et al. (2023) and Bolin, Simas, and Xiong (2023). For the implementation, we make use of the rSPDE package for the rational approximations.

These models are thus implemented using finite element approximations. Such approximations are not needed for integer smoothness parameters, and for the details about the exact models we refer to the vignettes

For details on the construction of metric graphs, see Working with metric graphs

For further details on data manipulation on metric graphs, see Data manipulation on metric graphs

Constructing the graph and the mesh

We begin by loading the rSPDE and MetricGraph packages:

As an example, we consider the following metric graph

  edge1 <- rbind(c(0,0),c(1,0))
  edge2 <- rbind(c(0,0),c(0,1))
  edge3 <- rbind(c(0,1),c(-1,1))
  theta <- seq(from=pi,to=3*pi/2,length.out = 20)
  edge4 <- cbind(sin(theta),1+ cos(theta))
  edges = list(edge1, edge2, edge3, edge4)
  graph <- metric_graph$new(edges = edges)
  graph$plot()

To construct a FEM approximation of a Whittle–Matérn field with general smoothness, we must first construct a mesh on the graph.

  graph$build_mesh(h = 0.5)
  graph$plot(mesh=TRUE)

In the command build_mesh, the argument h decides the largest spacing between nodes in the mesh. As can be seen in the plot, the mesh is very coarse, so let’s reduce the value of h and rebuild the mesh:

graph$build_mesh(h = 0.01)

We are now ready to specify the model (κ2Δ)α/2τu=𝒲 (\kappa^2 - \Delta)^{\alpha/2} \tau u = \mathcal{W} for the Whittle–Matérn field uu. For this, we use the matern.operators function from the rSPDE package:

  sigma <- 1.3
  range <- 0.15
  nu <- 0.8 

  rspde.order <- 2
  op <- matern.operators(nu = nu, range = range, sigma = sigma, 
                         parameterization = "matern",
                         m = rspde.order, graph = graph)                     

As can be seen in the code, we specify κ\kappa via the practical correlation range 8ν/κ\sqrt{8\nu}/\kappa. Also, the model is not parametrized by τ,α\tau, \alpha but instead by σ,ν\sigma, \nu. Here, sigma denotes the standard deviation of the field and nu is the smoothness parameter, which is related to α\alpha via the relation α=ν+1/2\alpha = \nu + 1/2. The object op contains the matrices needed for evaluating the distribution of the stochastic weights in the FEM approximation.

Let us simulate the field uu at the mesh locations and plot the result:

u <- simulate(op)
graph$plot_function(X = u, plotly = TRUE)

If we want to evaluate u(s)u(s) at some locations s1,,sns_1,\ldots, s_n, we need to multiply the weights with the FEM basis functions φi(s)\varphi_i(s) evaluated at the locations. For this, we can construct the observation matrix A\boldsymbol{\mathrm{A}}, with elements Aij=φj(si)A_{ij} = \varphi_j(s_i), which links the FEM basis functions to the locations. This can be done by the function fem_basis in the metric graph object. To illustrate this, let us simulate some observation locations on the graph and construct the matrix:

obs.per.edge <- 100
obs.loc <- NULL
for(i in 1:graph$nE) {
  obs.loc <- rbind(obs.loc,
                   cbind(rep(i,obs.per.edge), runif(obs.per.edge)))
}
n.obs <- obs.per.edge*graph$nE
A <- graph$fem_basis(obs.loc)

In the code, we generate 100100 observation locations per edge in the graph, drawn at random. It can be noted that we assume that the observation locations are given in the format (e,d)(e, d) where ee denotes the edge of the observation and dd is the position on the edge, i.e., the relative distance from the first vertex of the edge.

To compute the precision matrix from the covariance-based rational approximation one can use the precision() method on object returned by the matern.operators() function:

  Q <- precision(op)

As an illustration of the model, let us compute the covariance function between the process at s=(2,0.1)s=(2,0.1), that is, the point at edge 2 and distance on edge 0.1, and all the other mesh points. To this end, we can use the helper function cov_function_mesh that is contained in the op object:

  c_cov <- op$cov_function_mesh(matrix(c(2,0.1),1,2))
  graph$plot_function(c_cov, plotly = TRUE)

Using the model for inference

There is built-in support for computing log-likelihood functions and performing kriging prediction in the rSPDE package which we can use for the graph model. To illustrate this, we use the simulation to create some noisy observations of the process. We generate the observations as Yi=1+2xi13xi2+u(si)+εiY_i = 1 + 2x_{i1} - 3 x_{i2} + u(s_i) + \varepsilon_i, where εiN(0,σe2)\varepsilon_i \sim N(0,\sigma_e^2) is Gaussian measurement noise, x1x_1 and x2x_2 are covariates generated the relative positions of the observations on the graph.

    sigma.e <- 0.1

    x1 <- obs.loc[,1]
    x2 <- obs.loc[,2]

    Y <- 1 + 2*x1 - 3*x2 + as.vector(A %*% u + sigma.e * rnorm(n.obs))

Let us now fit the model. To this end we will use the graph_lme() function (that, for the finite element models, acts as a wrapper for the rspde_lme() function from the rSPDE package). To this end, let us now assemble the data.frame() with the observations, the observation locations and the covariates:

df_data <- data.frame(y = Y, edge_number = obs.loc[,1],
                        distance_on_edge = obs.loc[,2],
                        x1 = x1, x2 = x2)

Let us now add the data to the graph object and plot it:

graph$add_observations(data = df_data, normalized = TRUE)
## Adding observations...
graph$plot(data = "y")

We can now fit the model. To this end, we use the graph_lme() function and set the model to 'WM’.

fit <- graph_lme(y ~ x1 + x2, graph = graph, model = "WM")

Let us obtain a summary of the model:

summary(fit)
## 
## Latent model - Whittle-Matern
## 
## Call:
## graph_lme(formula = y ~ x1 + x2, graph = graph, model = "WM")
## 
## Fixed effects:
##             Estimate Std.error z-value Pr(>|z|)    
## (Intercept)   0.4563    0.7173   0.636  0.52472    
## x1            2.1100    0.2012  10.485  < 2e-16 ***
## x2           -2.3886    0.7174  -3.330  0.00087 ***
## 
## Random effects:
##        Estimate Std.error z-value
## alpha  1.259264  0.022792  55.251
## tau    0.056383  0.005978   9.432
## kappa 15.737382  2.455055   6.410
## 
## Random effects (Matern parameterization):
##       Estimate Std.error z-value
## nu     0.75926   0.02279  33.313
## sigma  1.34595   0.13555   9.929
## range  0.15661   0.02327   6.730
## 
## Measurement error:
##          Estimate Std.error z-value
## std. dev 0.097303  0.006313   15.41
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
## 
## Log-Likelihood:  -126.9802 
## Number of function calls by 'optim' = 156
## Optimization method used in 'optim' = L-BFGS-B
## 
## Time used to:     fit the model =  29.23189 secs

We can also obtain additional information by using the function glance():

glance(fit)
## # A tibble: 1 × 9
##    nobs  sigma logLik   AIC   BIC deviance df.residual model               alpha
##   <int>  <dbl>  <dbl> <dbl> <dbl>    <dbl>       <dbl> <chr>               <dbl>
## 1   400 0.0973  -127.  268.  296.     254.         393 Covariance-Based M…  1.26

Let us compare the values of the parameters of the latent model with the true ones:

print(data.frame(sigma = c(sigma, fit$matern_coeff$random_effects[2]), 
                   range = c(range, fit$matern_coeff$random_effects[3]),
                   nu = c(nu, fit$matern_coeff$random_effects[1]),
                   row.names = c("Truth", "Estimates")))
##              sigma     range        nu
## Truth     1.300000 0.1500000 0.8000000
## Estimates 1.345954 0.1566062 0.7592644

Kriging

Given that we have estimated the parameters, let us compute the kriging predictor of the field given the observations at the mesh nodes.

We will perform kriging with the predict() method. To this end, we need to provide a data.frame containing the prediction locations, as well as the values of the covariates at the prediction locations.

  df_pred <- data.frame(edge_number = graph$mesh$VtE[,1],
                        distance_on_edge = graph$mesh$VtE[,2],
                        x1 = graph$mesh$VtE[,1],
                        x2 = graph$mesh$VtE[,2])

  u.krig <- predict(fit, newdata = df_pred, normalized = TRUE)

The estimate is shown in the following figure

  graph$plot_function(as.vector(u.krig$mean))  

We can also use the augment() function to easily plot the predictions. Let us a build a 3d plot now and add the observed values on top of the predictions:

p <- augment(fit, newdata = df_pred, normalized = TRUE) %>% 
          graph$plot_function(data = ".fitted", plotly = TRUE)

graph$plot(data = "y", p = p, plotly = TRUE)          

Fitting a model with replicates

Let us now illustrate how to simulate a data set with replicates and then fit a model to such data. To simulate a latent model with replicates, all we do is set the nsim argument to the number of replicates.

  n.rep <- 30
  u.rep <- simulate(op, nsim = n.rep)

Now, let us generate the observed values YY:

  sigma.e <- 0.3
  Y.rep <- A %*% u.rep + sigma.e * matrix(rnorm(n.obs * n.rep), ncol = n.rep)

Note that YY is a matrix with 20 columns, each column containing one replicate. We need to turn y into a vector and create an auxiliary vector repl indexing the replicates of y:

y_vec <- as.vector(Y.rep)
repl <- rep(1:n.rep, each = n.obs)                       

df_data_repl <- data.frame(y = y_vec,
                              edge_number = rep(obs.loc[,1], n.rep),
                              distance_on_edge = rep(obs.loc[,2], n.rep), 
                              repl = repl)

Let us clear the previous observations and add the new data to the graph:

graph$add_observations(data = df_data_repl, normalized = TRUE, 
                            group = "repl", clear_obs = TRUE)
## Adding observations...

We can now fit the model in the same way as before by using the rspde_lme() function. Note that we can optimize in parallel by setting parallel to TRUE. If we do not specify which replicate to consider, in the which_repl argument, all replicates will be considered.

fit_repl <- graph_lme(y ~ -1, graph = graph, model = "WM", parallel = TRUE)

Observe that we have received a warning saying that the Hessian was not positive-definite, which ended up creating NaNs for the standard errors. Indeed, let us see a summary of the fit:

summary(fit_repl)
## 
## Latent model - Whittle-Matern
## 
## Call:
## graph_lme(formula = y ~ -1, graph = graph, model = "WM", parallel = TRUE)
## 
## No fixed effects.
## 
## Random effects:
##       Estimate Std.error z-value
## alpha   1.2787       NaN     NaN
## tau     0.0548       NaN     NaN
## kappa  15.3935    0.4347   35.41
## 
## Random effects (Matern parameterization):
##       Estimate Std.error z-value
## nu    0.778739       NaN     NaN
## sigma 1.324550  0.025395   52.16
## range 0.162145  0.004925   32.92
## 
## Measurement error:
##          Estimate Std.error z-value
## std. dev 0.300945  0.002943   102.2
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
## 
## Log-Likelihood:  -9741.751 
## Number of function calls by 'optim' = 62
## Optimization method used in 'optim' = L-BFGS-B
## 
## Time used to:     fit the model =  35.73474 secs 
##   set up the parallelization = 2.96431 secs

Let us, then, follow the suggestion from the warning and refit the model setting improve_hessian to TRUE. This will obtain a more precise estimate of the Hessian, which can possibly fix this issue:

fit_repl <- graph_lme(y ~ -1, graph = graph, model = "WM", 
                      parallel = TRUE, improve_hessian = TRUE)

We see that we did not receive any warning now, and the Std. errors were computed accordingly:

summary(fit_repl)
## 
## Latent model - Whittle-Matern
## 
## Call:
## graph_lme(formula = y ~ -1, graph = graph, model = "WM", parallel = TRUE, 
##     improve_hessian = TRUE)
## 
## No fixed effects.
## 
## Random effects:
##       Estimate Std.error z-value
## alpha  1.27874   0.16963   7.538
## tau    0.05480   0.04066   1.348
## kappa 15.39349   4.32698   3.558
## 
## Random effects (Matern parameterization):
##       Estimate Std.error z-value
## nu    0.778739  0.169628   4.591
## sigma 1.324550  0.025395  52.157
## range 0.162145  0.004925  32.924
## 
## Measurement error:
##          Estimate Std.error z-value
## std. dev 0.300945  0.006309    47.7
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
## 
## Log-Likelihood:  -9741.751 
## Number of function calls by 'optim' = 62
## Optimization method used in 'optim' = L-BFGS-B
## 
## Time used to:     fit the model =  31.88367 secs 
##   compute the Hessian = 9.77825 secs 
##   set up the parallelization = 2.803 secs

Let us also take a glance of the fit:

glance(fit_repl)
## # A tibble: 1 × 9
##    nobs sigma logLik    AIC    BIC deviance df.residual model              alpha
##   <int> <dbl>  <dbl>  <dbl>  <dbl>    <dbl>       <dbl> <chr>              <dbl>
## 1 12000 0.301 -9742. 19492. 19521.   19484.       11996 Covariance-Based …  1.28

Let us compare the values of the parameters of the latent model with the true ones:

print(data.frame(sigma = c(sigma, fit_repl$matern_coeff$random_effects[2]), 
                   range = c(range, fit_repl$matern_coeff$random_effects[3]),
                   nu = c(nu, fit_repl$matern_coeff$random_effects[1]),
                   row.names = c("Truth", "Estimates")))
##             sigma     range        nu
## Truth     1.30000 0.1500000 0.8000000
## Estimates 1.32455 0.1621451 0.7787391

Let us do kriging. We will use the same prediction locations as in the previous example. Let us get prediction for replicate 10, then add the original observations on top of them:

p <- augment(fit_repl, which_repl = 10, newdata = df_pred, normalized = TRUE) %>% 
          graph$plot_function(data = ".fitted", plotly = TRUE)

graph$plot(data = "y", group = 10, plotly = TRUE, p = p)