Whittle--Matérn fields with general smoothness
David Bolin, Alexandre B. Simas
Created: 2022-11-23. Last modified: 2024-11-05.
Source:vignettes/fem_models.Rmd
fem_models.Rmd
Introduction
In this vignette we will introduce how to fit Whittle–Matérn fields
with general smoothness based on finite element and rational
approximations. The theory for this approach is provided in Bolin et
al. (2023) and Bolin, Simas, and Xiong (2023). For the
implementation, we make use of the rSPDE
package for the rational approximations.
These models are thus implemented using finite element approximations. Such approximations are not needed for integer smoothness parameters, and for the details about the exact models we refer to the vignettes
For details on the construction of metric graphs, see Working with metric graphs
For further details on data manipulation on metric graphs, see Data manipulation on metric graphs
Constructing the graph and the mesh
We begin by loading the rSPDE
and
MetricGraph
packages:
As an example, we consider the following metric graph
edge1 <- rbind(c(0,0),c(1,0))
edge2 <- rbind(c(0,0),c(0,1))
edge3 <- rbind(c(0,1),c(-1,1))
theta <- seq(from=pi,to=3*pi/2,length.out = 20)
edge4 <- cbind(sin(theta),1+ cos(theta))
edges = list(edge1, edge2, edge3, edge4)
graph <- metric_graph$new(edges = edges)
graph$plot()
To construct a FEM approximation of a Whittle–Matérn field with general smoothness, we must first construct a mesh on the graph.
graph$build_mesh(h = 0.1)
graph$plot(mesh=TRUE)
In the command build_mesh
, the argument h
decides the largest spacing between nodes in the mesh. As can be seen in
the plot, the mesh is very coarse, so let’s reduce the value of
h
and rebuild the mesh:
graph$build_mesh(h = 0.01)
We are now ready to specify the model
for the Whittle–Matérn field
.
For this, we use the matern.operators
function from the
rSPDE
package:
sigma <- 1.3
range <- 0.15
nu <- 0.8
rspde.order <- 2
op <- matern.operators(nu = nu, range = range, sigma = sigma,
parameterization = "matern",
m = rspde.order, graph = graph)
As can be seen in the code, we specify
via the practical correlation range
.
Also, the model is not parametrized by
but instead by
.
Here, sigma
denotes the standard deviation of the field and
nu
is the smoothness parameter, which is related to
via the relation
.
The object op
contains the matrices needed for evaluating
the distribution of the stochastic weights in the FEM approximation.
Let us simulate the field at the mesh locations and plot the result:
u <- simulate(op)
graph$plot_function(X = u, type = "plotly")
If we want to evaluate
at some locations
,
we need to multiply the weights with the FEM basis functions
evaluated at the locations. For this, we can construct the observation
matrix
,
with elements
,
which links the FEM basis functions to the locations. This can be done
by the function fem_basis
in the metric graph object. To
illustrate this, let us simulate some observation locations on the graph
and construct the matrix:
obs.per.edge <- 100
obs.loc <- NULL
for(i in 1:graph$nE) {
obs.loc <- rbind(obs.loc,
cbind(rep(i,obs.per.edge), runif(obs.per.edge)))
}
n.obs <- obs.per.edge*graph$nE
A <- graph$fem_basis(obs.loc)
In the code, we generate observation locations per edge in the graph, drawn at random. It can be noted that we assume that the observation locations are given in the format where denotes the edge of the observation and is the position on the edge, i.e., the relative distance from the first vertex of the edge.
To compute the precision matrix from the covariance-based rational
approximation one can use the precision()
method on object
returned by the matern.operators()
function:
Q <- precision(op)
As an illustration of the model, let us compute the covariance
function between the process at
,
that is, the point at edge 2 and distance on edge 0.1, and all the other
mesh points. To this end, we can use the helper function
cov_function_mesh
that is contained in the op
object:
Using the model for inference
There is built-in support for computing log-likelihood functions and
performing kriging prediction in the rSPDE
package which we
can use for the graph model. To illustrate this, we use the simulation
to create some noisy observations of the process. We generate the
observations as
,
where
is Gaussian measurement noise,
and
are covariates generated the relative positions of the observations on
the graph.
sigma.e <- 0.1
x1 <- obs.loc[,1]
x2 <- obs.loc[,2]
Y <- 1 + 2*x1 - 3*x2 + as.vector(A %*% u + sigma.e * rnorm(n.obs))
Let us now fit the model. To this end we will use the
graph_lme()
function (that, for the finite element models,
acts as a wrapper for the rspde_lme()
function from the
rSPDE
package). To this end, let us now assemble the
data.frame()
with the observations, the observation
locations and the covariates:
df_data <- data.frame(y = Y, edge_number = obs.loc[,1],
distance_on_edge = obs.loc[,2],
x1 = x1, x2 = x2)
Let us now add the data to the graph object and plot it:
graph$add_observations(data = df_data, normalized = TRUE)
## Adding observations...
## list()
graph$plot(data = "y")
We can now fit the model. To this end, we use the
graph_lme()
function and set the model to
'WM
’.
fit <- graph_lme(y ~ x1 + x2, graph = graph, model = "WM")
Let us obtain a summary of the model:
summary(fit)
##
## Latent model - Whittle-Matern
##
## Call:
## graph_lme(formula = y ~ x1 + x2, graph = graph, model = "WM")
##
## Fixed effects:
## Estimate Std.error z-value Pr(>|z|)
## (Intercept) 0.4326 0.7208 0.600 0.54843
## x1 2.1130 0.2021 10.456 < 2e-16 ***
## x2 -2.3626 0.7190 -3.286 0.00102 **
##
## Random effects:
## Estimate Std.error z-value
## alpha 1.254255 0.023625 53.091
## tau 0.057719 0.006288 9.179
## kappa 15.560492 2.454061 6.341
##
## Random effects (Matern parameterization):
## Estimate Std.error z-value
## nu 0.75426 0.02362 31.927
## sigma 1.34735 0.13608 9.901
## range 0.15786 0.02365 6.676
##
## Measurement error:
## Estimate Std.error z-value
## std. dev 0.097240 0.006306 15.42
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log-Likelihood: -126.9842
## Number of function calls by 'optim' = 183
## Optimization method used in 'optim' = L-BFGS-B
##
## Time used to: fit the model = 33.3947 secs
We can also obtain additional information by using the function
glance()
:
glance(fit)
## # A tibble: 1 × 9
## nobs sigma logLik AIC BIC deviance df.residual model alpha
## <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl>
## 1 400 0.0972 -127. 268. 296. 254. 393 Covariance-Based M… 1.25
Let us compare the values of the parameters of the latent model with the true ones:
print(data.frame(sigma = c(sigma, fit$matern_coeff$random_effects[2]),
range = c(range, fit$matern_coeff$random_effects[3]),
nu = c(nu, fit$matern_coeff$random_effects[1]),
row.names = c("Truth", "Estimates")))
## sigma range nu
## Truth 1.300000 0.1500000 0.8000000
## Estimates 1.347354 0.1578632 0.7542551
Kriging
Given that we have estimated the parameters, let us compute the kriging predictor of the field given the observations at the mesh nodes.
We will perform kriging with the predict()
method. To
this end, we need to provide a data.frame
containing the
prediction locations, as well as the values of the covariates at the
prediction locations.
df_pred <- data.frame(edge_number = graph$mesh$VtE[,1],
distance_on_edge = graph$mesh$VtE[,2],
x1 = graph$mesh$VtE[,1],
x2 = graph$mesh$VtE[,2])
u.krig <- predict(fit, newdata = df_pred, normalized = TRUE)
The estimate is shown in the following figure
graph$plot_function(as.vector(u.krig$mean))
We can also use the augment()
function to easily plot
the predictions. Let us a build a 3d plot now and add the observed
values on top of the predictions:
Fitting a model with replicates
Let us now illustrate how to simulate a data set with replicates and
then fit a model to such data. To simulate a latent model with
replicates, all we do is set the nsim
argument to the
number of replicates.
n.rep <- 30
u.rep <- simulate(op, nsim = n.rep)
Now, let us generate the observed values :
Note that
is a matrix with 20 columns, each column containing one replicate. We
need to turn y
into a vector and create an auxiliary vector
repl
indexing the replicates of y
:
y_vec <- as.vector(Y.rep)
repl <- rep(1:n.rep, each = n.obs)
df_data_repl <- data.frame(y = y_vec,
edge_number = rep(obs.loc[,1], n.rep),
distance_on_edge = rep(obs.loc[,2], n.rep),
repl = repl)
Let us clear the previous observations and add the new data to the graph:
graph$add_observations(data = df_data_repl, normalized = TRUE,
group = "repl", clear_obs = TRUE)
## Adding observations...
## list()
We can now fit the model in the same way as before by using the
rspde_lme()
function. Note that we can optimize in parallel
by setting parallel
to TRUE
. If we do not
specify which replicate to consider, in the which_repl
argument, all replicates will be considered.
fit_repl <- graph_lme(y ~ -1, graph = graph, model = "WM", parallel = TRUE)
Observe that we have received a warning saying that the Hessian was
not positive-definite, which ended up creating NaN
s for the
standard errors. Indeed, let us see a summary of the fit:
summary(fit_repl)
##
## Latent model - Whittle-Matern
##
## Call:
## graph_lme(formula = y ~ -1, graph = graph, model = "WM", parallel = TRUE)
##
## No fixed effects.
##
## Random effects:
## Estimate Std.error z-value
## alpha 1.28076 NaN NaN
## tau 0.05429 NaN NaN
## kappa 15.47022 0.43245 35.77
##
## Random effects (Matern parameterization):
## Estimate Std.error z-value
## nu 0.780759 NaN NaN
## sigma 1.323295 0.025305 52.29
## range 0.161550 0.004889 33.05
##
## Measurement error:
## Estimate Std.error z-value
## std. dev 0.301126 0.002945 102.2
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log-Likelihood: -9741.752
## Number of function calls by 'optim' = 86
## Optimization method used in 'optim' = L-BFGS-B
##
## Time used to: fit the model = 47.20366 secs
## set up the parallelization = 2.63058 secs
Let us, then, follow the suggestion from the warning and refit the
model setting improve_hessian
to TRUE
. This
will obtain a more precise estimate of the Hessian, which can possibly
fix this issue:
fit_repl <- graph_lme(y ~ -1, graph = graph, model = "WM",
parallel = TRUE, improve_hessian = TRUE)
We see that we did not receive any warning now, and the Std. errors were computed accordingly:
summary(fit_repl)
##
## Latent model - Whittle-Matern
##
## Call:
## graph_lme(formula = y ~ -1, graph = graph, model = "WM", parallel = TRUE,
## improve_hessian = TRUE)
##
## No fixed effects.
##
## Random effects:
## Estimate Std.error z-value
## alpha 1.280759 0.030016 42.669
## tau 0.054289 0.007189 7.552
## kappa 15.470222 0.901108 17.168
##
## Random effects (Matern parameterization):
## Estimate Std.error z-value
## nu 0.780759 0.030016 26.01
## sigma 1.323295 0.025305 52.29
## range 0.161550 0.004889 33.05
##
## Measurement error:
## Estimate Std.error z-value
## std. dev 0.301126 0.003117 96.62
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Log-Likelihood: -9741.752
## Number of function calls by 'optim' = 86
## Optimization method used in 'optim' = L-BFGS-B
##
## Time used to: fit the model = 43.04298 secs
## compute the Hessian = 10.1717 secs
## set up the parallelization = 2.63041 secs
Let us also take a glance of the fit:
glance(fit_repl)
## # A tibble: 1 × 9
## nobs sigma logLik AIC BIC deviance df.residual model alpha
## <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl>
## 1 12000 0.301 -9742. 19492. 19521. 19484. 11996 Covariance-Based … 1.28
Let us compare the values of the parameters of the latent model with the true ones:
print(data.frame(sigma = c(sigma, fit_repl$matern_coeff$random_effects[2]),
range = c(range, fit_repl$matern_coeff$random_effects[3]),
nu = c(nu, fit_repl$matern_coeff$random_effects[1]),
row.names = c("Truth", "Estimates")))
## sigma range nu
## Truth 1.300000 0.15000 0.800000
## Estimates 1.323295 0.16155 0.780759
Let us do kriging. We will use the same prediction locations as in the previous example. Let us get prediction for replicate 10, then add the original observations on top of them: