Skip to contents

Introduction

In this vignette we will present the R-INLA implementation of the rational SPDE approach. For theoretical details we refer the reader to the Rational approximation with the rSPDE package vignette and to Bolin, Simas, and Xiong (2023).

We begin by providing a step-by-step illustration on how to use our implementation. To this end we will consider a real world data set that consists of precipitation measurements from the Paraná region in Brazil.

After the initial model fitting, we will show how to change some parameters of the model. In the end, we will also provide an example in which we have replicates.

It is important to mention that one can improve the performance by using the PARDISO solver. Please, go to https://www.pardiso-project.org/r-inla/#license to apply for a license. Also, use inla.pardiso() for instructions on how to enable the PARDISO sparse library.

Example with real data

To illustrate our implementation of rSPDE in R-INLA we will consider a dataset available in R-INLA. This data has also been used to illustrate the SPDE approach, see for instance the book Advanced Spatial Modeling with Stochastic Partial Differential Equations Using R and INLA and also the vignette Spatial Statistics using R-INLA and Gaussian Markov random fields. See also Lindgren, Rue, and Lindström (2011) for theoretical details on the standard SPDE approach.

The data consist of precipitation measurements from the Paraná region in Brazil and were provided by the Brazilian National Water Agency. The data were collected at 616 gauge stations in Paraná state, south of Brazil, for each day in 2011.

An rSPDE model for precipitation

We will follow the vignette Spatial Statistics using R-INLA and Gaussian Markov random fields. As precipitation data are always positive, we will assume it is Gamma distributed. R-INLA uses the following parameterization of the Gamma distribution, Γ(μ,ϕ):π(y)=1Γ(ϕ)(ϕμ)ϕyϕ1exp(ϕyμ).\Gamma(\mu, \phi): \pi (y) = \frac{1}{\Gamma(\phi)} \left(\frac{\phi}{\mu}\right)^{\phi} y^{\phi - 1} \exp\left(-\frac{\phi y}{\mu}\right) . In this parameterization, the distribution has expected value E(x)=μE(x) = \mu and variance V(x)=μ2/(ϕ)V(x) = \mu^2/(\phi), where1/ϕ1/\phi is a dispersion parameter.

In this example μ\mu will be modeled using a stochastic model that includes both covariates and spatial structure, resulting in the latent Gaussian model for the precipitation measurements yiμ(si),θΓ(μ(si),cϕ)log(μ(s))=η(s)=kfk(ck(s))+u(s)θπ(θ),\begin{align} y_i\mid \mu(s_i), \theta &\sim \Gamma(\mu(s_i),c\phi)\\ \log (\mu(s)) &= \eta(s) = \sum_k f_k(c_k(s))+u(s)\\ \theta &\sim \pi(\theta) \end{align},

where yiy_i denotes the measurement taken at location sis_i, ck(s)c_k(s) are covariates, u(s)u(s) is a mean-zero Gaussian Matérn field, and θ\theta is a vector containing all parameters of the model, including smoothness of the field. That is, by using the rSPDE model we will also be able to estimate the smoothness of the latent field.

Examining the data

We will be using R-INLA. To install R-INLA go to R-INLA Project.

We begin by loading some libraries we need to get the data and build the plots.

Let us load the data and the border of the region

data(PRprec)
data(PRborder)

The data frame contains daily measurements at 616 stations for the year 2011, as well as coordinates and altitude information for the measurement stations. We will not analyze the full spatio-temporal data set, but instead look at the total precipitation in January, which we calculate as

Y <- rowMeans(PRprec[, 3 + 1:31])

In the next snippet of code, we extract the coordinates and altitudes and remove the locations with missing values.

ind <- !is.na(Y)
Y <- Y[ind]
coords <- as.matrix(PRprec[ind, 1:2])
alt <- PRprec$Altitude[ind]

Let us build plot the precipitation observations using ggplot:

ggplot() +
  geom_point(aes(
    x = coords[, 1], y = coords[, 2],
    colour = Y
  ), size = 2, alpha = 1) +
  geom_path(aes(x = PRborder[, 1], y = PRborder[, 2])) +
  geom_path(aes(x = PRborder[1034:1078, 1], y = PRborder[
    1034:1078,
    2
  ]), colour = "red") + 
  scale_color_viridis()

The red line in the figure shows the coast line, and we expect the distance to the coast to be a good covariate for precipitation. This covariate is not available, so let us calculate it for each observation location:

seaDist <- apply(spDists(coords, PRborder[1034:1078, ],
  longlat = TRUE
), 1, min)

Now, let us plot the precipitation as a function of the possible covariates:

par(mfrow = c(2, 2))
plot(coords[, 1], Y, cex = 0.5, xlab = "Longitude")
plot(coords[, 2], Y, cex = 0.5, xlab = "Latitude")
plot(seaDist, Y, cex = 0.5, xlab = "Distance to sea")
plot(alt, Y, cex = 0.5, xlab = "Altitude")

par(mfrow = c(1, 1))

Creating the rSPDE model

To use the R-INLA implementation of the rSPDE model we need to load the functions:

The rSPDE-INLA implementation is very reminiscent of R-INLA, so its usage should be straightforward for R-INLA users. For instance, to create a rSPDE model, one would use rspde.matern() in place of inla.spde2.matern(). To create an index, one should use rspde.make.index() in place of inla.spde.make.index(). To create the A matrix, one should use rspde.make.A() in place of inla.spde.make.A(), and so on.

The main differences when comparing the arguments between the rSPDE-INLA implementation and the standard SPDE implementation in R-INLA, are the nu and rspde.order arguments, which are present in rSPDE-INLA implementation. We will see below how use these arguments.

Mesh

We can use fmesher for creating the mesh. We begin by loading the fmesher package:

Let us create a mesh which is based on a non-convex hull to avoid adding many small triangles outside the domain of interest:

prdomain <- fm_nonconvex_hull(coords, -0.03, -0.05, resolution = c(100, 100))
prmesh <- fm_mesh_2d(boundary = prdomain, max.edge = c(0.45, 1), cutoff = 0.2)

plot(prmesh, asp = 1, main = "")
lines(PRborder, col = 3)
points(coords[, 1], coords[, 2], pch = 19, cex = 0.5, col = "red")

The observation matrix

We now create the AA matrix, that connects the mesh to the observation locations and then create the rSPDE model.

For this task, as we mentioned earlier, we need to use an rSPDEspecific function, whose name is very reminiscent to R-INLA’s standard SPDE approach, namely rspde.make.A() (in place of R-INLA’s inla.spde.make.A()). The reason for the need of this specific function is that the size of the AA matrix depends on the order of the rational approximation. The details can be found in the introduction of the Rational approximation with the rSPDE package vignette.

The default order is 2 for our covariance-based rational approximation. As mentioned in the introduction of the Rational approximation with the rSPDE package vignette, an approximation of order 2 in the covariance-based rational approximation has approximately the same computational cost as the operator-based rational approximation of order 1.

Recall that the latent process uu is a solution of (κ2IΔ)α/2(τu)=𝒲,(\kappa^2 I-\Delta)^{\alpha/2}(\tau u) = \mathcal{W}, where α=ν+d/2\alpha = \nu + d/2. We want to estimate all three parameters τ,κ\tau,\kappa and ν\nu, which is the default option of
the rSPDE-INLA implementation. However, there is also an option to fix the smoothness parameter ν\nu to some predefined value and only estimate τ\tau and κ\kappa. This will be discussed later.

In this first example we will assume we want a rational approximation of order 1. To this end we can use the rspde.make.A() function. Since we will assume order 1 and that we want to estimate smoothness, which are the default options in this function, the required parameters are simply the mesh and the locations:

Abar <- rspde.make.A(mesh = prmesh, loc = coords)

Setting up the rSPDE model

To set up an rSPDEmodel, all we need is the mesh. By default it will assume that we want to estimate the smoothness parameter ν\nu and to do a covariance-based rational approximation of order 2.

Later in this vignette we will also see other options for setting up rSPDE models such as keeping the smoothness parameter fixed and/or increasing the order of the covariance-based rational approximation.

Therefore, to set up a model all we have to do is use the rspde.matern() function:

rspde_model <- rspde.matern(mesh = prmesh)

Note that this function is very reminiscent of R-INLA’s inla.spde2.matern() function. This is a pattern we have tried to keep consistent in the package: All the rSPDE versions of some R-INLA function will either replace inla or inla.spde or inla.spde2 by rspde.

The inla.stack

Since the covariates are already evaluated at the observation locations, we only want to apply the AA matrix to the spatial effect and not the fixed effects. We can use the inla.stack() function.

The difference, however, is that we need to use the function rspde.make.index() (in place of the standard inla.spde.make.index()) to create the index.

If one is using the default options, that is, to estimate the smoothness parameter ν\nu and to do a rational approximation of order 2, the usage of rspde.make.index() is identical to the usage of inla.spde.make.index():

mesh.index <- rspde.make.index(name = "field", mesh = prmesh)

We can then create the stack in a standard manner:

stk.dat <- inla.stack(
  data = list(y = Y), A = list(Abar, 1), tag = "est",
  effects = list(
    c(
      mesh.index
    ),
    list(
      seaDist = inla.group(seaDist),
      Intercept = 1
    )
  )
)

Here the observation matrix AA is applied to the spatial effect and the intercept while an identity observation matrix, denoted by 11, is applied to the covariates. This means the covariates are unaffected by the observation matrix.

The observation matrices in A=list(Abar,1)A=list(Abar,1) are used to link the corresponding elements in the effects-list to the observations. Thus in our model the latent spatial field mesh.index and the intercept are linked to the log-expectation of the observations, i.e. η(s)\eta(s), through the AA-matrix. The covariates, on the other hand, are linked directly to η(s)\eta(s). The stk.dat object defined above implies the following principal linkage between model components and observations η(s)Ax(s)+A Intercept+seaDist.\eta(s) \sim A x(s) + A \text{ Intercept} + \text{seaDist}.η(s)\eta(s) will then be used in the observation-likelihood, yiη(si),θΓ(exp(η(si)),cϕ).y_i\mid \eta(s_i),\theta \sim \Gamma(\exp(\eta (s_i)), c\phi).

Model fitting

We will build a model using the distance to the sea xix_i as a covariate through an improper CAR(1) model with βij=1(ij)\beta_{ij}=1(i\sim j), which R-INLA calls a random walk of order 1.

f.s <- y ~ -1 + Intercept + f(seaDist, model = "rw1") +
  f(field, model = rspde_model)

Here -1 is added to remove R’s implicit intercept, which is replaced by the explicit +Intercept from when we created the stack.

To fit the model we proceed as in the standard SPDE approach and we simply call inla().

rspde_fit <- inla(f.s,
  family = "Gamma", data = inla.stack.data(stk.dat),
  verbose = FALSE,
  control.inla = list(int.strategy = "eb"),
  control.predictor = list(A = inla.stack.A(stk.dat), compute = TRUE)
)

INLA results

We can look at some summaries of the posterior distributions for the parameters, for example the fixed effects (i.e. the intercept) and the hyper-parameters (i.e. dispersion in the gamma likelihood, the precision of the RW1, and the parameters of the spatial field):

summary(rspde_fit)
## Time used:
##     Pre = 0.398, Running = 3.12, Post = 0.0632, Total = 3.58 
## Fixed effects:
##            mean   sd 0.025quant 0.5quant 0.975quant  mode kld
## Intercept 1.941 0.04      1.862    1.941      2.021 1.941   0
## 
## Random effects:
##   Name     Model
##     seaDist RW1 model
##    field CGeneric
## 
## Model hyperparameters:
##                                                    mean       sd 0.025quant
## Precision-parameter for the Gamma observations   14.664    1.051     12.713
## Precision for seaDist                          8359.916 5482.964   2673.332
## Theta1 for field                                 -1.130    0.392     -1.928
## Theta2 for field                                  1.514    0.265      0.997
## Theta3 for field                                 -0.299    0.370     -0.998
##                                                0.5quant 0.975quant     mode
## Precision-parameter for the Gamma observations   14.623     16.849   14.536
## Precision for seaDist                          6908.132  22970.387 4902.633
## Theta1 for field                                 -1.121     -0.384   -1.082
## Theta2 for field                                  1.512      2.041    1.504
## Theta3 for field                                 -0.309      0.460   -0.355
## 
## Marginal log-Likelihood:  -1255.37 
##  is computed 
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')

Let θ1=Theta1\theta_1 = \textrm{Theta1}, θ2=Theta2\theta_2=\textrm{Theta2} and θ3=Theta3\theta_3=\textrm{Theta3}. In terms of the SPDE (κ2IΔ)α/2(τu)=𝒲,(\kappa^2 I - \Delta)^{\alpha/2}(\tau u) = \mathcal{W}, where α=ν+d/2\alpha = \nu + d/2, we have that τ=exp(θ1),κ=exp(θ2),\tau = \exp(\theta_1),\quad \kappa = \exp(\theta_2), and by default ν=2(exp(θ3)1+exp(θ3)).\nu = 2\Big(\frac{\exp(\theta_3)}{1+\exp(\theta_3)}\Big). The number 2 comes from the upper bound for ν\nu, which will be discussed later in this vignette.

In general, we have ν=νUB(exp(θ3)1+exp(θ3)),\nu = \nu_{UB}\Big(\frac{\exp(\theta_3)}{1+\exp(\theta_3)}\Big), where νUB\nu_{UB} is the value of the upper bound for the smoothness parameter ν\nu.

Another choice for prior for ν\nu is a truncated lognormal distribution and will also be discussed later in this vignette.

rSPDE-INLA results

We can obtain outputs with respect to parameters in the original scale by using the function rspde.result():

result_fit <- rspde.result(rspde_fit, "field", rspde_model)
summary(result_fit)
##           mean       sd 0.025quant 0.5quant 0.975quant     mode
## tau   0.348142 0.136787   0.146647 0.326680    0.67654 0.283755
## kappa 4.705400 1.266620   2.721250 4.534210    7.66589 4.206120
## nu    0.855704 0.175501   0.541177 0.845118    1.22224 0.811913

As mentioned above, when we create the model object with rspde.matern(), we must choose an upper bound for nu by using the argument nu.upper.bound. If such an argument is not passed, the default value of 2 will be used. However, if the mean or mode of nu is too close to nu.upper.bound, a warning will be given suggesting the user to increase nu.upper.bound and refit the data.

To create plots of the posterior marginal densities, we can use the gg_df() function, which creates ggplot2-friendly data frames. The following figure shows the posterior marginal densities of the three parameters using the gg_df() function.

posterior_df_fit <- gg_df(result_fit)

ggplot(posterior_df_fit) + geom_line(aes(x = x, y = y)) + 
facet_wrap(~parameter, scales = "free") + labs(y = "Density")

This function is reminiscent to the inla.spde.result() function with the main difference that it has the summary() and plot() methods implemented.

We can also obtain the results for the matern parameterization by setting the parameterization argument to matern:

result_fit_matern <- rspde.result(rspde_fit, "field", 
                      rspde_model, parameterization = "matern")
summary(result_fit_matern)
##             mean       sd 0.025quant 0.5quant 0.975quant     mode
## std.dev 0.332562 0.039529   0.259477 0.331073   0.416722 0.332710
## range   0.598183 0.181402   0.307701 0.575824   1.025330 0.554784
## nu      0.855704 0.175501   0.541177 0.845118   1.222240 0.811913

In a similar manner, we can obtain posterior plots on the matern parameterization:

posterior_df_fit_matern <- gg_df(result_fit_matern)

ggplot(posterior_df_fit_matern) + geom_line(aes(x = x, y = y)) + 
facet_wrap(~parameter, scales = "free") + labs(y = "Density")

Predictions

Let us now obtain predictions (i.e. do kriging) of the expected precipitation on a dense grid in the region.

We begin by creating the grid in which we want to do the predictions. To this end, we can use the rspde.mesh.projector() function. This function has the same arguments as the function inla.mesh.projector(), with the only difference being that the rSPDE version also has an argument nu and an argument rspde.order. Thus, we proceed in the same fashion as we would in R-INLA’s standard SPDE implementation:

nxy <- c(150, 100)
projgrid <- rspde.mesh.projector(prmesh,
  xlim = range(PRborder[, 1]),
  ylim = range(PRborder[, 2]), dims = nxy
)

This lattice contains 150 × 100 locations. One can easily change the resolution of the kriging prediction by changing nxy. Let us find the cells that are outside the region of interest so that we do not plot the estimates there.

xy.in <- inout(projgrid$lattice$loc, cbind(PRborder[, 1], PRborder[, 2]))

Let us plot the locations that we will do prediction:

coord.prd <- projgrid$lattice$loc[xy.in, ]
plot(coord.prd, type = "p", cex = 0.1)
lines(PRborder)
points(coords[, 1], coords[, 2], pch = 19, cex = 0.5, col = "red")

Now, there are a few ways we could calculate the kriging prediction. The simplest way is to evaluate the mean of all individual random effects in the linear predictor and then to calculate the exponential of their sum (since μ(s)=exp(η(s))\mu(s)=\exp(\eta(s)) ). A more accurate way is to calculate the prediction jointly with the estimation, which unfortunately is quite computationally expensive if we do prediction on a fine grid. However, in this illustration, we proceed with this option to show how one can do it.

To this end, first, link the prediction coordinates to the mesh nodes through an AA matrix

A.prd <- projgrid$proj$A[xy.in, ]

Since we are using distance to the sea as a covariate, we also have to calculate this covariate for the prediction locations.

seaDist.prd <- apply(spDists(coord.prd,
  PRborder[1034:1078, ],
  longlat = TRUE
), 1, min)

We now make a stack for the prediction locations. We have no data at the prediction locations, so we set y= NA. We then join this stack with the estimation stack.

ef.prd <- list(
  c(mesh.index),
  list(
    long = inla.group(coord.prd[
      ,
      1
    ]), lat = inla.group(coord.prd[, 2]),
    seaDist = inla.group(seaDist.prd),
    Intercept = 1
  )
)
stk.prd <- inla.stack(
  data = list(y = NA),
  A = list(A.prd, 1), tag = "prd",
  effects = ef.prd
)
stk.all <- inla.stack(stk.dat, stk.prd)

Doing the joint estimation takes a while, and we therefore turn off the computation of certain things that we are not interested in, such as the marginals for the random effect. We will also use a simplified integration strategy (actually only using the posterior mode of the hyper-parameters) through the command control.inla = list(int.strategy = "eb"), i.e. empirical Bayes.

rspde_fitprd <- inla(f.s,
  family = "Gamma",
  data = inla.stack.data(stk.all),
  control.predictor = list(
    A = inla.stack.A(stk.all),
    compute = TRUE, link = 1
  ),
  control.compute = list(
    return.marginals = FALSE,
    return.marginals.predictor = FALSE
  ),
  control.inla = list(int.strategy = "eb")
)

We then extract the indices to the prediction nodes and then extract the mean and the standard deviation of the response:

id.prd <- inla.stack.index(stk.all, "prd")$data
m.prd <- rspde_fitprd$summary.fitted.values$mean[id.prd]
sd.prd <- rspde_fitprd$summary.fitted.values$sd[id.prd]

Finally, we plot the results:

# Plot the predictions
pred_df <- data.frame(x1 = coord.prd[,1],
                      x2 = coord.prd[,2],
                      mean = m.prd,
                      sd = sd.prd)

ggplot(pred_df, aes(x = x1, y = x2, fill = mean)) +
  geom_raster() +
  scale_fill_viridis()

Then, the std. deviations:

ggplot(pred_df, aes(x = x1, y = x2, fill = sd)) +
  geom_raster() + scale_fill_viridis()

An example with replicates

For this example we will simulate a data with replicates. We will use the same example considered in the Rational approximation with the rSPDE package vignette (the only difference is the way the data is organized). We also refer the reader to this vignette for a description of the function matern.operators(), along with its methods (for instance, the simulate() method).

Simulating the data

Let us consider a simple Gaussian linear model with 30 independent replicates of a latent spatial field x(𝐬)x(\mathbf{s}), observed at the same mm locations, {𝐬1,,𝐬m}\{\mathbf{s}_1 , \ldots , \mathbf{s}_m \}, for each replicate. For each i=1,,m,i = 1,\ldots,m, we have

yi=x1(𝐬i)+εi,=yi+29m=x30(𝐬i)+εi+29m,\begin{align} y_i &= x_1(\mathbf{s}_i)+\varepsilon_i,\\ \vdots &= \vdots\\ y_{i+29m} &= x_{30}(\mathbf{s}_i) + \varepsilon_{i+29m}, \end{align}

where ε1,,ε30m\varepsilon_1,\ldots,\varepsilon_{30m} are iid normally distributed with mean 0 and standard deviation 0.1.

We use the basis function representation of x()x(\cdot) to define the AA matrix linking the point locations to the mesh. We also need to account for the fact that we have 30 replicates at the same locations. To this end, the AA matrix we need can be generated by spde.make.A() function. The reason being that we are sampling x()x(\cdot) directly and not the latent vector described in the introduction of the Rational approximation with the rSPDE package vignette.

We begin by creating the mesh:

m <- 200
loc_2d_mesh <- matrix(runif(m * 2), m, 2)
mesh_2d <- fm_mesh_2d(
  loc = loc_2d_mesh,
  cutoff = 0.05,
  offset = c(0.1, 0.4),
  max.edge = c(0.05, 0.5)
)
plot(mesh_2d, main = "")
points(loc_2d_mesh[, 1], loc_2d_mesh[, 2])

We then compute the AA matrix, which is needed for simulation, and connects the observation locations to the mesh. To this end we will use the spde.make.A() helper function, which is a wrapper that uses the functions fm_basis(), fm_block() and fm_row_kron() from the fmesher package.

n.rep <- 30
A <- spde.make.A(
  mesh = mesh_2d,
  loc = loc_2d_mesh,
  index = rep(1:m, times = n.rep),
  repl = rep(1:n.rep, each = m)
)

Notice that for the simulated data, we should use the AA matrix from spde.make.A() function instead of the rspde.make.A() function.

We will now simulate a latent process with standard deviation σ=1\sigma=1 and range 0.10.1. We will use ν=0.5\nu=0.5 so that the model has an exponential covariance function. To this end we create a model object with the matern.operators() function:

nu <- 0.5
sigma <- 1
range <- 0.1
kappa <- sqrt(8 * nu) / range
tau <- sqrt(gamma(nu) / (sigma^2 * kappa^(2 * nu) * (4 * pi) * gamma(nu + 1)))
d <- 2
operator_information <- matern.operators(
  mesh = mesh_2d,
  nu = nu,
  range = range,
  sigma = sigma,
  m = 2,
  parameterization = "matern"
)

More details on this function can be found at the Rational approximation with the rSPDE package vignette.

To simulate the latent process all we need to do is to use the simulate() method on the operator_information object. We then obtain the simulated data yy by connecting with the AA matrix and adding the gaussian noise.

set.seed(1)
u <- simulate(operator_information, nsim = n.rep)
y <- as.vector(A %*% as.vector(u)) +
  rnorm(m * n.rep) * 0.1

The first replicate of the simulated random field as well as the observation locations are shown in the following figure.

proj <- fm_evaluator(mesh_2d, dims = c(100, 100))

df_field <- data.frame(x = proj$lattice$loc[,1],
                        y = proj$lattice$loc[,2],
                        field = as.vector(fm_evaluate(proj, 
                        field = as.vector(u[, 1]))),
                        type = "field")

df_loc <- data.frame(x = loc_2d_mesh[, 1],
                      y = loc_2d_mesh[, 2],
                      field = y[1:m],
                      type = "locations")
df_plot <- rbind(df_field, df_loc)

ggplot(df_plot) + aes(x = x, y = y, fill = field) +
        facet_wrap(~type) + xlim(0,1) + ylim(0,1) + 
        geom_raster(data = df_field) +
        geom_point(data = df_loc, aes(colour = field),
        show.legend = FALSE) + 
        scale_fill_viridis() + scale_colour_viridis()
## Warning: Removed 7648 rows containing missing values or values outside the scale range
## (`geom_raster()`).

Fitting the R-INLA rSPDE model

Let us then use the rational SPDE approach to fit the data.

We begin by creating the AA matrix and index with replicates, and the inla.stack object. It is important to notice that since we have replicates we should provide the index and repl arguments for rspde.make.A() function, and also the argument n.repl in rspde.make.index() function. They behave identically as in their R-INLA’s counterparts, namely, inla.spde.make.A() and inla.make.index().

Abar.rep <- rspde.make.A(
  mesh = mesh_2d, loc = loc_2d_mesh, index = rep(1:m, times = n.rep),
  repl = rep(1:n.rep, each = m)
)
mesh.index.rep <- rspde.make.index(
  name = "field", mesh = mesh_2d,
  n.repl = n.rep
)

st.dat.rep <- inla.stack(
  data = list(y = y),
  A = Abar.rep,
  effects = mesh.index.rep
)

We now create the model object.

rspde_model.rep <- rspde.matern(mesh = mesh_2d, parameterization = "spde") 

Finally, we create the formula and fit. It is extremely important not to forget the replicate argument when building the formula as inla() function will not produce warning and might fit some meaningless model.

f.rep <-
  y ~ -1 + f(field,
    model = rspde_model.rep,
    replicate = field.repl
  )
rspde_fit.rep <-
  inla(f.rep,
    data = inla.stack.data(st.dat.rep),
    family = "gaussian",
    control.predictor =
      list(A = inla.stack.A(st.dat.rep))
  )

We can get the summary:

summary(rspde_fit.rep)
## Time used:
##     Pre = 0.232, Running = 26.2, Post = 2, Total = 28.5 
## Random effects:
##   Name     Model
##     field CGeneric
## 
## Model hyperparameters:
##                                           mean    sd 0.025quant 0.5quant
## Precision for the Gaussian observations 89.184 4.220      81.13   89.095
## Theta1 for field                        -3.087 0.442      -4.00   -3.071
## Theta2 for field                         3.066 0.071       2.93    3.064
## Theta3 for field                        -0.658 0.256      -1.13   -0.667
##                                         0.975quant   mode
## Precision for the Gaussian observations     97.743 88.944
## Theta1 for field                            -2.266 -3.000
## Theta2 for field                             3.212  3.054
## Theta3 for field                            -0.128 -0.708
## 
## Marginal log-Likelihood:  -4477.60 
##  is computed 
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')

and the summary in the user’s scale:

result_fit_rep <- rspde.result(rspde_fit.rep, "field", rspde_model.rep)
summary(result_fit_rep)
##             mean        sd 0.025quant   0.5quant 0.975quant       mode
## tau    0.0501379 0.0219397  0.0184374  0.0465781   0.103048  0.0392182
## kappa 21.5086000 1.5257700 18.8061000 21.3975000  24.786700 21.1028000
## nu     0.6865570 0.1141600  0.4883110  0.6769930   0.933506  0.6511730
result_df <- data.frame(
  parameter = c("tau", "kappa", "nu"),
  true = c(tau, kappa, nu),
  mean = c(
    result_fit_rep$summary.tau$mean,
    result_fit_rep$summary.kappa$mean,
    result_fit_rep$summary.nu$mean
  ),
  mode = c(
    result_fit_rep$summary.tau$mode,
    result_fit_rep$summary.kappa$mode,
    result_fit_rep$summary.nu$mode
  )
)
print(result_df)
##   parameter        true        mean        mode
## 1       tau  0.08920621  0.05013785  0.03921817
## 2     kappa 20.00000000 21.50856687 21.10278471
## 3        nu  0.50000000  0.68655700  0.65117277

An example with a non-stationary model

It is also possible to consider models in which σ\sigma (std. deviation) and ρ\rho (range parameter) are non-stationary. One can also use the parameterization in terms of the SPDE parameters κ\kappa and τ\tau.

An example of such a model is given in the vignette inlabru implementation of the rational SPDE approach.

Further options of the rSPDE-INLA implementation

We will now discuss some of the arguments that were introduced in our R-INLA implementation of the rational approximation that are not present in R-INLA’s standard SPDE implementation.

In each case we will provide an illustrative example.

Changing the upper bound for the smoothness parameter

When we fit a rspde.matern() model we need to provide an upper bound for the smoothness parameter ν\nu. The reason for that is that the sparsity of the precision matrix should be kept fixed during R-INLA’s estimation and the higher the value of ν\nu the denser the precision matrix gets.

This means that the higher the value of ν\nu, the higher the computational cost to fit the model. Therefore, ideally, want to choose an upper bound for ν\nu as small as possible.

To change the value of the upper bound for the smoothness parameter, we must change the argument nu.upper.bound. The default value for nu.upper.bound is 2. Other common choices for nu.upper.bound are 1, 3 and 4.

It is clear from the discussion above that the smaller the value of nu.upper.bound the faster the estimation procedure will be.

However, if we choose a value of nu.upper.bound which is too low, the “correct” value of ν\nu might not belong to the interval (0,νUB)(0,\nu_{UB}), where νUB\nu_{UB} is the value of nu.upper.bound. Hence, one might be forced to increase nu.upper.bound and estimate again, which, obviously will increase the computational cost as we will need to do more than one estimation.

Let us illustrate by considering the same model we considered above for the precipitation in Paraná region in Brazil and consider nu.upper.bound equal to 4, which is generally a good choice for nu.upper.bound.

We simply use the function rspde.matern() with the argument nu.upper.bound set to 4:

rspde_model_2 <- rspde.matern(mesh = prmesh, nu.upper.bound = 4)

Since we are considering the default rspde.order, the AA matrix and the mesh index objects are the same as the previous ones. Let us then update the formula and fit the model:

f.s.2 <- y ~ -1 + Intercept + f(seaDist, model = "rw1") +
  f(field, model = rspde_model_2)

rspde_fit_2 <- inla(f.s.2,
  family = "Gamma", data = inla.stack.data(stk.dat),
  verbose = FALSE,
  control.inla = list(int.strategy = "eb"),
  control.predictor = list(A = inla.stack.A(stk.dat), compute = TRUE)
)

Let us see the summary of the fit:

summary(rspde_fit_2)
## Time used:
##     Pre = 0.228, Running = 8.73, Post = 0.042, Total = 9 
## Fixed effects:
##            mean   sd 0.025quant 0.5quant 0.975quant  mode kld
## Intercept 1.941 0.04      1.862    1.941      2.019 1.941   0
## 
## Random effects:
##   Name     Model
##     seaDist RW1 model
##    field CGeneric
## 
## Model hyperparameters:
##                                                    mean       sd 0.025quant
## Precision-parameter for the Gamma observations   14.707    1.051     12.744
## Precision for seaDist                          7842.588 4653.002   2326.617
## Theta1 for field                                 -0.617    1.545     -3.472
## Theta2 for field                                  1.426    0.454      0.498
## Theta3 for field                                 -1.673    1.045     -3.844
##                                                0.5quant 0.975quant     mode
## Precision-parameter for the Gamma observations   14.670   1.69e+01   14.599
## Precision for seaDist                          6735.756   1.99e+04 4992.381
## Theta1 for field                                 -0.675   2.60e+00   -0.945
## Theta2 for field                                  1.437   2.28e+00    1.490
## Theta3 for field                                 -1.634   2.61e-01   -1.455
## 
## Marginal log-Likelihood:  -1254.40 
##  is computed 
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')

Let us compare with the cost from the previous fit, with the default value of nu.upper.bound of 2:

# nu.upper.bound = 2
rspde_fit$cpu.used
##        Pre    Running       Post      Total 
## 0.39814997 3.12240434 0.06321526 3.58376956
# nu.upper.bound = 4
rspde_fit_2$cpu.used
##        Pre    Running       Post      Total 
## 0.22770286 8.73004937 0.04201555 8.99976778

We can see that the fit for nu.upper.bound equal to 2 was considerably faster.

Finally, let us get the result results for the field and see the estimate of ν\nu:

result_fit_2 <- rspde.result(rspde_fit_2, "field", rspde_model_2)
summary(result_fit_2)
##           mean       sd 0.025quant 0.5quant 0.975quant      mode
## tau   1.919180 5.588360  0.0317543 0.499897   13.01900 0.0680335
## kappa 4.596140 2.094490  1.6605900 4.223400    9.73747 3.5035400
## nu    0.799092 0.579464  0.0856979 0.659977    2.24465 0.2614170

Changing the order of the rational approximation

To change the order of the rational approximation all we have to do is set the argument rspde.order to the desired value. The current available possibilities are 1,2,3,…, up to 8.

The higher the order of the rational approximation, the more accurate the results will be, however, the higher the computational cost will be.

The default rspde.order of 1 is generally a good choice, fast, and reasonably accurate. See the vignette Rational approximation with the rSPDE package for further details on the order of the rational approximation and some comparison with the Matérn covariance.

Let us fit the above model with the covariance-based rational approximation of order 3. Since we are changing the order of the rational approximation, that is, we are changing the rspde.order argument, we need to recompute the AA matrix and the mesh index. Therefore, we proceed as follows:

  • We build a new model:
rspde_model_order_3 <- rspde.matern(mesh = prmesh, 
  rspde.order = 3
)
  • We create a new AA matrix:
Abar_3 <- rspde.make.A(mesh = prmesh, loc = coords, rspde.order = 3)
  • We create a new index:
mesh.index.3 <- rspde.make.index(
  name = "field", mesh = prmesh,
  rspde.order = 3
)

Now the remaining steps are the same as before:

stk.dat.3 <- inla.stack(
  data = list(y = Y), A = list(Abar_3, 1), tag = "est",
  effects = list(
    c(
      mesh.index.3
    ),
    list(
      long = inla.group(coords[, 1]),
      lat = inla.group(coords[, 2]),
      seaDist = inla.group(seaDist),
      Intercept = 1
    )
  )
)

f.s.3 <- y ~ -1 + Intercept + f(seaDist, model = "rw1") +
  f(field, model = rspde_model_order_3)

rspde_fit_order_3 <- inla(f.s.3,
  family = "Gamma", data = inla.stack.data(stk.dat.3),
  verbose = FALSE,
  control.inla = list(int.strategy = "eb"),
  control.predictor = list(A = inla.stack.A(stk.dat.3), compute = TRUE)
)

Let us see the summary:

summary(rspde_fit_order_3)
## Time used:
##     Pre = 0.23, Running = 13.7, Post = 0.0804, Total = 14 
## Fixed effects:
##            mean   sd 0.025quant 0.5quant 0.975quant  mode kld
## Intercept 1.941 0.04      1.863    1.941      2.019 1.941   0
## 
## Random effects:
##   Name     Model
##     seaDist RW1 model
##    field CGeneric
## 
## Model hyperparameters:
##                                                   mean      sd 0.025quant
## Precision-parameter for the Gamma observations   14.70    1.05      12.74
## Precision for seaDist                          7554.02 4239.38    2384.07
## Theta1 for field                                 -3.41    2.20      -8.53
## Theta2 for field                                  1.88    0.44       1.18
## Theta3 for field                                  1.47    1.75      -1.08
##                                                0.5quant 0.975quant     mode
## Precision-parameter for the Gamma observations    14.66     16.868   14.590
## Precision for seaDist                           6579.48  18484.416 5008.739
## Theta1 for field                                  -3.10     -0.211   -1.563
## Theta2 for field                                   1.84      2.877    1.606
## Theta3 for field                                   1.23      5.540    0.021
## 
## Marginal log-Likelihood:  -1254.07 
##  is computed 
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')

We can see in the above summary that the computational cost significantly increased. Let us compare the cost of having rspde.order=3 with the cost of having rspde.order=1:

# order = 1
rspde_fit$cpu.used
##        Pre    Running       Post      Total 
## 0.39814997 3.12240434 0.06321526 3.58376956
# order = 3
rspde_fit_order_3$cpu.used
##         Pre     Running        Post       Total 
##  0.23007226 13.69701767  0.08043146 14.00752139

One can check the order of the rational approximation by using the rational.order() function. It also allows another way to change the order of the rational order, by using the corresponding rational.order<-() function.

The rational.order() and rational.order<-() functions can be applied to the inla.rspde object, to the A matrix and to the index objects.

Here to check the models:

rational.order(rspde_model)
## [1] 1
rational.order(rspde_model_order_3)
## [1] 3

Here to check the A matrices:

## [1] 1
## [1] 3

Here to check the indexes:

rational.order(mesh.index)
## [1] 1
rational.order(mesh.index.3)
## [1] 3

Let us now change the order of the rspde_model object to be 2:

rational.order(rspde_model) <- 2
rational.order(Abar) <- 2
rational.order(mesh.index) <- 2

Let us fit this new model:

 f.s.2 <- y ~ -1 + Intercept + f(seaDist, model = "rw1") +
  f(field, model = rspde_model)

stk.dat.2 <- inla.stack(
  data = list(y = Y), A = list(Abar, 1), tag = "est",
  effects = list(
    c(
      mesh.index
    ),
    list(
      long = inla.group(coords[, 1]),
      lat = inla.group(coords[, 2]),
      seaDist = inla.group(seaDist),
      Intercept = 1
    )
  )
)

rspde_fit_order_2 <- inla(f.s.2,
  family = "Gamma", data = inla.stack.data(stk.dat.2),
  verbose = FALSE,
  control.inla = list(int.strategy = "eb"),
  control.predictor = list(A = inla.stack.A(stk.dat.2), compute = TRUE)
)

Here is the summary:

summary(rspde_fit_order_2)
## Time used:
##     Pre = 0.229, Running = 6.7, Post = 0.0517, Total = 6.98 
## Fixed effects:
##            mean   sd 0.025quant 0.5quant 0.975quant  mode kld
## Intercept 1.941 0.04      1.862    1.941      2.019 1.941   0
## 
## Random effects:
##   Name     Model
##     seaDist RW1 model
##    field CGeneric
## 
## Model hyperparameters:
##                                                    mean       sd 0.025quant
## Precision-parameter for the Gamma observations   14.688    1.050      12.73
## Precision for seaDist                          7642.884 4276.781    2339.28
## Theta1 for field                                 -2.518    1.416      -5.72
## Theta2 for field                                  1.753    0.354       1.14
## Theta3 for field                                  0.775    1.114      -1.00
##                                                0.5quant 0.975quant     mode
## Precision-parameter for the Gamma observations   14.652      16.86   14.581
## Precision for seaDist                          6681.218   18595.12 5079.483
## Theta1 for field                                 -2.362      -0.28   -1.612
## Theta2 for field                                  1.729       2.52    1.605
## Theta3 for field                                  0.657       3.29    0.083
## 
## Marginal log-Likelihood:  -1254.40 
##  is computed 
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')

Estimating models with fixed smoothness

We can fix the smoothness, say ν\nu, of the model by providing a non-NULL positive value for nu.

When the smoothness, ν\nu, is fixed, we can have two possibilities:

  • α=ν+d/2\alpha = \nu + d/2 is integer;

  • α=ν+d/2\alpha = \nu + d/2 is not integer.

The first case, i.e., when α\alpha is integer, has less computational cost. Furthermore, the AA matrix is different than the AA matrix for the non-integer α\alpha.

The AA matrix is the same for all values of ν\nu such that α\alpha is integer. So, the AA matrix for these cases only need to be computed once. The same holds for the index obtained from the rspde.make.index() function.

In the second case the AA matrix only depends on the order of the rational approximation and not on ν\nu. Therefore, if the matrix AA has already been computed for some rspde.order, then the AA matrix will be same for all the values of ν\nu such that α\alpha is non-integer for that rspde.order. The same holds for the index obtained from the rspde.make.index() function.

If ν\nu is fixed, we have that the parameters returned by R-INLA are $$\kappa = \exp(\theta_1)\quad\hbox{and}\quad\tau = \exp(\theta_2).$$ We will now provide illustrations for both scenarios. It is also noteworthy that just as for the case in which we estimate ν\nu, we can also change the order of the rational approximation by changing the value of rspde.order. For both illustrations with fixed ν\nu below, we will only consider the order of the rational approximation of 1, that is, the default order.

Estimating models with fixed smoothness and non-integer α\alpha

Recall that: ν=νUB(exp(θ3)1+exp(θ3)).\nu = \nu_{UB}\Big(\frac{\exp(\theta_3)}{1+\exp(\theta_3)}\Big). Thus, to illustrate, let us consider a fixed ν\nu given by the mean of ν\nu obtained from the first model we considered in this vignette, namely, the fit given by rspde_fit, which is approximately ν=1.21\nu = 1.21.

Notice that for this ν\nu, the value of α\alpha is non-integer, so we can use the AA matrix and the index of the first fitted model, which is also of order 2.

Therefore, all we have to do is build a new model in which we set nu to 1.21:

rspde_model_fix <- rspde.matern(mesh = prmesh, nu = 1.21)

Let us now fit the model:

f.s.fix <- y ~ -1 + Intercept + f(seaDist, model = "rw1") +
  f(field, model = rspde_model_fix)

rspde_fix <- inla(f.s.fix,
  family = "Gamma", data = inla.stack.data(stk.dat),
  verbose = FALSE,
  control.inla = list(int.strategy = "eb"),
  control.predictor = list(A = inla.stack.A(stk.dat), compute = TRUE)
)

Here we have the summary:

summary(rspde_fix)
## Time used:
##     Pre = 0.229, Running = 2.76, Post = 0.0399, Total = 3.03 
## Fixed effects:
##           mean    sd 0.025quant 0.5quant 0.975quant mode kld
## Intercept 1.94 0.038      1.865     1.94      2.016 1.94   0
## 
## Random effects:
##   Name     Model
##     seaDist RW1 model
##    field CGeneric
## 
## Model hyperparameters:
##                                                   mean       sd 0.025quant
## Precision-parameter for the Gamma observations   14.71    1.049      12.74
## Precision for seaDist                          7630.55 4293.689    2356.70
## Theta1 for field                                 -2.07    0.349      -2.75
## Theta2 for field                                  1.72    0.253       1.22
##                                                0.5quant 0.975quant    mode
## Precision-parameter for the Gamma observations    14.68      16.87   14.63
## Precision for seaDist                           6652.74   18665.19 5051.19
## Theta1 for field                                  -2.07      -1.38   -2.08
## Theta2 for field                                   1.72       2.21    1.72
## 
## Marginal log-Likelihood:  -1255.05 
##  is computed 
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')

Now, the summary in the original scale:

result_fix <- rspde.result(rspde_fix, "field", rspde_model_fix)
summary(result_fix)
##           mean        sd 0.025quant 0.5quant 0.975quant     mode
## tau   0.134224 0.0480429  0.0641265 0.126155   0.250771 0.111428
## kappa 5.744190 1.4602400  3.3899300 5.571860   9.093650 5.245360

Estimating models with fixed smoothness and integer α\alpha

Since we are in dimension d=2d=2, and ν>0\nu>0, the smallest value of ν\nu that makes α=ν+1\alpha = \nu + 1 an integer is ν=1\nu=1. This value is also close to the estimated mean of the first model we fitted and enclosed by the posterior marginal density of ν\nu for the first fit. Therefore, let us fit the model with ν=1\nu=1.

To this end we need to compute a new AA matrix:

Abar.int <- rspde.make.A(
  mesh = prmesh, loc = coords,
  nu = 1
)

a new index:

mesh.index.int <- rspde.make.index(
  name = "field", mesh = prmesh,
  nu = 1
)

create a new model (remember to set nu=1):

rspde_model_fix_int1 <- rspde.matern(mesh = prmesh,
  nu = 1)

The remaining is standard:

stk.dat.int <- inla.stack(
  data = list(y = Y), A = list(Abar.int, 1), tag = "est",
  effects = list(
    c(
      mesh.index.int
    ),
    list(
      long = inla.group(coords[, 1]),
      lat = inla.group(coords[, 2]),
      seaDist = inla.group(seaDist),
      Intercept = 1
    )
  )
)

f.s.fix.int.1 <- y ~ -1 + Intercept + f(seaDist, model = "rw1") +
  f(field, model = rspde_model_fix_int1)

rspde_fix_int_1 <- inla(f.s.fix.int.1,
  family = "Gamma",
  data = inla.stack.data(stk.dat.int), verbose = FALSE,
  control.inla = list(int.strategy = "eb"),
  control.predictor = list(
    A = inla.stack.A(stk.dat.int),
    compute = TRUE
  )
)

Let us check the summary:

summary(rspde_fix_int_1)
## Time used:
##     Pre = 0.227, Running = 0.933, Post = 0.0289, Total = 1.19 
## Fixed effects:
##            mean   sd 0.025quant 0.5quant 0.975quant  mode kld
## Intercept 1.941 0.04      1.862    1.941      2.019 1.941   0
## 
## Random effects:
##   Name     Model
##     seaDist RW1 model
##    field CGeneric
## 
## Model hyperparameters:
##                                                   mean       sd 0.025quant
## Precision-parameter for the Gamma observations   14.68    1.052      12.72
## Precision for seaDist                          7772.49 4453.415    2364.19
## Theta1 for field                                 -1.52    0.327      -2.17
## Theta2 for field                                  1.60    0.283       1.05
##                                                0.5quant 0.975quant    mode
## Precision-parameter for the Gamma observations    14.64      16.86   14.56
## Precision for seaDist                           6743.36   19254.10 5077.48
## Theta1 for field                                  -1.51      -0.88   -1.50
## Theta2 for field                                   1.60       2.16    1.59
## 
## Marginal log-Likelihood:  -1254.80 
##  is computed 
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')

and check the summary in the user’s scale:

rspde_result_int <- rspde.result(rspde_fix_int_1, "field", rspde_model_fix_int1)
summary(rspde_result_int)
##           mean        sd 0.025quant 0.5quant 0.975quant    mode
## tau   0.231311 0.0765171   0.114935 0.220157   0.412696 0.19849
## kappa 5.155430 1.4825900   2.872120 4.942390   8.653630 4.53166

Changing the priors

We begin by recalling that the fitted rSPDE model in R-INLA contains the parameters Theta1\textrm{Theta1}, Theta2\textrm{Theta2} and Theta3\textrm{Theta3}. Let, again, θ1=Theta1\theta_1 = \textrm{Theta1}, θ2=Theta2\theta_2=\textrm{Theta2} and θ3=Theta3\theta_3=\textrm{Theta3}. In terms of the SPDE (κ2IΔ)α/2(τu)=𝒲,(\kappa^2 I - \Delta)^{\alpha/2}(\tau u) = \mathcal{W}, where α=ν+d/2\alpha = \nu + d/2.

We also have the range parameter ρ=8νκ\rho = \frac{\sqrt{8\nu}}{\kappa} and the standard deviation σ=Γ(ν)τ2κ2ν(4π)d/2Γ(ν+d/2)\sigma = \sqrt{\frac{\Gamma(\nu)}{\tau^2 \kappa^{2\nu}(4\pi)^{d/2}\Gamma(\nu + d/2)}}.

Changing the priors of τ\tau and κ\kappa

We begin by dealing with τ\tau and κ\kappa.

We have that τ=exp(θ1),κ=exp(θ2).\tau = \exp(\theta_1),\quad \kappa = \exp(\theta_2). The rspde.matern() function assumes a lognormal prior distribution for τ\tau and κ\kappa. This prior distribution is obtained by assuming that θ1\theta_1 and θ2\theta_2 follow normal distributions. By default we assume θ1\theta_1 and θ2\theta_2 to be independent and to follow normal distributions θ1N(log(τ0),10)\theta_1\sim N(\log(\tau_0), 10) and θ2N(log(κ0),10)\theta_2\sim N(\log(\kappa_0), 10).

κ0\kappa_0 is suitably defined in terms of the mesh and τ0\tau_0 is defined in terms of κ0\kappa_0 and on the prior of the smoothness parameter.

If one wants to define the prior θ1N(mean_theta_1,sd_theta_1),\theta_1 \sim N(\text{mean_theta_1}, \text{sd_theta_1}), one can simply set the argument prior.tau = list(meanlog=mean_theta_1, sdlog=sd_theta_1). Analogously, to define the prior θ2N(mean_theta_2,sd_theta_2),\theta_2 \sim N(\text{mean_theta_2}, \text{sd_theta_2}), one can set the argument prior.kappa = list(meanlog=mean_theta_2, sdlog=sd_theta_2).

It is important to mention that, by default, the initial values of τ\tau and κ\kappa are exp(mean_theta_1)\exp(\text{mean_theta_1}) and exp(mean_theta_2)\exp(\text{mean_theta_2}), respectively. So, if the user does not change these parameters, and also does not change the initial values, the initial values of τ\tau and κ\kappa will be, respectively, τ0\tau_0 and κ0\kappa_0.

If one sets prior.tau = list(meanlog=mean_theta_1), the prior for θ1\theta_1 will be θ1N(mean_theta_1,1),\theta_1 \sim N(\text{mean_theta_1}, 1), whereas, if one sets, prior.tau = list(sdlog=sd_theta_1), the prior will be θ1N(log(τ0),sd_theta_1).\theta_1 \sim N(\log(\tau_0), \text{sd_theta_1}). Analogously, if one sets prior.kappa = list(meanlog=mean_theta_2), the prior for θ2\theta_2 will be θ2N(mean_theta_2,1),\theta_2 \sim N(\text{mean_theta_2}, 1), whereas, if one sets, prior.kappa = list(sdlog=sd_theta_2), the prior will be θ2N(log(κ0),sd_theta_2).\theta_2 \sim N(\log(\kappa_0), \text{sd_theta_2}).

Changing the priors of ρ\rho (range) and σ\sigma (std. dev.)

Let us now consider the priors for the range, ρ\rho, and std. deviation, σ\sigma. This parameterization is used with the argument parameterization = "matern", which is the default.

In this case, we have that σ=exp(θ1),ρ=exp(θ2).\sigma = \exp(\theta_1),\quad \rho = \exp(\theta_2). We have two options for prior. By default, which is the option prior.theta.param = "theta", the rspde.matern() function assumes a lognormal prior distribution for σ\sigma and ρ\rho. This prior distribution is obtained by assuming that θ1\theta_1 and θ2\theta_2 follow normal distributions. By default we assume θ1\theta_1 and θ2\theta_2 to be independent and to follow normal distributions θ1N(log(σ0),10)\theta_1\sim N(\log(\sigma_0), 10) and θ2N(log(ρ0),10)\theta_2\sim N(\log(\rho_0), 10).

ρ0\rho_0 is suitably defined in terms of the mesh and σ0\sigma_0 is defined in terms of ρ0\rho_0 and on the prior of the smoothness parameter.

Similarly to the priors of τ\tau and κ\kappa, if one wants to define the prior θ1N(mean_theta_1,sd_theta_1),\theta_1 \sim N(\text{mean_theta_1}, \text{sd_theta_1}), one can simply set the argument prior.tau = list(meanlog=mean_theta_1, sdlog=sd_theta_1). Analogously, to define the prior θ2N(mean_theta_2,sd_theta_2),\theta_2 \sim N(\text{mean_theta_2}, \text{sd_theta_2}), one can set the argument prior.kappa = list(meanlog=mean_theta_2, sdlog=sd_theta_2).

Another option is to set prior.theta.param = "spde". In this case, a change of variables is performed. So, we assume a lognormal prior on τ\tau and κ\kappa. Then, by the relations ρ=8νκ\rho = \frac{\sqrt{8\nu}}{\kappa} and σ=Γ(ν)τ2κ2ν(4π)d/2Γ(ν+d/2)\sigma = \sqrt{\frac{\Gamma(\nu)}{\tau^2 \kappa^{2\nu}(4\pi)^{d/2}\Gamma(\nu + d/2)}}, we obtain a prior for ρ\rho and σ\sigma.

Changing the prior of ν\nu

Finally, let us consider the smoothness parameter ν\nu.

By default, we assume that ν\nu follows a beta distribution on the interval (0,νUB)(0,\nu_{UB}), where νUB\nu_{UB} is the upper bound for ν\nu, with mean ν0=min{1,νUB/2}\nu_0=\min\{1, \nu_{UB}/2\} and variance ν0(νUBν0)1+ϕ0\frac{\nu_0(\nu_{UB}-\nu_0)}{1+\phi_0}, and we call ϕ0\phi_0 the precision parameter, whose default value is ϕ0=max{νUBν0,νUBνUBν0}+ϕinc.\phi_0 = \max\Big\{\frac{\nu_{UB}}{\nu_0}, \frac{\nu_{UB}}{\nu_{UB}-\nu_0}\Big\} + \phi_{inc}. The parameter ϕinc\phi_{inc} is an increment to ensure that the prior beta density has boundary values equal to zero (where the boundary values are defined either by continuity or by limits). The default value of ϕinc\phi_{inc} is 1. The value of ϕinc\phi_{inc} can be changed by changing the argument nu.prec.inc in the rspde.matern() function. The higher the value of ϕinc\phi_{inc} (that is, the value of nu.prec.inc) the more informative the prior distribution becomes.

Let us denote a beta distribution with support on (0,νUB)(0,\nu_{UB}), mean μ\mu and precision parameter ϕ\phi by νUB(μ,ϕ)\mathcal{B}_{\nu_{UB}}(\mu,\phi).

If we want ν\nu to have a prior ννUB(nu_1,prec_1),\nu \sim \mathcal{B}_{\nu_{UB}}(\text{nu_1},\text{prec_1}), one simply needs to set prior.nu = list(mean=nu_1, prec=prec_1). If one sets prior.nu = list(mean=nu_1), then ν\nu will have prior ννUB(nu_1,ϕ1),\nu \sim \mathcal{B}_{\nu_{UB}}(\text{nu_1},\phi_1), where ϕ1=max{νUBnu_1,νUBνUBnu_1}+nu.prec.inc.\phi_1 = \max\Big\{\frac{\nu_{UB}}{\text{nu_1}}, \frac{\nu_{UB}}{\nu_{UB}-\text{nu_1}}\Big\} + \text{nu.prec.inc}.

Of one sets prior.nu = list(prec=prec_1), then ν\nu will have prior ννUB(ν0,prec_1).\nu\sim \mathcal{B}_{\nu_{UB}}(\nu_0, \text{prec_1}). It is also noteworthy that we have that, in terms of R-INLA’s parameters,

ν=νUB(exp(θ3)1+exp(θ3)).\nu = \nu_{UB}\Big(\frac{\exp(\theta_3)}{1+\exp(\theta_3)}\Big).

It is important to mention that, by default, if a beta prior distribution is chosen for the smoothness parameter ν\nu, then the initial value of ν\nu is the mean of the prior beta distribution. So, if the user does not change this parameter, and also does not change the initial value, the initial value of ν\nu will be min{1,νUB/2}\min\{1,\nu_{UB}/2\}.

We also assume that, in terms of R-INLA’s parameters, ν=νUB(exp(θ3)1+exp(θ3)).\nu = \nu_{UB}\Big(\frac{\exp(\theta_3)}{1+\exp(\theta_3)}\Big).

We can have another possibility of prior distribution for ν\nu, namely, truncated lognormal distribution. The truncated lognormal distribution is defined in the following sense. We assume that log(ν)\log(\nu) has prior distribution given by a truncated normal distribution with support (,log(νUB))(-\infty,\log(\nu_{UB})), where νUB\nu_{UB} is the upper bound for ν\nu, with location parameter μ0=log(ν0)=log(min{1,νUB/2})\mu_0 =\log(\nu_0)= \log\Big(\min\{1,\nu_{UB}/2\}\Big) and scale parameter σ0=1\sigma_0 = 1. More precisely, let Φ(;μ,σ)\Phi(\cdot; \mu,\sigma) stand for the cumulative distribution function (CDF) of a normal distribution with mean μ\mu and standard deviation σ\sigma. Then, log(ν)\log(\nu) has cumulative distribution function given by Flog(ν)(x)=Φ(x;μ0,σ0)Φ(νUB),xνUB,F_{\log(\nu)}(x) = \frac{\Phi(x;\mu_0,\sigma_0)}{\Phi(\nu_{UB})},\quad x\leq \nu_{UB}, and Flog(ν)(x)=1F_{\log(\nu)}(x) = 1 if x>νUBx>\nu_{UB}. We will call μ0\mu_0 and σ0\sigma_0 the log-location and log-scale parameters of ν\nu, respectively, and we say that log(ν)\log(\nu) follows a truncated normal distribution with location parameter μ0\mu_0 and scale parameter σ0\sigma_0.

To change the prior distribution of ν\nu to the truncated lognormal distribution, we need to set the argument prior.nu.dist="lognormal".

To change these parameters in the prior distribution to, say, log_nu_1 and log_sigma_1, one can simply set prior.nu = list(loglocation=log_nu_1, logscale=sigma_1).

If one sets prior.nu = list(loglocation=log_nu_1), the prior for θ3\theta_3 will be a truncated normal normal distribution with location parameter log_nu_1 and scale parameter 1. Analogously, if one sets, prior.nu = list(logscale=sigma_1), the prior for θ3\theta_3 will be a truncated normal distribution with location parameter log(ν0)=log(min{1,νUB/2})\log(\nu_0)= \log\Big(\min\{1,\nu_{UB}/2\}\Big) and scale parameter sigma_1.

It is important to mention that, by default, if a truncated lognormal prior distribution is chosen for the smoothness parameter ν\nu, then the initial value of ν\nu is the exponential of the log-location parameter of ν\nu. So, if the user does not change this parameter, and also does not change the initial value, the initial value of ν\nu will be min{1,νUB/2}\min\{1,\nu_{UB}/2\}.

Let us consider an example with the same dataset used in the first model of this vignette where we change the prior distribution of ν\nu from beta to lognormal.

rspde_model_beta <- rspde.matern(mesh = prmesh, prior.nu.dist = "lognormal")

Since we did not change rspde.order and are not fixing ν\nu, we can use the same AA matrix and same index from the first example.

Therefore, all we have to do is update the formula and fit the model:

f.s.beta <- y ~ -1 + Intercept + f(seaDist, model = "rw1") +
  f(field, model = rspde_model_beta)

rspde_fit_beta <- inla(f.s.beta,
  family = "Gamma", data = inla.stack.data(stk.dat),
  verbose = FALSE,
  control.inla = list(int.strategy = "eb"),
  control.predictor = list(A = inla.stack.A(stk.dat), compute = TRUE)
)

We have the summary:

summary(rspde_fit_beta)
## Time used:
##     Pre = 0.227, Running = 3.07, Post = 0.0411, Total = 3.33 
## Fixed effects:
##            mean    sd 0.025quant 0.5quant 0.975quant  mode kld
## Intercept 1.941 0.041       1.86    1.941      2.021 1.941   0
## 
## Random effects:
##   Name     Model
##     seaDist RW1 model
##    field CGeneric
## 
## Model hyperparameters:
##                                                    mean       sd 0.025quant
## Precision-parameter for the Gamma observations   14.650    1.051     12.699
## Precision for seaDist                          7773.841 4462.306   2230.778
## Theta1 for field                                 -2.038    1.321     -4.983
## Theta2 for field                                  1.661    0.376      0.998
## Theta3 for field                                  0.438    1.092     -1.361
##                                                0.5quant 0.975quant     mode
## Precision-parameter for the Gamma observations    14.61   1.68e+01   14.522
## Precision for seaDist                           6771.16   1.92e+04 5058.205
## Theta1 for field                                  -1.92   1.22e-01   -1.305
## Theta2 for field                                   1.64   2.47e+00    1.530
## Theta3 for field                                   0.34   2.87e+00   -0.149
## 
## Marginal log-Likelihood:  -1254.48 
##  is computed 
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')

Also, we can have the summary in the user’s scale:

result_fit_beta <- rspde.result(rspde_fit_beta, "field", rspde_model_beta)
summary(result_fit_beta)
##          mean       sd 0.025quant 0.5quant 0.975quant      mode
## tau   0.25946 0.314596 0.00708378 0.152783    1.12132 0.0121665
## kappa 5.65898 2.333380 2.72519000 5.109360   11.68670 4.2014200
## nu    1.16049 0.422376 0.41046100 1.153730    1.88933 0.8821500

and the plot of the posterior marginal densities

posterior_df_fit_beta <- gg_df(result_fit_beta)

ggplot(posterior_df_fit_beta) + geom_line(aes(x = x, y = y)) + 
facet_wrap(~parameter, scales = "free") + labs(y = "Density")

Changing the starting values

The starting values to be used by R-INLA’s optimization algorithm can be changed by setting the arguments start.ltau, start.lkappa and start.nu.

  • start.ltau will be the initial value for log(τ)\log(\tau), that is, the logarithm of τ\tau.

  • start.lkappa will be the inital value for log(κ)\log(\kappa), that is, the logarithm of κ\kappa.

  • start.nu will be the initial value for ν\nu. Notice that here the initial value is not on the log scale.

One can change the initial value of one or more parameters.

For instance, let us consider the example with precipitation data, rspde.order=3, but change the initial values to the ones close to the fitted value when considering the default rspde.order (which is 1):

rspde_model_order_3_start <- rspde.matern(mesh = prmesh, rspde.order = 3,
  nu.upper.bound = 2,
  start.lkappa = result_fit$summary.log.kappa$mean,
  start.ltau = result_fit$summary.log.tau$mean,
  start.nu = min(result_fit$summary.nu$mean, 2 - 1e-5)
)

Since we already computed the AA matrix and the index for rspde.order=3, all we have to do is to update the formula and fit:

f.s.3.start <- y ~ -1 + Intercept + f(seaDist, model = "rw1") +
  f(field, model = rspde_model_order_3_start)

rspde_fit_order_3_start <- inla(f.s.3.start,
  family = "Gamma",
  data = inla.stack.data(stk.dat.3),
  verbose = FALSE,
  control.inla = list(int.strategy = "eb"),
  control.predictor = list(
    A = inla.stack.A(stk.dat.3),
    compute = TRUE
  )
)

We have the summary:

summary(rspde_fit_order_3_start)
## Time used:
##     Pre = 0.232, Running = 8.37, Post = 0.063, Total = 8.66 
## Fixed effects:
##            mean    sd 0.025quant 0.5quant 0.975quant  mode kld
## Intercept 1.941 0.041      1.861    1.941      2.021 1.941   0
## 
## Random effects:
##   Name     Model
##     seaDist RW1 model
##    field CGeneric
## 
## Model hyperparameters:
##                                                    mean       sd 0.025quant
## Precision-parameter for the Gamma observations   14.682    1.050      12.72
## Precision for seaDist                          7819.817 4449.881    2542.83
## Theta1 for field                                 -1.995    1.798      -5.97
## Theta2 for field                                  1.706    0.529       0.79
## Theta3 for field                                  0.409    1.520      -2.13
##                                                0.5quant 0.975quant     mode
## Precision-parameter for the Gamma observations   14.646   1.69e+01   14.579
## Precision for seaDist                          6762.781   1.94e+04 5125.670
## Theta1 for field                                 -1.842   9.99e-01   -1.084
## Theta2 for field                                  1.672   2.85e+00    1.496
## Theta3 for field                                  0.282   3.77e+00   -0.349
## 
## Marginal log-Likelihood:  -1253.88 
##  is computed 
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')

Changing the type of the rational approximation

We have three rational approximations available. The BRASIL algorithm Hofreither (2021), and two “versions” of the Clenshaw-Lord Chebyshev-Pade algorithm, one with lower bound zero and another with the lower bound given in Bolin, Simas, and Xiong (2023).

The type of rational approximation can be chosen by setting the type.rational.approx argument in the rspde.matern function. The BRASIL algorithm corresponds to the choice brasil, the Clenshaw-Lord Chebyshev pade with zero lower bound and non-zero lower bounds are given, respectively, by the choices chebfun and chebfunLB.

Let us fit a model assigning a brasil rational approximation. We will consider a model with the order of the rational approximation being 1:

rspde_model_brasil <- rspde.matern(prmesh, 
              type.rational.approx = "brasil")

f.s.brasil <- y ~ -1 + Intercept + f(seaDist, model = "rw1") +
  f(field, model = rspde_model_brasil)

rspde_fit_order_1_brasil <- inla(f.s.brasil,
  family = "Gamma", data = inla.stack.data(stk.dat),
  verbose = FALSE,
  control.inla = list(int.strategy = "eb"),
  control.predictor = list(A = inla.stack.A(stk.dat), compute = TRUE)
)

Let us get the summary:

summary(rspde_fit_order_1_brasil)
## Time used:
##     Pre = 0.24, Running = 3.99, Post = 0.0414, Total = 4.27 
## Fixed effects:
##            mean   sd 0.025quant 0.5quant 0.975quant  mode kld
## Intercept 1.941 0.04      1.862    1.941       2.02 1.941   0
## 
## Random effects:
##   Name     Model
##     seaDist RW1 model
##    field CGeneric
## 
## Model hyperparameters:
##                                                   mean       sd 0.025quant
## Precision-parameter for the Gamma observations   14.67    1.050      12.71
## Precision for seaDist                          7770.97 4545.325    2437.84
## Theta1 for field                                 -3.38    2.236      -8.58
## Theta2 for field                                  1.84    0.422       1.16
## Theta3 for field                                  1.51    1.828      -1.17
##                                                0.5quant 0.975quant     mode
## Precision-parameter for the Gamma observations    14.64     16.842   14.575
## Precision for seaDist                           6678.16  19625.221 4999.180
## Theta1 for field                                  -3.07     -0.109   -1.527
## Theta2 for field                                   1.80      2.788    1.584
## Theta3 for field                                   1.27      5.761    0.014
## 
## Marginal log-Likelihood:  -1254.03 
##  is computed 
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')

Finally, similarly to the order of the rational approximation, one can check the order with the rational.type() function, and assign a new type with the rational.type<-() function.

rational.type(rspde_model)
## [1] "chebfun"
rational.type(rspde_model_brasil)
## [1] "brasil"

Let us change the type of the rational approximation on the model with rational approximation of order 3:

rational.type(rspde_model_order_3) <- "brasil"

f.s.3 <- y ~ -1 + Intercept + f(seaDist, model = "rw1") +
  f(field, model = rspde_model_order_3)

rspde_fit_order_3_brasil <- inla(f.s.3,
  family = "Gamma", data = inla.stack.data(stk.dat.3),
  verbose = FALSE,
  control.inla = list(int.strategy = "eb"),
  control.predictor = list(A = inla.stack.A(stk.dat.3), compute = TRUE)
)

Let us get the summary:

summary(rspde_fit_order_3_brasil)
## Time used:
##     Pre = 0.231, Running = 14.7, Post = 0.0844, Total = 15 
## Fixed effects:
##            mean   sd 0.025quant 0.5quant 0.975quant  mode kld
## Intercept 1.941 0.04      1.862    1.941      2.019 1.941   0
## 
## Random effects:
##   Name     Model
##     seaDist RW1 model
##    field CGeneric
## 
## Model hyperparameters:
##                                                   mean       sd 0.025quant
## Precision-parameter for the Gamma observations   14.70    1.051      12.73
## Precision for seaDist                          7756.09 4530.592    2395.77
## Theta1 for field                                 -3.17    2.005      -7.80
## Theta2 for field                                  1.88    0.447       1.16
## Theta3 for field                                  1.26    1.556      -1.08
##                                                0.5quant 0.975quant     mode
## Precision-parameter for the Gamma observations    14.66     16.864   14.594
## Precision for seaDist                           6676.30  19538.959 4995.841
## Theta1 for field                                  -2.90     -0.172   -1.570
## Theta2 for field                                   1.83      2.882    1.608
## Theta3 for field                                   1.05      4.848    0.038
## 
## Marginal log-Likelihood:  -1254.12 
##  is computed 
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')

References

Bolin, David, Alexandre B. Simas, and Zhen Xiong. 2023. “Covariance-Based Rational Approximations of Fractional SPDEs for Computationally Efficient Bayesian Inference.” Journal of Computational and Graphical Statistics.
Hofreither, Clemens. 2021. “An Algorithm for Best Rational Approximation Based on Barycentric Rational Interpolation.” Numerical Algorithms 88 (1): 365–88.
Lindgren, Finn, Håvard Rue, and Johan Lindström. 2011. “An Explicit Link Between Gaussian Fields and Gaussian Markov Random Fields: The Stochastic Partial Differential Equation Approach.” Journal of the Royal Statistical Society. Series B. Statistical Methodology 73 (4): 423–98.