Definitions and computational methodology
David Bolin and Finn Lindgren
Source:vignettes/theory.Rmd
Hierarchical models are of great importance in many areas of statistics. In its simplest form, a hierarchical model has a likelihood distribution $\pi(y \mid x, \theta)$ for observed data $y$, which is specified conditionally on a latent process of interest, $x$, which has a distribution $\pi(x \mid \theta)$. For Bayesian hierarchical models, one also specifies prior distributions for the model parameters $\theta$. The most important special case of these models is the class of LGMs, which are obtained by assuming that $x$ has a Gaussian distribution (Rue, Martino, and Chopin 2009). Numerous applications can be studied using models of this form, and these are therefore the main focus of the methods in excursions.
A statistical analysis using an LGM often concludes with reporting the posterior mean $\mathrm{E}(x \mid y)$ as a point estimate of the latent field, possibly together with posterior variances as a measure of uncertainty. In many applications, however, reporting posterior means and variances is not enough. As stated in the introduction, one may be interested in computing regions where the latent field exceeds some given threshold, contour curves with their associated uncertainty, or simultaneous confidence bands. In some applications, only a contour map of the posterior mean is reported, where the number of levels in the contour map should represent the uncertainty in the estimate. These are quantities that can be computed with excursions, and we now define them in more detail before outlining how they can be computed. For further details we refer to Bolin and Lindgren (2015) and Bolin and Lindgren (2017).
Definitions
The main quantities that can be computed using
excursions
are (1) excursion sets, (2) contour credible
regions and level avoiding sets, (3) excursion functions, (4) contour
maps and their quality measures, and (5) simultaneous confidence bands. This
section defines these in more detail.
Throughout the section, $x(s)$ will denote a stochastic process defined on some domain of interest, $\Omega$, which we assume is open with a well-defined area $|\Omega|$. Since it is not necessary for the definitions, we will not explicitly state the dependency on the data, but simply allow $x$ to have some possibly non-stationary distribution. In practice, however, the distribution of $x$ will typically be a posterior distribution conditionally on data, $x \mid y$. For frequentist models, the distribution of $x$ could also be conditional on, for example, a maximum likelihood estimate of the parameters, $x \mid \hat{\theta}$.
Excursion sets
An excursion set is a set where the process exceeds or goes below some given level of interest, $u$. A set where $x(s) > u$ is referred to as a positive excursion set, whereas a set where $x(s) < u$ is referred to as a negative excursion set. If $f(s)$ is a known function, these sets can be computed directly as $A_u^+(f) = \{s \in \Omega : f(s) > u\}$ and $A_u^-(f) = \{s \in \Omega : f(s) < u\}$, respectively. If $x(s)$ is a latent random process, one can only provide a region where it with some (high) probability exceeds the level. More specifically, the positive level $u$ excursion set with probability $1-\alpha$, $E_{u,\alpha}^+(x)$, is defined as the largest set so that with probability $1-\alpha$ the level is exceeded at all locations in the set,
$$E_{u,\alpha}^+(x) = \operatorname*{arg\,max}_{D} \left\{ |D| : P\left( D \subseteq A_u^+(x) \right) \geq 1-\alpha \right\}.$$
Similarly, the negative level $u$ excursion set with probability $1-\alpha$, $E_{u,\alpha}^-(x)$, is defined as the largest set so that with probability $1-\alpha$ the process is below the level at all locations in the set. This set is obtained by replacing $A_u^+(x)$ with $A_u^-(x)$ in the equation above.
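To make the definition concrete, here is a minimal Monte Carlo sketch (in Python, purely for illustration; the excursions package is implemented in R and uses the different, much faster methods described later in this vignette). It estimates $E_{u,\alpha}^+$ for an assumed toy Gaussian vector by growing a candidate set in order of decreasing marginal exceedance probability and stopping once the joint probability drops below $1-\alpha$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy discretized latent field (an assumed example, not from the package):
# Gaussian vector with covariance 0.2 * 0.5^|i-j| and increasing means.
n, u, alpha = 10, 0.0, 0.05
mu = np.linspace(-1.0, 1.0, n)
idx = np.arange(n)
L = np.linalg.cholesky(0.2 * 0.5 ** np.abs(np.subtract.outer(idx, idx)))
samples = mu + rng.standard_normal((100_000, n)) @ L.T

# Visit locations in order of decreasing marginal probability P(x_i > u).
order = np.argsort(-(samples > u).mean(axis=0))

# Grow the set while the joint exceedance probability stays >= 1 - alpha;
# the result is a Monte Carlo estimate of E^+_{u,alpha}.
E_plus, inside = [], np.ones(samples.shape[0], dtype=bool)
for i in order:
    inside_new = inside & (samples[:, i] > u)
    if inside_new.mean() < 1 - alpha:
        break
    E_plus, inside = E_plus + [int(i)], inside_new
```

The nested, one-parameter family of candidate sets used here mirrors the sequential construction that underlies the computational methods described later in the vignette.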
Contour credible regions and level avoiding sets
For a function $f$, a contour curve (or set in general) of a level $u$ is defined as the set of all level $u$ crossings. Formally, the level curve is defined as
$$A_u^c(f) = \left( A_u^+(f)^\circ \cup A_u^-(f)^\circ \right)^c,$$
where $D^\circ$ denotes the interior of the set $D$ and $D^c$ denotes the complement. Note that $A_u^c(f)$ not only includes the set of locations where $f(s) = u$, but also all discontinuous level crossings.
For a latent random process $x$, one can only provide a credible region for the contour curve. A level $u$ contour credibility region, $E_{u,\alpha}^c$, is defined as the smallest set such that with probability $1-\alpha$ all level $u$ crossings of $x$ are in the set. This set can be seen as the complement of the level $u$ contour avoiding set $M_{u,\alpha}^c$, which is defined as the largest union $M_{u,\alpha}^+ \cup M_{u,\alpha}^-$, where jointly $x > u$ in $M_{u,\alpha}^+$ and $x < u$ in $M_{u,\alpha}^-$. Formally,
$$M_{u,\alpha}^c = \operatorname*{arg\,max}_{(D^+,\, D^-)} \left\{ |D^+ \cup D^-| : P\left( D^+ \subseteq A_u^+(x),\; D^- \subseteq A_u^-(x) \right) \geq 1-\alpha \right\},$$
where the sets $D^+$ and $D^-$ are open. The sets $M_{u,\alpha}^+$ and $M_{u,\alpha}^-$ are denoted as the pair of level $u$ avoiding sets. The notion of level avoiding sets can naturally be extended to multiple levels $u_1 < u_2 < \cdots < u_K$, which is needed when studying contour maps. In this case, the multilevel contour avoiding set is denoted $M_{\alpha}^c$ (for a formal definition, see Bolin and Lindgren (2017)).
Excursion functions
Bolin and Lindgren (2015) introduced excursion functions as a tool for visualizing excursion sets simultaneously for all values of $\alpha$. For a level $u$, the positive and negative excursion functions are defined as
$$F_u^+(s) = \sup\{1-\alpha : s \in E_{u,\alpha}^+\} \quad \text{and} \quad F_u^-(s) = \sup\{1-\alpha : s \in E_{u,\alpha}^-\},$$
respectively. Similarly, the contour avoidance function and the contour function are defined as
$$F_u^c(s) = \sup\{1-\alpha : s \in M_{u,\alpha}^+ \cup M_{u,\alpha}^-\} \quad \text{and} \quad F_u(s) = \sup\{1-\alpha : s \notin E_{u,\alpha}^c\},$$
respectively. Finally, for levels $u_1 < u_2 < \cdots < u_K$, one can define a contour map function as
$$P(s) = \sup\{1-\alpha : s \in M_{\alpha}^c\}.$$
These functions take values between zero and one, and each set can be retrieved as a level $1-\alpha$ excursion set of the corresponding function; for example, $E_{u,\alpha}^+ = \{s : F_u^+(s) \geq 1-\alpha\}$.
Contour maps and their quality measures
For a function $f$, a contour map with contour levels $u_1 < u_2 < \cdots < u_K$ is defined as the collection of contour curves $A_{u_k}^c(f)$ and associated level sets $G_k = \{s : u_k < f(s) < u_{k+1}\}$, for $k = 0, \ldots, K$, where one defines $u_0 = -\infty$ and $u_{K+1} = \infty$. In practice, a process is often visualized using a contour map of the posterior mean $\mathrm{E}(x \mid y)$. The contour map is visualized either by just drawing the contour curves labeled by their values, or by also visualizing each level set in a specific color. The color for a set $G_k$ is typically chosen as the color that corresponds to the midpoint of the set's level interval in a given color map.
In order to choose an appropriate number of contours, one must be able to quantify the uncertainty of contour maps. The uncertainty can be represented using a contour map quality measure $P$, which is a function that takes values in $[0, 1]$. Here, $P$ should be chosen in such a way that $P \approx 1$ indicates that the contour map, in some sense, is appropriate as a description of the distribution of the random field, whereas $P \approx 0$ should indicate that the contour map is inappropriate.
An example of a contour map quality measure is the normalized integral of the contour map function,
$$P_0 = \frac{1}{|\Omega|} \int_\Omega P(s) \, ds.$$
The most useful quality measure is denoted $P_2$ and is defined as the simultaneous probability for the level crossings of all levels $u_k$ falling within their respective level sets $G_k$.
An intuitively interpretable approach for choosing the number of contours in a contour map is to find the largest $K$ such that $P_2$ is above some threshold. For a joint credibility of $1-\alpha$, say, choose the largest number of contours such that $P_2 \geq 1-\alpha$. How this can be done using excursions is illustrated in a later section.
Simultaneous confidence bands
Especially for time series applications, the uncertainty in the latent process is often visualized using pointwise confidence bands. A pointwise confidence interval for $x(s)$ at some location $s$ is given by $[q_{\alpha/2}(s), q_{1-\alpha/2}(s)]$, where $q_{\rho}(s)$ denotes the $\rho$-quantile in the marginal distribution of $x(s)$.
A problem with using pointwise confidence bands is that they have no joint interpretation, and one is therefore often interested in computing simultaneous confidence bands. For a process $x(s)$, we define a simultaneous confidence band as the region $\{(s, y) : q_{\rho}(s) \leq y \leq q_{1-\rho}(s)\}$. Here $\rho$ is chosen such that $P(q_{\rho}(s) \leq x(s) \leq q_{1-\rho}(s), \; \forall s \in \Omega) = 1-\alpha$. Thus $1-\alpha$ is the probability that the process is inside the confidence band at all locations in $\Omega$.
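As an illustration of how such a $\rho$ can be determined, the following Python sketch (an illustration under the assumption that posterior samples of the process are available; not the package's implementation) finds $\rho$ by bisection so that the empirical joint coverage of the pointwise $[\rho, 1-\rho]$ quantile band is at least $1-\alpha$:

```python
import numpy as np

rng = np.random.default_rng(2)

# Assumed toy process: samples of a smooth Gaussian path stand in for
# posterior samples of x(s); the bisection logic is the point of the sketch.
n, alpha = 50, 0.05
t = np.arange(n)
cov = np.exp(-np.abs(np.subtract.outer(t, t)) / 10.0)
samples = rng.standard_normal((20_000, n)) @ np.linalg.cholesky(cov).T

def coverage(rho):
    """Fraction of sample paths lying entirely inside the pointwise
    [rho, 1 - rho] quantile band."""
    lo = np.quantile(samples, rho, axis=0)
    hi = np.quantile(samples, 1.0 - rho, axis=0)
    return np.mean(np.all((samples >= lo) & (samples <= hi), axis=1))

# Bisect on rho: shrinking rho widens the band and raises the joint
# coverage, so keep the invariant coverage(a) >= 1 - alpha.
a, b = 1e-4, alpha / 2
for _ in range(40):
    mid = 0.5 * (a + b)
    if coverage(mid) >= 1 - alpha:
        a = mid
    else:
        b = mid
rho = a
band = np.quantile(samples, [rho, 1.0 - rho], axis=0)  # simultaneous band
```

The resulting $\rho$ is smaller than $\alpha/2$, i.e., the simultaneous band is wider than the pointwise one, which is exactly the price of the joint coverage guarantee.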
Computational methods
If the latent process is defined on a continuous domain $\Omega$, one has to use a discretization, $x = (x_1, \ldots, x_n)^\top$, of the process in the statistical analysis. The vector $x$ may be the process evaluated at some locations of interest or, more generally, the weights in a basis expansion $x(s) = \sum_{i=1}^n x_i \varphi_i(s)$. Computations in the package are in a first stage performed using the distribution of $x$. If $\Omega$ is continuous, computations in a second stage interpolate the results for $x$ to $\Omega$. In this section, we briefly outline the computational methods used in these two stages. As most quantities of interest can be obtained using some excursion function, we focus on how these are computed. As a representative example, we outline how $F_u^+(s)$ is computed in the following sections.
As before, let $y$ and $\theta$ be vectors containing, respectively, the observations and the model parameters. Computing an excursion function requires computing integrals of the posterior distribution for $x$. To save computation time, it is assumed that $F_u^+(s_i) \leq F_u^+(s_j)$ if $P(x_i > u) \leq P(x_j > u)$. This means that $F_u^+$ can be obtained by first reordering the nodes and then computing a sequential integral. The reordering is in this case obtained by sorting the marginal probabilities $P(x_i > u)$ (for other options, see Bolin and Lindgren (2015)). After reordering, the $i$:th element of $F_u^+$ is obtained as the integral
$$F_i^+ = P(x_1 > u, \ldots, x_i > u) = \int_u^\infty \cdots \int_u^\infty \pi(x_1, \ldots, x_i \mid y) \, dx_1 \cdots dx_i.$$
Using sequential importance sampling as described below, the whole sequence of integrals $F_1^+, \ldots, F_n^+$ can be obtained at the same cost as computing only one integral with $i = n$, making the computation of $F_u^+$ feasible also for large problems.
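The reordering scheme can be illustrated with a small Monte Carlo example in Python (the toy model is an assumption made for illustration; the package instead evaluates the sequence with the importance sampler described below): nodes are sorted by marginal exceedance probability, and the running joint exceedance probabilities then give the values of $F_u^+$ at the nodes.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy example (assumed model): Monte Carlo samples of a small Gaussian
# vector stand in for the posterior of the discretized field.
n, u = 6, 0.0
mu = np.linspace(-0.5, 1.5, n)
idx = np.arange(n)
cov = 0.3 * 0.6 ** np.abs(np.subtract.outer(idx, idx))
x = mu + rng.standard_normal((200_000, n)) @ np.linalg.cholesky(cov).T

# Step 1: reorder the nodes by decreasing marginal probability P(x_i > u).
order = np.argsort(-(x > u).mean(axis=0))

# Step 2: the i:th sequential integral is the joint probability that the
# first i reordered nodes all exceed u; logical_and.accumulate gives the
# whole sequence in one pass.
exceed = x[:, order] > u
joint = np.logical_and.accumulate(exceed, axis=1).mean(axis=0)

# Step 3: map the sequence back to the original node order, which yields
# the (Monte Carlo estimate of the) excursion function F_u^+ at each node.
F = np.empty(n)
F[order] = joint
```

By construction the sequence `joint` is nonincreasing, which is what makes the monotonicity assumption on $F_u^+$ consistent with the reordering.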
Gaussian integrals
The basis for the computational methods in the package is the ability to compute the required integral when the posterior distribution is Gaussian. In this case, one needs to compute an integral of the form
$$\int_a^b \pi(x \mid y) \, dx = \frac{|Q|^{1/2}}{(2\pi)^{n/2}} \int_a^b \exp\left( -\frac{1}{2} (x - \mu)^\top Q (x - \mu) \right) dx.$$
Here $\mu$ and $Q$ are the posterior mean and posterior precision matrix, respectively. To take advantage of the possible sparsity of $Q$ if a Markovian model is used, the integral is rewritten as
$$\int_a^b \pi(x \mid y) \, dx = \int_a^b \pi(x_n \mid y) \, \pi(x_{n-1} \mid x_n, y) \cdots \pi(x_1 \mid x_2, \ldots, x_n, y) \, dx,$$
where, if the problem has a Markov structure, $\pi(x_i \mid x_{i+1}, \ldots, x_n, y)$ only depends on a few of the elements in $(x_{i+1}, \ldots, x_n)$, given by the Markov structure. The integral is calculated using a sequential importance sampler, starting with the integral of the last component and then moving backward in the indices; see Bolin and Lindgren (2015) for further details.
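As a sketch of the idea, the following Python function implements a GHK-style sequential importance sampler for Gaussian orthant probabilities. This is a standard technique, written here with a dense covariance Cholesky factor rather than the sparse precision matrix the package exploits, so it is an illustration of the principle rather than the package's algorithm. Note how the whole sequence of joint probabilities falls out of a single sweep:

```python
import numpy as np
from scipy.stats import norm

def ghk_orthant(mu, Sigma, lower, n_samples=20_000, seed=0):
    """GHK-style sequential importance sampler for P(x_i > lower_i, all i),
    where x ~ N(mu, Sigma). Returns the whole sequence of probabilities
    P(x_1 > l_1), P(x_1 > l_1, x_2 > l_2), ..., from one sweep."""
    rng = np.random.default_rng(seed)
    n = len(mu)
    L = np.linalg.cholesky(Sigma)          # x = mu + L z with z ~ N(0, I)
    z = np.zeros((n_samples, n))
    w = np.ones(n_samples)                 # running importance weights
    probs = np.empty(n)
    for i in range(n):
        # Conditional mean of x_i given the already-sampled z_1, ..., z_{i-1}.
        cond = mu[i] + z[:, :i] @ L[i, :i]
        # x_i > lower_i translates to z_i > a for the standard normal z_i.
        a = (lower[i] - cond) / L[i, i]
        p = norm.sf(a)                     # P(z_i > a | past)
        w *= p
        probs[i] = w.mean()                # estimate of the i:th joint integral
        # Sample z_i from N(0, 1) truncated to (a, inf) via the inverse CDF.
        uu = rng.uniform(size=n_samples)
        z[:, i] = norm.isf(np.clip(uu * p, 1e-300, None))
    return probs
```

For independent components with zero means and lower limits at zero, the returned sequence is exactly $1/2, 1/4, 1/8, \ldots$, which gives a quick sanity check of the sampler.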
Handling non-Gaussian data
Using the sequential importance sampler above, $F_u^+$ can be computed for Gaussian models with known parameters. For more complicated models, the latent Gaussian structure has to be handled, and this can be done in different ways depending on the accuracy that is needed. excursions currently supports the following five methods, described in detail in Bolin and Lindgren (2015): Empirical Bayes (EB), Quantile Correction (QC), Numerical Integration (NI), Numerical Integration with Quantile Corrections (NIQC), and improved Numerical Integration with Quantile Corrections (iNIQC).
The EB method is the simplest and is based on using a Gaussian approximation of the posterior, $\pi(x \mid y) \approx \pi_G(x \mid y, \hat{\theta})$. The QC method uses the same Gaussian approximation but modifies the limits in the integral to improve the accuracy. The three other methods are intended for Bayesian models, where the posterior is obtained by integrating over the parameters,
$$\pi(x \mid y) = \int \pi(x \mid y, \theta) \, \pi(\theta \mid y) \, d\theta.$$
The NI method approximates the integration with respect to the parameters as in the INLA method, using a sum of representative parameter configurations, and the NIQC and iNIQC methods combine this with the QC method to improve the accuracy further.
In general, EB and QC are suitable for frequentist models and for Bayesian models where the posterior distribution conditionally on the parameters is approximately Gaussian. The methods are equivalent if the posterior is Gaussian, and in other cases QC is more accurate than EB. For Bayesian models, the NI method is, in general, more accurate than the QC method, and for non-Gaussian likelihoods, the NIQC and iNIQC methods can be used for improved results. In general, the accuracy and computational cost of the methods are ordered as follows:

Accuracy: EB $\leq$ QC $\leq$ NI $\leq$ NIQC $\leq$ iNIQC,

Computational cost: EB $\leq$ QC $\leq$ NI $\leq$ NIQC $\leq$ iNIQC.
If the main purpose of the analysis is to construct excursion or contour sets for small values of $\alpha$, we recommend using the QC method for problems with Gaussian likelihoods and the NIQC method for problems with non-Gaussian likelihoods. The increase in accuracy of the iNIQC method is often small compared to the added computational cost.
Continuous domain interpolations
For a continuous spatial domain, the excursion function $F(s)$ can be approximated using the values of $F$ computed at discrete point locations. The main idea is to interpolate $F$ assuming monotonicity of the random field between the discrete computation locations. Specifically, assume that the values of $F$ correspond to the values at the vertices of some triangulated mesh. If the excursion set $E_{u,\alpha}$ should be computed for some specific value of $\alpha$, one has to find the $1-\alpha$ contour for $F$. For interpolation to a location $s_0$ within a specific triangle $T$ with corners in $s_1$, $s_2$, and $s_3$, excursions by default uses log-linear interpolation,
$$\log F(s_0) = w_1 \log F(s_1) + w_2 \log F(s_2) + w_3 \log F(s_3).$$
Here $(w_1, w_2, w_3)$ are the barycentric coordinates of $s_0$ within the triangle.
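The interpolation rule can be written in a few lines of Python (an illustrative sketch; the function name and interface are hypothetical, not the package's API):

```python
import numpy as np

def loglinear_interp(s0, tri, F):
    """Log-linear interpolation of an excursion function inside one triangle.

    s0:  location (2,) at which to interpolate.
    tri: (3, 2) array with the triangle's corner coordinates.
    F:   excursion function values at the three corners.
    Hypothetical helper for illustration; not the package's API.
    """
    # Barycentric coordinates: solve w1*s1 + w2*s2 + w3*s3 = s0, sum(w) = 1.
    A = np.vstack([tri.T, np.ones(3)])
    w = np.linalg.solve(A, np.append(s0, 1.0))
    # Interpolate linearly in log F: log F(s0) = sum_i w_i log F(s_i).
    return float(np.exp(w @ np.log(F)))
```

At a corner the rule reproduces the corner value exactly, and along an edge it reduces to one-dimensional log-linear interpolation between the two adjacent corners.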
Further technical details of the continuous domain construction are given in (Bolin and Lindgren 2017). Studies of the resulting continuous domain excursion sets in (Bolin and Lindgren 2017) indicate that the log-linear interpolation method results in sets with coverage probability on target or slightly above target for large target probabilities.