Skip to contents

Hierarchical models are of great importance in many areas of statistics. In the simplest form, a hierarchical model has a likelihood distribution π(Y|X,θ)\pi({\boldsymbol{\mathrm{Y}}}|{\boldsymbol{\mathrm{X}}}, {\boldsymbol{\mathrm{\theta}}}) for observed data Y{\boldsymbol{\mathrm{Y}}}, which is specified conditionally on a latent process of interest, X{\boldsymbol{\mathrm{X}}}, which has a distribution π(X|θ)\pi({\boldsymbol{\mathrm{X}}}|{\boldsymbol{\mathrm{\theta}}}). For Bayesian hierarchical models, one also specifies prior distributions for the model parameters θ{\boldsymbol{\mathrm{\theta}}}. The most important special case of these models are the LGMs, which are obtained by assuming that X|θ{\boldsymbol{\mathrm{X}}}|{\boldsymbol{\mathrm{\theta}}} has a Gaussian distribution (Rue, Martino, and Chopin 2009). Numerous applications can be studied using models of this form, and these are therefore the main focus of the methods in excursions.

A statistical analysis using an LGM often concludes with reporting the posterior mean E(X|Y)E({\boldsymbol{\mathrm{X}}}|{\boldsymbol{\mathrm{Y}}}) as a point estimate of the latent field, possibly together with posterior variances as a measure of uncertainty. In many applications, however, reporting posterior means and variances are not enough. As stated in the introduction, one may be interested in computing regions where the latent field exceeds some given threshold, contour curves with their associated uncertainty, or simultaneous confidence bands. In some applications, only a contour map of the posterior mean is reported, where the number of levels in the contour map should represent the uncertainty in the estimate. These are quantities that can be computed with excursions and we now define these in more detail before outlining how they can be computed. For details we refer to (Bolin and Lindgren 2015) and (Bolin and Lindgren 2017).

Definitions

The main quantities that can be computed using excursions are (1) excursion sets, (2) contour credible regions and level avoiding sets, (3) excursion functions, (4) contour maps and their quality measures, (5) simultaneous confidence bands. This section defines these in more detail.

Throughout the section, X(s)X({\boldsymbol{\mathrm{s}}}) will denote a stochastic process defined on some domain of interest, Ω\Omega, which we assume is open with a well-defined area |Ω|<|\Omega|<\infty. Since it is not necessary for the definitions, we will not explicitly state the dependency on the data, but simply allow XX to have some possible non-stationary distribution. In practice, however, the distribution of XX will typically be a posterior distribution conditionally on data, X(s)|YX({\boldsymbol{\mathrm{s}}})|{\boldsymbol{\mathrm{Y}}}. For frequentist models, the distribution of XX could also be conditionally on for example a maximum likelihood estimate of the parameters, X(s)|Y,θ̂X({\boldsymbol{\mathrm{s}}})|{\boldsymbol{\mathrm{Y}}},\widehat{{\boldsymbol{\mathrm{\theta}}}}.

Excursion sets

An excursion set is a set where the process X(s)X({\boldsymbol{\mathrm{s}}}) exceeds or goes below some given level of interest, uu. A where X(s)>uX({\boldsymbol{\mathrm{s}}})>u is referred to as a positive excursion set, whereas a set where X(s)<uX({\boldsymbol{\mathrm{s}}})<u is referred to as a negative excursion set. If X(s)=f(s)X({\boldsymbol{\mathrm{s}}})=f({\boldsymbol{\mathrm{s}}}) is a known function, these sets can be computed directly as Au+(f)={sΩ;f(s)>u}A_u^+(f) = \{ {\boldsymbol{\mathrm{s}}}\in\Omega; f({\boldsymbol{\mathrm{s}}})>u \} and Au(f)={sΩ;f(s)<u}A_u^-(f) = \{ {\boldsymbol{\mathrm{s}}}\in\Omega; f({\boldsymbol{\mathrm{s}}})<u \} respectively. If X(s)X({\boldsymbol{\mathrm{s}}}) is a latent random process, one can only provide a region where it with some (high) probability exceeds the level. More specifically, the positive level uu excursion set with probability α\alpha, Eu,α+(X)E_{{u,\alpha}}^{{+}}(X), is defined as the largest set so that with probability 1α1-\alpha the level uu is exceeded at all locations in the set, Eu,α+=arg maxD{|D|:P[DAu+(X)]1α}. E_{{u,\alpha}}^{{+}} = \mathop{\mathrm{arg\,max}}_{D}\{|D|:P[D\subset A_u^+(X)]\geq 1 - \alpha\}. Similarly, the negative uu excursion set with probability α\alpha, Eu,α(X)E_{{u,\alpha}}^{{-}}(X), is defined as the largest set so that with probability 1α1-\alpha the process is below the level uu at all locations in the set. This set is obtained by replacing Au+(X)A_u^+(X) with Au(X)A_u^-(X) in the equation above.

Contour credible regions and level avoiding sets

For a function ff, a contour curve (or set in general) of a level uu is defined as the set of all level uu crossings. Formally, the level curve is defined as Auc(f)=(Au+(f)oAu(f)o)cA_u^c(f) = \left(A_u^+(f)^o\cup A_u^-(f)^o\right)^c, where BoB^o denotes the interior of the set BB and BcB^c denotes the complement. Note that Auc(f)A_u^c(f) not only includes the set of locations where f(s)=uf({\boldsymbol{\mathrm{s}}})=u, but also all discontinuous level crossings.

For a latent random process XX, one can only provide a credible region for the contour curve. A level uu contour credibility region, Eu,αc(X)E_{{u,\alpha}}^{{c}}(X), is defined as the smallest set such that with probability 1α1-\alpha level uu crossings of XX are in the set. This set can be seen as the complement of the level uu contour avoiding set Eu,α(X)E_{{u,\alpha}}^{{}}(X), which is defined as the largest union Mu,α+Mu,αM_{u,\alpha}^+ \cup M_{u,\alpha}^-, where jointly X(s)>uX({\boldsymbol{\mathrm{s}}})>u in Mu,α+M_{u,\alpha}^+ and X(s)<uX({\boldsymbol{\mathrm{s}}})<u in Mu,αM_{u,\alpha}^-. Formally, (Mu,α+(X),Mu,α(X))=arg max(D+,D){|DD+|:P(DAu(X),D+Au+(X))1α},\begin{equation*} (M_{u,\alpha}^+(X), M_{u,\alpha}^-(X)) = \mathop{\mathrm{arg\,max}}_{(D^+,D^-)}\{|D^-\cup D^+| : P(D^- \subseteq A_u^-(X),\, D^+ \subseteq A_u^+(X)) \geq 1-\alpha\}, \end{equation*} where the sets (D+,D)(D^+,D^-) are open. The sets Mu,α+M_{u,\alpha}^+ and Mu,αM_{u,\alpha}^- are denoted as the pair of level uu avoiding sets. The notion of level avoiding sets can naturally be extended to multiple levels u1<u2<<uku_1< u_2 < \cdots < u_k, which is needed when studying contour maps. In this case, the multilevel contour avoiding set is denoted Cu,α(X)C_{{\boldsymbol{\mathrm{u}}},\alpha}(X) (For a formal definition, see (Bolin and Lindgren 2017)).

Excursion functions

introduced excursion functions as a tool for visualizing excursion sets simultaneously for all values of α\alpha. For a level uu, the positive and negative excursion functions are defined as Fu+(s)=1inf{α;sEu,α+}F_u^+({\boldsymbol{\mathrm{s}}}) = 1 - \inf\{\alpha ; {\boldsymbol{\mathrm{s}}}\in E_{{u,\alpha}}^{{+}} \} and Fu(s)=1inf{α;sEu,α}F_u^-({\boldsymbol{\mathrm{s}}}) = 1 - \inf\{\alpha ; {\boldsymbol{\mathrm{s}}}\in E_{{u,\alpha}}^{{-}} \}, respectively. Similarly, the contour avoidance function, and the contour function are defined as Fu(s)=1inf{α;sEu,α}F_u({\boldsymbol{\mathrm{s}}}) = 1 -\inf\{\alpha ; {\boldsymbol{\mathrm{s}}}\in E_{{u,\alpha}}^{{}} \} and Fuc(s)=sup{α;sEu,αc}F_u^c({\boldsymbol{\mathrm{s}}}) = \sup\{\alpha; {\boldsymbol{\mathrm{s}}}\in E_{{u,\alpha}}^{{c}}\}, respectively. Finally, for levels u1<u2<<uku_1< u_2 < \cdots < u_k, one can define a contour map function as F(s)=sup{1α;sCu,α}F({\boldsymbol{\mathrm{s}}}) = \sup\{1-\alpha; {\boldsymbol{\mathrm{s}}}\in C_{u,\alpha}\}.

These functions take values between zero and one and each set Eu,αE_{{u,\alpha}}^{{\bullet}} can be retrieved as the 1α1-\alpha excursion set of the function Fu(s)F_u^{\bullet}({\boldsymbol{\mathrm{s}}}).

Contour maps and their quality measures

For a function f(s)f({\boldsymbol{\mathrm{s}}}), a contour map CfC_f with contour levels u1<u2<<uKu_1 < u_2 < \ldots < u_K is defined as the collection of contour curves Au1c(f),,AuKc(f)A_{u_1}^c(f), \ldots, A_{u_K}^c(f) and associated level sets Gk={s:uk<f(s)<uk+1}G_k = \{{\boldsymbol{\mathrm{s}}} : u_{k}< f({\boldsymbol{\mathrm{s}}}) < u_{k+1}\}, for 0kK0\leq k\leq K, where one defines u0=u_0 = -\infty and uK+1=u_{K+1}=\infty. In practice, a process X(s)X({\boldsymbol{\mathrm{s}}}) is often visualized using a contour map of the posterior mean E(X(s)|Y)E(X({\boldsymbol{\mathrm{s}}})|{\boldsymbol{\mathrm{Y}}}). The contour maps is visualized either by just drawing the contour curves labeled by their values, or by also visualizing each level set in a specific color. The color for a set GkG_k is typically chosen as the color that corresponds to the level uke=(uk+uk+1)/2u_k^e = (u_k+u_{k+1})/2 in a given color map.

In order to choose an appropriate number of contours, one must be able to quantify the uncertainty of contour maps. The uncertainty can be represented using a contour map quality measure PP, which is a function that takes values in [0,1][0, 1]. Here, PP should be chosen in such a way that P1P \approx 1 indicates that the contour map, in some sense, is appropriate as a description of the distribution of the random field, whereas P0P \approx 0 should indicate that the contour map is inappropriate.

An example of a contour map quality measure is the normalized integral of the contour map function P0(X,Cf)=1|Ω|ΩF(s)ds.\begin{equation} P_0(X, C_f) = \frac1{|\Omega|}\int_{\Omega}F({\boldsymbol{\mathrm{s}}})d{\boldsymbol{\mathrm{s}}}. \end{equation} The most useful quality measure is denoted P2P_2 and is defined as the simultaneous probability for the level crossings of (u1e,,uKe)(u_1^e, \ldots , u_K^e) all falling within their respective level sets (G1,,GK)(G_1, \ldots, G_K) .

An intuitively interpretable approach for choosing the number of contours in a contour map is to find the largest K such that P2P_2 is above some threshold. For a joint credibility of 90%90\%, say, choose the largest number of contours such that P20.9P_2 \geq 0.9. How this can be done using excursions is illustrated in Section~.

Simultaneous confidence bands

Especially for time series applications, the uncertainty in the latent process is often visualized using pointwise confidence bands. A pointwise confidence interval for XX at some location s{\boldsymbol{\mathrm{s}}} is given by [qα/2(s),q1α/2(s)][q_{\alpha/2}({\boldsymbol{\mathrm{s}}}),q_{1-\alpha/2}({\boldsymbol{\mathrm{s}}})], where qα(s)q_{\alpha}({\boldsymbol{\mathrm{s}}}) denotes the α\alpha-quantile in the marginal distribution of X(s)X({\boldsymbol{\mathrm{s}}}).

A problem with using pointwise confidence bands is that there is not joint interpretation, and one is therefore often of interested in computing simultaneous confidence bands. For a process X(s),sΩX({\boldsymbol{\mathrm{s}}}), {\boldsymbol{\mathrm{s}}}\in \Omega, we define a simultaneous confidence band as the region {(s,y):sΩ,qρ(s)yq1ρ(s)}\{({\boldsymbol{\mathrm{s}}},y): {\boldsymbol{\mathrm{s}}}\in \Omega, q_{\rho}({\boldsymbol{\mathrm{s}}}) \leq y \leq q_{1-\rho}({\boldsymbol{\mathrm{s}}})\}. Here ρ\rho is chosen such that P(qρ(s)<X(s)<q1ρ(s),sΩ)=1αP(q_{\rho}({\boldsymbol{\mathrm{s}}})<X({\boldsymbol{\mathrm{s}}}) <q_{1-\rho}({\boldsymbol{\mathrm{s}}}), {\boldsymbol{\mathrm{s}}}\in \Omega)=1-\alpha. Thus α\alpha controls the probability that the process is inside the confidence band at all locations in Ω\Omega.

Computational methods

If the latent process X(s)X({\boldsymbol{\mathrm{s}}}) is defined on a continuous domain Ω\Omega, one has to use a discretization, x{\boldsymbol{\mathrm{x}}}, of the process in the statistical analysis. The vector x{\boldsymbol{\mathrm{x}}} may be the process evaluated at some locations of interest or more generally the weights in a basis expansion X(s)=iφi(s)xiX({\boldsymbol{\mathrm{s}}}) = \sum_i \varphi_i({\boldsymbol{\mathrm{s}}}) x_i. Computations in the package are in a first stage performed using the distribution of x{\boldsymbol{\mathrm{x}}}. If Ω\Omega is continuous, computations in a second stage interpolates the results for x{\boldsymbol{\mathrm{x}}} to Ω\Omega. In this section, we briefly outline the computational methods used in these two stages. As most quantities of interest can be obtained using some excursion function, we focus on how these are computed. As a representative example, we outline how Fu+(s)F_u^+({\boldsymbol{\mathrm{s}}}) is computed in the following sections.

As before, let Y{\boldsymbol{\mathrm{Y}}} and θ{\boldsymbol{\mathrm{\theta}}} be a vectors respectively containing observations and model parameters. Computing an excursion function Fu+={Fu+(x1),,Fu+(xn)}{\boldsymbol{\mathrm{F}}}_u^+ = \{F_u^+(x_1), \ldots, F_u^+(x_n)\} requires computing integrals of the posterior distribution for x{\boldsymbol{\mathrm{x}}}. To save computation time, it is assumed that Eu,α1+Eu,α2+E_{{u,\alpha_1}}^{{+}} \subset E_{{u,\alpha_2}}^{{+}} if α1>α2\alpha_1 > \alpha_2. This means that Fu{\boldsymbol{\mathrm{F}}}_u can be obtained by first reordering the nodes and then computing a sequential integral. The reordering is in this case obtained by sorting the marginal probabilities P(xi>u)P(x_i > u) (for other options, see (Bolin and Lindgren 2015). After reordering, the ii:th element of Fu+{\boldsymbol{\mathrm{F}}}_u^+ is obtained as the integral uπ(x1:i|Y)dx1:i. \int_u^{\infty}\pi({\boldsymbol{\mathrm{x}}}_{1:i}|{\boldsymbol{\mathrm{Y}}}) d{\boldsymbol{\mathrm{x}}}_{1:i}. Using sequential importance sampling as described below, the whole sequence of integrals can be obtained with the same cost as computing only one integral with i=ni=n, making the computation of Fu+{\boldsymbol{\mathrm{F}}}_u^+ feasible also for large problems.

Gaussian integrals

The basis for the computational methods in the package is the ability to compute the required integral when the posterior distribution is Gaussian. In this case, one should compute an integral |Q|1/2(2π)d/2uμxexp(12xQx)dx.\begin{equation}\label{eq:markovint} \frac{|{\boldsymbol{\mathrm{Q}}}|^{1/2}}{(2\pi)^{d/2}}\int_{{\boldsymbol{\mathrm{u}}}-{\boldsymbol{\mathrm{\mu}}}\leq {\boldsymbol{\mathrm{x}}}}\exp\left(-\frac{1}{2}{\boldsymbol{\mathrm{x}}}^{\top}{\boldsymbol{\mathrm{Q}}} {\boldsymbol{\mathrm{x}}}\right) \,\mathrm{d}{\boldsymbol{\mathrm{x}}}. \end{equation} Here μ{\boldsymbol{\mathrm{\mu}}} and Q{\boldsymbol{\mathrm{Q}}} are the posterior mean and posterior precision matrix respectively. To take advantage of the possible sparsity of Q{\boldsymbol{\mathrm{Q}}} if a Markovian model is used, the integral is rewritten as adπ(xd)ad1π(xd1|xd)a2π(x2|x3:d)a1π(x1|x2:d)dx\begin{equation}\label{eq:seqint} \int_{a_d}^{\infty}\pi(x_d) \int_{a_{d-1}}^{\infty}\pi(x_{d-1}|x_d) \cdots \int_{a_2}^{\infty}\pi(x_2|{\boldsymbol{\mathrm{x}}}_{3:d}) \int_{a_1}^{\infty} \pi(x_1|{\boldsymbol{\mathrm{x}}}_{2:d}) \,\mathrm{d}{\boldsymbol{\mathrm{x}}} \end{equation} where, if the problem has a Markov structure, xi|xi+1:dx_i|{\boldsymbol{\mathrm{x}}}_{i+1:d} only depends on a few of the elements in xi+1:dx_{i+1:d} given by the Markov structure. The integral is calculated using a sequential importance sampler by starting with the integral of the last component π(xd)\pi(x_d) and then moving backward in the indices, see for further details.

Handling non-Gaussian data

Using the sequential importance sampler above, Fu+{\boldsymbol{\mathrm{F}}}_u^+ can be computed for Gaussian models with known parameters. For more complicated models, the latent Gaussian structure has to be handled, and this can be done in different ways depending on the accuracy that is needed. excursions currently supports the following five methods described in detail in (Bolin and Lindgren 2015): Empirical Bayes (EB), Quantile Correction (QC), Numerical Integration (NI), Numerical integration with Quantile Corrections (NIQC), and improved Numerical integration with Quantile Corrections (iNIQC).

The EB method is the simplest method and is based on using a Gaussian approximation of the posterior, π(x|Y)πG(x|Y,θ̂)\pi({\boldsymbol{\mathrm{x}}}|{\boldsymbol{\mathrm{Y}}}) \approx \pi_G({\boldsymbol{\mathrm{x}}}|{\boldsymbol{\mathrm{Y}}},\widehat{{\boldsymbol{\mathrm{\theta}}}}). The QC method uses the same Gaussian approximation but modifies the limits in the integral to improve the accuracy. The three other methods are intended for Bayesian models, where the posterior is obtained by integrating over the parameters, π(xY)=π(xY,θ)π(θY)dθ.\begin{equation*} \pi({\boldsymbol{\mathrm{x}}}\mid{\boldsymbol{\mathrm{Y}}}) = \int \pi({\boldsymbol{\mathrm{x}}}\mid{\boldsymbol{\mathrm{Y}}},{\boldsymbol{\mathrm{\theta}}})\pi({\boldsymbol{\mathrm{\theta}}}\mid{\boldsymbol{\mathrm{Y}}})d{\boldsymbol{\mathrm{\theta}}}. \end{equation*} The NI method approximates the integration with respect to the parameters as in the INLA method, using a sum of representative parameter configurations, and the NIQC and iNIQC methods combines this with the QC method to improve the accuracy further.

In general, EB and QC are suitable for frequentist models and for Bayesian models where the posterior distribution conditionally on the parameters is approximately Gaussian. The methods are equivalent if the posterior is Gaussian and in other cases QC is more accurate than EB. For Bayesian models, the NI method is, in general, more accurate than the QC method, and for non-Gaussian likelihoods, the NIQC and iNIQC methods can be used for improved results. In general the accuracy and computational cost of the methods are as follows

Accuracy: EB <<QC <<NI <<NIQC <<iNIQC

Computational cost: EB \approxQC <<NI \approxNIQC <<iNIQC.

If the main purpose of the analysis is to construct excursion or contour sets for low values of α\alpha, we recommend using the QC method for problems with Gaussian likelihoods and the NIQC method for problems with non-Gaussian likelihoods. The increase in accuracy of the iNIQC method is often small compared to the added computational cost.

Continuous domain interpolations

For a continuous spatial domain, the excursion function Fu+(s)F_u^+({\boldsymbol{\mathrm{s}}}) can be approximated using Fu+{\boldsymbol{\mathrm{F}}}_u^+ computed at discrete point locations. The main idea is to interpolate Fu+{\boldsymbol{\mathrm{F}}}_u^+ assuming monotonicity of the random field between the discrete computation locations. Specifically, assume that the values of Fu+{\boldsymbol{\mathrm{F}}}_u^+ correspond to the values at the vertices of some triangulated mesh. If the excursion set Eu,α+(X)E_{{u,\alpha}}^{{+}}(X) should be computed for some specific value of α\alpha, one has to find the 1α1-\alpha contour for Fu+(s)F_u^+({\boldsymbol{\mathrm{s}}}). For interpolation to a location s{\boldsymbol{\mathrm{s}}} within a specific triangle 𝒯\mathcal{T} with corners in s1,s2,{\boldsymbol{\mathrm{s}}}_1, {\boldsymbol{\mathrm{s}}}_2, and s3{\boldsymbol{\mathrm{s}}}_3, excursions by default uses log-linear interpolation, Fu(s)=exp{k=13wklog[Fu(sk)]}F_u({\boldsymbol{\mathrm{s}}}) = \exp\{\sum_{k=1}^3 w_k\log[F_u({\boldsymbol{\mathrm{s}}}_k)]\}. Here {(w1,w2,w3);w1,w2,w30,k=13wi=1}\{(w_1,w_2,w_3);\, w_1,w_2,w_3\geq 0,\, \sum_{k=1}^3 w_i=1\} are the barycentric coordinates of s{\boldsymbol{\mathrm{s}}} within the triangle.

Further technical details of the continuous domain construction are given in (Bolin and Lindgren 2017). Studies of the resulting continuous domain excursion sets in (Bolin and Lindgren 2017) indicate that the log-linear interpolation method results in sets with coverage probability on target or slightly above target for large target probabilities.

References

Bolin, David, and Finn Lindgren. 2015. “Excursion and Contour Uncertainty Regions for Latent Gaussian Models.” Journal of the Royal Statistical Society B 77: 85–106. https://doi.org/10.1111/rssb.12055.
———. 2017. “Quantifying the Uncertainty of Contour Maps.” Journal of Computational and Graphical Statistics 26 (3): 513–24. https://doi.org/10.1080/10618600.2016.1228537.
Rue, H, S Martino, and N Chopin. 2009. “Approximate Bayesian Inference for Latent Gaussian Models Using Integrated Nested Laplace Approximations (with Discussion).” Journal of the Royal Statistical Society B 71 (2): 319–92. https://doi.org/10.1111/j.1467-9868.2008.00700.x.