Time series analysis is often used to model temporal dependencies in data, where observations at different points in time are correlated.
The autoregressive model of order 1 (AR(1)) provides
a simple but widely used temporal framework that is capable of
predicting a future value from the current value plus a random error
term. Most AR(1) implementations traditionally assume that random errors
follow a normal (Gaussian) distribution, and these implementations are
typically fitted using packages like forecast or stats.
The ngme2
package extends traditional AR(1) models by
allowing for non-Gaussian noise distributions, particularly the Normal
Inverse Gaussian (NIG) distribution: a flexible distribution allowing
for skewness and heavy tails, which can capture the asymmetric and
heavy-tailed behavior commonly observed in real-world time series data.
The package also supports the Generalized Asymmetric Laplace (GAL)
distribution: a flexible distribution accommodating peakedness,
skewness, and heavy tails.
Additionally, ngme2
enables the use of AR(1) processes
as latent models within linear regression frameworks with mixed effects.
This capability allows researchers to model complex hierarchical data
structures where temporal dependencies exist at the latent level, while
simultaneously accounting for fixed effects and other random effects.
This integration makes it possible to capture both the temporal
correlation structure through the AR(1) latent process and the
heterogeneity across different groups or experimental units through
mixed effects modeling.
This vignette provides a comprehensive guide to implementing and
using AR(1) models within the ngme2
framework.
An autoregressive model of order 1 (AR(1)) is a stochastic process where each observation depends linearly on the immediately preceding observation plus a random error term. The AR(1) process is mathematically defined as:
$$W_i = \rho W_{i-1} + \epsilon_i, \qquad i = 2, \ldots, n,$$
where:
$W_i$ is the value of the process at time index $i$;
$\rho$ is the autoregressive parameter controlling the temporal dependence;
$\epsilon_i$ are independent innovation (error) terms with scale parameter $\sigma$.
Note on Innovation Distributions: For non-Gaussian innovations, mean and variance do not fully specify the distribution, but are sufficient for the classical exposition of AR models: the innovations $\epsilon_i$ are assumed to satisfy $\mathbb{E}[\epsilon_i] = 0$ and $\operatorname{Var}(\epsilon_i) = \sigma^2$.
Note on Markov Property: The AR(1) process exhibits the first-order Markov property, meaning that the conditional distribution of any observation given the entire past depends only on the immediate predecessor $W_{i-1}$. Formally, for any $i > 1$:
$$P(W_i \mid W_{i-1}, W_{i-2}, \ldots, W_1) = P(W_i \mid W_{i-1}).$$
This property emerges directly from the recursive structure $W_i = \rho W_{i-1} + \epsilon_i$ combined with the independence of the innovations $\epsilon_i$. Since $\epsilon_i$ is independent of all past values and $W_i$ is completely determined by $W_{i-1}$ and $\epsilon_i$, knowledge of the entire history provides no additional information beyond $W_{i-1}$ for predicting $W_i$. This Markov structure is fundamental to the computational efficiency of AR(1) models, as it implies that the precision matrix has a sparse, banded structure that enables efficient numerical algorithms.
Note on Initial Condition: The scaling factor $\sqrt{1-\rho^2}$ in the initial condition $\sqrt{1-\rho^2}\, W_1 = \epsilon_1$ ensures that $W_1$ has the correct marginal variance for stationarity. Specifically, $\operatorname{Var}(W_1) = \sigma^2/(1-\rho^2)$. This guarantees that the initial variance matches the stationary distribution of the AR(1) process, regardless of the value of $\rho$.
Stationarity Condition: The constraint $|\rho| < 1$ ensures that the process is stationary, meaning its statistical properties remain constant over time.
Variance Structure: In the stationary distribution, all observations have the same marginal variance $\operatorname{Var}(W_i) = \sigma^2/(1-\rho^2)$.
Autocorrelation Function: For a stationary AR(1) process, the autocorrelation function exhibits exponential decay:
$$\operatorname{Corr}(W_i, W_{i+k}) = \rho^{|k|},$$
where $k$ is the lag. This characteristic pattern distinguishes AR(1) processes from other time series models.
Partial Autocorrelation Function: The partial autocorrelation function (PACF) for an AR(1) process has a distinctive pattern:
$$\phi_{kk} = \begin{cases} \rho, & k = 1, \\ 0, & k \geq 2. \end{cases}$$
This cutoff after lag 1 is a key diagnostic feature for identifying AR(1) processes in practice.
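Before turning to the ngme2 implementation, the following base-R sketch (independent of the package) simulates a Gaussian AR(1) directly from the recursion above and checks the stationary variance, the ACF decay, and the PACF cutoff numerically. The values rho = 0.6, sigma = 1, and n = 5000 are illustrative only.
# Minimal base-R illustration of the AR(1) recursion and its stationary behavior
set.seed(1)
rho <- 0.6
sigma <- 1
n <- 5000
W <- numeric(n)
W[1] <- rnorm(1, sd = sigma / sqrt(1 - rho^2))   # initial value from the stationary distribution
for (i in 2:n) {
  W[i] <- rho * W[i - 1] + rnorm(1, sd = sigma)  # AR(1) recursion with Gaussian innovations
}
var(W)                                      # should be close to sigma^2 / (1 - rho^2) = 1.5625
acf(W, lag.max = 5, plot = FALSE)$acf[2:4]  # should decay roughly as rho, rho^2, rho^3
pacf(W, lag.max = 5, plot = FALSE)$acf      # should be near rho at lag 1 and near 0 afterwards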
Noise Distributions: In ngme2, the innovation terms $\epsilon_i$ can follow one of the following distributions:
Gaussian distribution: Traditional approach with symmetric, light-tailed errors;
Normal Inverse Gaussian (NIG) distribution: Flexible distribution allowing for skewness and heavy tails;
Generalized Asymmetric Laplace (GAL) distribution: Flexible distribution accommodating peakedness, skewness, and heavy tails.
Irregular Time Grids: The ngme2
package
supports AR(1) processes on irregular time grids, allowing for gaps in
the time series. This is particularly useful for modeling data with
missing time points or irregular sampling intervals.
The AR(1) process can be expressed in matrix form as:
$$\mathbf{K} \mathbf{W} = \boldsymbol{\epsilon},$$
where $\mathbf{W} = (W_1, \ldots, W_n)^T$, $\boldsymbol{\epsilon} = (\epsilon_1, \ldots, \epsilon_n)^T$, and $\mathbf{K}$ is an $n \times n$ matrix that encodes the AR(1) structure.
The matrix $\mathbf{K}$ is:
$$
\mathbf{K} =
\begin{pmatrix}
\sqrt{1-\rho^2} & & & \\
-\rho & 1 & & \\
 & \ddots & \ddots & \\
 & & -\rho & 1
\end{pmatrix}.
$$
Justification of Matrix Construction: The first row, with diagonal entry $\sqrt{1-\rho^2}$, ensures that $W_1$ has the correct marginal variance for stationarity. The subsequent rows implement the recursive relationship by rearranging $W_i = \rho W_{i-1} + \epsilon_i$ to $-\rho W_{i-1} + W_i = \epsilon_i$.
The matrix $\mathbf{K}$ relates the correlated AR(1) process values $\mathbf{W}$ to the independent noise terms $\boldsymbol{\epsilon}$. This structure allows us to express the precision matrix (inverse of the covariance matrix) of the AR(1) process as $\mathbf{Q} = \mathbf{K}^T \mathbf{K}/\sigma^2$, meaning that $\mathbf{K}$ serves as a Cholesky-like factor of $\mathbf{Q}$. Furthermore, the resulting precision matrix inherits the sparse structure from the Markov property of AR(1) processes, leading to significant computational advantages in terms of both memory usage and processing speed, particularly when dealing with long time series.
For the AR(1) process with the above matrix $\mathbf{K}$ and error variance $\sigma^2$, the precision matrix is:
$$
\mathbf{Q} = \frac{1}{\sigma^2}
\begin{pmatrix}
1 & -\rho & & \\
-\rho & 1+\rho^2 & \ddots & \\
 & \ddots & \ddots & -\rho \\
 & & -\rho & 1
\end{pmatrix}.
$$
Interpretation: The precision matrix $\mathbf{Q}$ encodes the conditional independence structure of the AR(1) process. Its tridiagonal structure reflects the Markov property: each observation is conditionally independent of all others given its immediate neighbors.
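To make the factorization concrete, here is a small base-R sketch (n = 5, rho = -0.5, and sigma = 1 are illustrative choices) that builds $\mathbf{K}$ by hand and verifies that $\mathbf{K}^T\mathbf{K}/\sigma^2$ reproduces the tridiagonal precision matrix above.
# Build K by hand for a small example and verify Q = K^T K / sigma^2
n <- 5
rho <- -0.5
sigma <- 1
K <- diag(c(sqrt(1 - rho^2), rep(1, n - 1)))   # diagonal: sqrt(1 - rho^2), then 1s
K[cbind(2:n, 1:(n - 1))] <- -rho               # subdiagonal entries: -rho
Q <- t(K) %*% K / sigma^2
print(round(Q, 3))  # tridiagonal: -rho off the diagonal, 1 + rho^2 on the interior of the diagonal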
This matrix formulation not only establishes a direct connection to
the precision matrix
by computing the innovations from the process values, but also
integrates seamlessly with other ngme2
components.
Factorizing the AR(1) process in this way yields both conceptual clarity and a simpler implementation, especially in high-dimensional settings with sparse precision matrices: sparse-matrix algorithms can exploit the banded structure of $\mathbf{Q}$ rather than operating on a dense covariance matrix. This is
the general structure of the latent models supported by the
ngme2
package.
To specify an AR(1) model in ngme2
, use the
f()
function with the model="ar1"
argument
within your model formula: f(time_index, model="ar1")
. When
not specifying the noise
argument in f()
, the
process defaults to Gaussian noise. For simulation purposes, it is
important to define the sigma
parameter in the
noise
specification, as this controls the variance of the
innovations. For estimation, however, specifying sigma
is
optional since the model will automatically estimate this parameter from
the observed data.
When specifying the time grid, keep in mind that the AR(1) process in
ngme2
requires integer time indices, although these do not
need to be consecutive (gaps are allowed). This flexibility is
particularly useful for modeling time series with irregular observation
patterns or missing data points.
For illustration, we will create an AR(1) model using an irregular
time grid with integer years. This demonstrates how ngme2
handles non-consecutive time points:
library(ngme2)
#> This is ngme2 of version 0.7.0
#> - See our homepage: https://davidbolin.github.io/ngme2 for more details.
#>
#> Attaching package: 'ngme2'
#> The following object is masked from 'package:stats':
#>
#> ar
set.seed(16)
# Define time points with gaps (irregular spacing)
time_points <- c(2001, 2003, 2005, 2007)
Observe that in ngme2
, it is possible to have gaps in
the time grid, and the observed years can be provided in any order. For
example, in this case, data for 2002, 2004, and 2006 are missing. The
AR(1) model will still maintain the temporal dependence structure across
these gaps. Internally, ngme2
automatically constructs the
sorted, complete time grid (e.g., 2001, 2002, ..., 2007) as needed for the
precision matrix, so the user does not need to handle this manually.
Consequently, the precision matrix will always have dimensions
determined by the complete time grid, while an observation matrix A is
used to map the observed time points to the corresponding points on the
complete grid.
Now we will specify the AR(1) model with a negative autocorrelation parameter, which creates an oscillating pattern in which high values tend to be followed by low values:
# Define AR(1) model with negative autocorrelation
ar1_model <- f(time_points,
model = "ar1",
rho = -0.5, # Negative correlation creates oscillating behavior
noise = noise_normal(sigma = 1.0) # Gaussian innovations with sigma = 1
)
The ngme2
package constructs several important matrices
internally. We can examine the structure of the $\mathbf{K}$ matrix that relates the AR(1) process to the independent noise terms:
# Examine the K matrix structure
K_matrix <- ar1_model$operator$K
print(K_matrix)
#> 7 x 7 sparse Matrix of class "dgCMatrix"
#>
#> [1,] 0.8660254 . . . . . .
#> [2,] 0.5000000 1.0 . . . . .
#> [3,] . 0.5 1.0 . . . .
#> [4,] . . 0.5 1.0 . . .
#> [5,] . . . 0.5 1.0 . .
#> [6,] . . . . 0.5 1.0 .
#> [7,] . . . . . 0.5 1
The matrix shows the characteristic AR(1) structure: the first diagonal entry is $\sqrt{1-\rho^2} = \sqrt{1-0.25} \approx 0.866$, the remaining diagonal entries are 1, and the subdiagonal entries are $-\rho = 0.5$.
The mapping matrix $\mathbf{A}$ tells us how the internal time grid indices correspond to our specified time points:
# Check the mapping matrix A
mapping_matrix <- ar1_model$A
print(mapping_matrix)
#> 4 x 7 sparse Matrix of class "dgCMatrix"
#>
#> [1,] 1 0 . . . . .
#> [2,] . . 1 0 . . .
#> [3,] . . . . 1 0 .
#> [4,] . . . . . 0 1
The mapping matrix
serves as a crucial bridge between the observed time indices, which were
provided by the user, and the internal computational structure of the
AR(1) model. When working with consecutive time indices, this matrix
typically reduces to an identity matrix, providing a direct one-to-one
correspondence. However, its true utility becomes apparent when dealing
with irregular time grids, where it elegantly handles the mapping
between the observed, user-specified time points and the complete time grid used internally, which is required for efficient computation. This
mapping mechanism allows ngme2
to seamlessly accommodate
missing observations, irregular sampling intervals, and complex temporal
patterns without compromising the fundamental AR(1) structure.
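As a quick check of this mapping, multiplying the complete internal grid 2001:2007 by the matrix printed above should recover exactly the observed time points:
# Map the complete internal grid through A; this should return 2001 2003 2005 2007
as.numeric(mapping_matrix %*% 2001:2007)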
# Load Matrix package for sparse matrix operations
library(Matrix)
# The precision matrix is computed as Q = (K^T K) / sigma^2
sigma2 <- 1.0 # Error variance from the model definition
Q_matrix <- (t(K_matrix) %*% K_matrix) / sigma2
print(Q_matrix)
#> 7 x 7 sparse Matrix of class "dgCMatrix"
#>
#> [1,] 1.0 0.50 . . . . .
#> [2,] 0.5 1.25 0.50 . . . .
#> [3,] . 0.50 1.25 0.50 . . .
#> [4,] . . 0.50 1.25 0.50 . .
#> [5,] . . . 0.50 1.25 0.50 .
#> [6,] . . . . 0.50 1.25 0.5
#> [7,] . . . . . 0.50 1.0
This matrix formulation provides several computational benefits, particularly in terms of memory efficiency and processing speed. Let us examine the sparsity of the precision matrix in the illustration above, where we have 7 time grid points (years spanning 2001 to 2007) associated with our 4 irregularly spaced observed time points:
# Load required libraries
library(ggplot2)
library(reshape2)
library(viridis)
#> Loading required package: viridisLite
# Convert the precision matrix to long format for ggplot2
Q_melted <- melt(as.matrix(Q_matrix))
names(Q_melted) <- c("Row", "Column", "Value")
# Create a binary sparsity indicator
Q_melted$Sparsity <- ifelse(abs(Q_melted$Value) < 1e-10, "Zero", "Non-zero")
# Simple binary sparsity pattern
p_sparsity_simple <- ggplot(Q_melted, aes(x = Column, y = rev(Row), fill = Sparsity)) +
geom_tile(color = "white", linewidth = 0.1) +
scale_fill_manual(values = c("Zero" = "white", "Non-zero" = "#2E86AB")) +
scale_x_continuous(breaks = 1:7, expand = c(0, 0)) +
scale_y_continuous(breaks = 1:7, labels = 7:1, expand = c(0, 0)) +
labs(
title = "Sparsity Pattern of AR(1) Precision Matrix",
x = "Column Index",
y = "Row Index",
fill = "Elements"
) +
theme_minimal() +
theme(
plot.title = element_text(size = 10, face = "bold"),
axis.text = element_text(size = 7),
axis.title = element_text(size = 8),
legend.position = "right",
legend.title = element_text(size = 8),
legend.text = element_text(size = 7),
panel.grid = element_blank(),
aspect.ratio = 1
) +
coord_fixed()
print(p_sparsity_simple)
Now let us examine the sparsity pattern of the precision matrix, which is crucial for computational efficiency:
# Check sparsity (important for computational efficiency)
Q_dense <- as.matrix(Q_matrix)
total_elements <- nrow(Q_matrix) * ncol(Q_matrix)
zero_elements <- sum(abs(Q_dense) < 1e-10)
sparsity <- zero_elements / total_elements
print(paste("Sparsity of precision matrix:", round(sparsity * 100, 1), "%"))
#> [1] "Sparsity of precision matrix: 61.2 %"
The resulting precision matrix maintains its characteristic tri-diagonal structure (with modification in the first row due to the stationarity condition), demonstrating that the temporal gaps do not compromise the fundamental sparsity properties that make AR(1) models computationally tractable. This sparsity is particularly valuable when scaling to longer time series, as it ensures that computational complexity grows linearly rather than quadratically with the number of observations.
Observe that with only 7 time grid points (induced by our four observed time points), the resulting precision matrix already exhibits
substantial sparsity. As the number of points in the time grid
increases, this sparsity becomes even more pronounced, which greatly
enhances computational efficiency. This scaling property makes AR(1)
models in ngme2
particularly well-suited for long time
series applications.
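To illustrate this scaling, the following sketch repeats the construction used above on a longer, consecutive grid (n = 200 is an arbitrary illustrative length) and recomputes the sparsity of the resulting precision matrix.
# Repeat the K / Q construction for a longer consecutive grid and measure sparsity
ar1_long <- f(1:200, model = "ar1", rho = 0.5, noise = noise_normal(sigma = 1.0))
K_long <- ar1_long$operator$K
Q_long <- (t(K_long) %*% K_long) / 1.0    # sigma^2 = 1, so no extra scaling is needed
sparsity_long <- sum(abs(as.matrix(Q_long)) < 1e-10) / length(as.matrix(Q_long))
print(paste("Sparsity of precision matrix (n = 200):", round(sparsity_long * 100, 1), "%"))
# A tridiagonal Q has roughly 3n non-zero entries out of n^2, so sparsity approaches 100% as n grows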
Finally, let us demonstrate how different error variances affect the precision matrix:
# Demonstrate how different error variances affect the precision matrix
sigma2_values <- c(0.5, 2.0, 4.0)
for(sigma2_test in sigma2_values) {
Q_scaled <- (t(K_matrix) %*% K_matrix) / sigma2_test
print(paste("With sigma^2 =", sigma2_test,
", Q[1,1] =", round(Q_scaled[1,1], 3),
"(scaled by factor 1/", sigma2_test, ")"))
}
#> [1] "With sigma^2 = 0.5 , Q[1,1] = 2 (scaled by factor 1/ 0.5 )"
#> [1] "With sigma^2 = 2 , Q[1,1] = 0.5 (scaled by factor 1/ 2 )"
#> [1] "With sigma^2 = 4 , Q[1,1] = 0.25 (scaled by factor 1/ 4 )"
The resulting precision matrix $\mathbf{Q}$ is sparse, which is computationally advantageous for large time series. This sparsity reflects the Markov property of AR(1) processes: each observation depends only on its immediate neighbors, not on distant past values. The scaling by $1/\sigma^2$ ensures that the precision matrix correctly represents the inverse covariance structure, with larger error variances resulting in lower
covariance structure, with larger error variances resulting in lower
precision (higher uncertainty) and vice versa. When defining AR(1)
models in practice, it is important to specify the error variance
explicitly (e.g., noise_normal(sigma = 1.0)
) to maintain
consistency between the theoretical formulation and computational
implementation.
Understanding AR(1) model behavior through simulation helps build
intuition about temporal dependencies and the impact of different noise
distributions. The ngme2
package provides straightforward
simulation capabilities that allow us to explore these characteristics
systematically.
First, let us prepare our environment and define the basic parameters for our simulations:
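A minimal sketch of this setup is shown below. The simulation length n_obs = 1000 and the consecutive grid time_index = 1:n_obs are assumed values for illustration, and gridExtra is loaded because grid.arrange() is used later to combine the plots.
# Minimal setup sketch (assumed values; adjust n_obs as desired)
library(gridExtra)     # provides grid.arrange(), used below to combine ggplot panels

n_obs <- 1000          # assumed number of observations
time_index <- 1:n_obs  # consecutive integer time grid used for the simulations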
We begin with the traditional Gaussian AR(1) model, which assumes
normally distributed innovations. This serves as our baseline for
comparison. First, we create the AR(1) model object using the
f()
function, which specifies the model structure,
parameters, and noise distribution. The f()
function
returns a model object that contains all the necessary information for
simulation and estimation, including the temporal dependence structure
(ρ = 0.7) and the Gaussian noise specification (σ = 2.0).
# Define AR(1) model with Gaussian noise
ar1_gaussian <- f(time_index,
model = "ar1",
rho = 0.7,
noise = noise_normal(sigma = 2.0)
)
In the second step, we obtain sample realizations from this model
using the simulate()
method, which generates random samples
according to the specified AR(1) process. The simulate()
method takes the model object as input along with optional arguments
such as the random seed (seed = 456
) for reproducibility
and the number of simulations (nsim = 1
). This two-step
approach separates model specification from sample generation, providing
flexibility for multiple simulations or parameter exploration.
# Generate realization from the AR(1) model
W_gaussian <- simulate(ar1_gaussian, seed = 456, nsim = 1)[[1]]
Let us visualize the simulated Gaussian AR(1) process and examine its temporal structure. First, let us calculate the autocorrelation and partial autocorrelation functions for the simulated Gaussian AR(1) process. These functions are essential diagnostic tools that help us verify the temporal structure of our model:
# Prepare data for ggplot
gaussian_data <- data.frame(
time = time_index,
value = W_gaussian
)
# Calculate autocorrelation function
acf_result <- acf(W_gaussian, lag.max = 20, plot = FALSE)
acf_data <- data.frame(
lag = 0:20,
acf = as.numeric(acf_result$acf)
)
# Calculate partial autocorrelation function
pacf_result <- pacf(W_gaussian, lag.max = 20, plot = FALSE)
pacf_data <- data.frame(
lag = 1:20,
pacf = as.numeric(pacf_result$acf)
)
Now we will visualize the simulated process and its correlation structure. The time series plot shows the temporal evolution of the process, while the ACF and PACF plots reveal the characteristic patterns that confirm our AR(1) model specification:
# Create time series plot
p1 <- ggplot(gaussian_data, aes(x = time, y = value)) +
geom_line(color = "#2E86AB", alpha = 0.8, linewidth = 0.3) +
labs(
title = "AR(1) Process with Gaussian Noise",
subtitle = expression(paste(rho, " = 0.7, ", sigma, " = 2.0")),
x = "Time",
y = "Value"
) +
theme_minimal() +
theme(
plot.title = element_text(size = 12, face = "bold"),
plot.subtitle = element_text(size = 10),
panel.grid.minor = element_blank()
)
# Plot autocorrelation function
p2 <- ggplot(acf_data, aes(x = lag, y = acf)) +
geom_hline(yintercept = 0, color = "gray50") +
geom_hline(yintercept = c(-0.2, 0.2), color = "blue", linetype = "dashed", alpha = 0.4, linewidth = 0.4) +
geom_segment(aes(xend = lag, yend = 0), color = "#2E86AB", linewidth = 0.4) +
geom_point(color = "#2E86AB", size = 1) +
labs(
title = "Sample Autocorrelation Function",
x = "Lag",
y = "ACF"
) +
theme_minimal() +
theme(
plot.title = element_text(size = 12, face = "bold"),
panel.grid.minor = element_blank()
)
# Plot partial autocorrelation function
p3 <- ggplot(pacf_data, aes(x = lag, y = pacf)) +
geom_hline(yintercept = 0, color = "gray50") +
geom_hline(yintercept = c(-0.2, 0.2), color = "blue", linetype = "dashed", alpha = 0.4, linewidth = 0.4) +
geom_segment(aes(xend = lag, yend = 0), color = "#2E86AB", linewidth = 0.4) +
geom_point(color = "#2E86AB", size = 1) +
labs(
title = "Sample Partial Autocorrelation Function (PACF)",
x = "Lag",
y = "PACF"
) +
theme_minimal() +
theme(
plot.title = element_text(size = 12, face = "bold"),
panel.grid.minor = element_blank()
)
# Combine plots
grid.arrange(p1, p2, p3, ncol = 1)
The autocorrelation function shows the expected exponential decay pattern characteristic of AR(1) processes, while the partial autocorrelation function exhibits the diagnostic cutoff after lag 1, confirming the model temporal structure. The PACF pattern is particularly useful for identifying the order of autoregressive processes.
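As a quick numerical complement to the plots (the exact numbers depend on the simulated draw), we can inspect the sample ACF and PACF values computed above directly:
# Sample lag-1 autocorrelation (should be roughly near rho = 0.7)
acf_data$acf[2]
# Sample PACF at lag 1 (similar magnitude) and at lags 2-5 (should be close to zero)
pacf_data$pacf[1]
pacf_data$pacf[2:5]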
Now let us explore the flexibility of non-Gaussian innovations using
the Normal Inverse Gaussian (NIG) distribution, which can capture
asymmetry and heavy tails. Here we create an AR(1) model object with NIG
innovations using the f()
function. The NIG distribution is
specified through noise_nig()
with three parameters:
mu = -3
(location parameter introducing asymmetry),
sigma = 4
(scale parameter controlling spread), and
nu = 0.4
(shape parameter governing tail heaviness).
# Define AR(1) model with NIG noise
ar1_nig <- f(time_index,
model = "ar1",
rho = 0.5,
noise = noise_nig(mu = -3, sigma = 4, nu = 0.4)
)
We then generate a sample realization using the
simulate()
method with a different random seed
(seed = 16
) to ensure independent samples from our Gaussian
example. The resulting time series will exhibit the characteristic
features of NIG innovations while maintaining the AR(1) temporal
correlation structure specified by ρ = 0.5.
# Generate realization from the NIG AR(1) model
W_nig <- simulate(ar1_nig, seed = 16, nsim = 1)[[1]]
We will now perform the same correlation analysis for the NIG AR(1) process. This allows us to compare how the non-Gaussian innovation distribution affects the temporal correlation structure compared to the Gaussian case:
# Prepare data for NIG process
nig_data <- data.frame(
time = time_index,
value = W_nig
)
# Calculate ACF for NIG process
acf_nig <- acf(W_nig, lag.max = 20, plot = FALSE)
acf_nig_data <- data.frame(
lag = 0:20,
acf = as.numeric(acf_nig$acf)
)
# Calculate and plot partial autocorrelation function for NIG process
pacf_nig <- pacf(W_nig, lag.max = 20, plot = FALSE)
pacf_nig_data <- data.frame(
lag = 1:20,
pacf = as.numeric(pacf_nig$acf)
)
Next, we visualize the NIG AR(1) process and its correlation functions. These plots will demonstrate that while the innovation distribution is non-Gaussian, the temporal correlation structure remains consistent with AR(1) theory, showing the robustness of the autoregressive framework across different noise distributions:
# Create time series plot for NIG
p4 <- ggplot(nig_data, aes(x = time, y = value)) +
geom_line(color = "#A23B72", alpha = 0.8, linewidth = 0.3) +
labs(
title = "AR(1) Process with NIG Noise",
subtitle = expression(paste(rho, " = 0.5, NIG(", mu, " = -3, ", sigma, " = 4, ", nu, " = 0.4)")),
x = "Time",
y = "Value"
) +
theme_minimal() +
theme(
plot.title = element_text(size = 12, face = "bold"),
plot.subtitle = element_text(size = 10),
panel.grid.minor = element_blank()
)
# Plot autocorrelation function for NIG process
p5 <- ggplot(acf_nig_data, aes(x = lag, y = acf)) +
geom_hline(yintercept = 0, color = "gray50") +
geom_hline(yintercept = c(-0.2, 0.2), color = "blue", linetype = "dashed", alpha = 0.4, linewidth = 0.4) +
geom_segment(aes(xend = lag, yend = 0), color = "#A23B72", linewidth = 0.4) +
geom_point(color = "#A23B72", size = 1) +
labs(
title = "Sample Autocorrelation Function (NIG)",
x = "Lag",
y = "ACF"
) +
theme_minimal() +
theme(
plot.title = element_text(size = 12, face = "bold"),
panel.grid.minor = element_blank()
)
# Plot partial autocorrelation function for NIG process
p6 <- ggplot(pacf_nig_data, aes(x = lag, y = pacf)) +
geom_hline(yintercept = 0, color = "gray50") +
geom_hline(yintercept = c(-0.2, 0.2), color = "blue", linetype = "dashed", alpha = 0.4, linewidth = 0.4) +
geom_segment(aes(xend = lag, yend = 0), color = "#A23B72", linewidth = 0.4) +
geom_point(color = "#A23B72", size = 1) +
labs(
title = "Sample Partial Autocorrelation Function (PACF)",
x = "Lag",
y = "PACF"
) +
theme_minimal() +
theme(
plot.title = element_text(size = 12, face = "bold"),
panel.grid.minor = element_blank()
)
# Combine plots
grid.arrange(p4, p5, p6, ncol = 1)
The NIG AR(1) process exhibits similar temporal correlation structure to the Gaussian case, as evidenced by the exponential decay in the ACF and the cutoff after lag 1 in the PACF. However, the non-Gaussian innovations introduce different marginal behavior, allowing for asymmetric and heavy-tailed distributions while preserving the fundamental AR(1) temporal dynamics.
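To quantify this difference in marginal behavior (the exact values depend on the simulated draws), a simple sketch computes the sample skewness and excess kurtosis of the two series from standardized moments:
# Sample skewness and excess kurtosis from standardized moments
sample_shape <- function(x) {
  z <- (x - mean(x)) / sd(x)
  c(skewness = mean(z^3), excess_kurtosis = mean(z^4) - 3)
}
rbind(Gaussian = sample_shape(W_gaussian), NIG = sample_shape(W_nig))
# The Gaussian series should give values near zero; the NIG series is expected to
# show noticeable skewness and heavier tails (positive excess kurtosis)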
A key advantage of the NIG distribution is its ability to model asymmetric and heavy-tailed behavior. To better understand the differences between Gaussian and NIG innovations, we will prepare the data for a direct comparison of the marginal distributions. This analysis will highlight how the choice of noise distribution affects the overall behavior of the AR(1) process:
# Combine data for comparison
comparison_data <- data.frame(
value = c(W_gaussian, W_nig),
type = rep(c("Gaussian", "NIG"), each = n_obs)
)
Now we create comparative visualizations to examine the distributional differences between the two AR(1) processes. The histogram shows the overall shape and asymmetry, while the box plots highlight differences in central tendency, spread, and the presence of outliers:
# Create comparison plots
p7 <- ggplot(comparison_data, aes(x = value, fill = type)) +
geom_histogram(alpha = 0.6, bins = 50, position = "identity") +
scale_fill_manual(values = c("Gaussian" = "#2E86AB", "NIG" = "#A23B72")) +
labs(
title = "Distribution Comparison",
x = "Value",
y = "Frequency",
fill = "Noise Type"
) +
theme_minimal() +
theme(
plot.title = element_text(size = 12, face = "bold"),
legend.position = "top"
)
# Box plot comparison
p8 <- ggplot(comparison_data, aes(x = type, y = value, fill = type)) +
geom_boxplot(alpha = 0.6, outlier.alpha = 0.5) +
scale_fill_manual(values = c("Gaussian" = "#2E86AB", "NIG" = "#A23B72")) +
labs(
title = "Distribution Comparison (Box Plots)",
x = "Noise Type",
y = "Value"
) +
theme_minimal() +
theme(
plot.title = element_text(size = 12, face = "bold"),
legend.position = "none"
)
grid.arrange(p7, p8, ncol = 2)
The comparison reveals how the NIG distribution captures different tail
behavior and potential asymmetry compared to the Gaussian case.
Model estimation in ngme2
involves fitting AR(1) models
to observed data using advanced optimization techniques. We'll
demonstrate this process using our simulated NIG data combined with
fixed effects and measurement noise.
Let us create a realistic dataset with fixed effects, latent AR(1) structure, and measurement noise:
# Define fixed effects and covariates
feff <- c(-1, 2, 0.5)
x1 <- runif(n_obs)
x2 <- rexp(n_obs)
x3 <- rnorm(n_obs, mean = 1, sd = 0.5)
X <- model.matrix(~ 0 + x1 + x2 + x3)
# Create response variable with fixed effects, latent process, and measurement noise
Y <- as.numeric(X %*% feff) + W_nig + rnorm(n_obs, sd = 2)
# Prepare data frame
model_data <- data.frame(x1 = x1, x2 = x2, x3 = x3, Y = Y)
We specify our model using the ngme
function,
incorporating both fixed effects and the latent AR(1) process. The
ngme()
function follows Rโs standard modeling approach by
accepting a formula object that defines the relationship between the
response variable and both fixed and latent effects:
# Fit the model
ngme_out <- ngme(
Y ~ 0 + x1 + x2 + x3 + f(
1:n_obs,
name = "my_ar",
model = "ar1",
noise = noise_nig(),
control = control_f(numer_grad = TRUE, improve_hessian = FALSE)
),
data = model_data,
control_opt = control_opt(
optimizer = adam(),
burnin = 100,
iterations = 1000,
std_lim = 0.01,
n_parallel_chain = 4,
stop_points = 10,
verbose = FALSE,
print_check_info = FALSE,
seed = 3
)
)
The formula Y ~ 0 + x1 + x2 + x3 + f(...)
specifies that
the response variable Y
depends on three fixed effects
(x1
, x2
, x3
) with no intercept
(the 0
term), plus a latent AR(1) process defined through
the f()
function. The latent model is specified with
f(1:n_obs, name = "my_ar", model = "ar1", noise = noise_nig())
,
where we assign a name to the process for later reference and specify
NIG innovations without fixing the parameters (allowing them to be
estimated).
The control_opt()
function manages the optimization
process through several key parameters:
optimizer = adam(): Uses the Adam optimizer, an adaptive gradient-based algorithm well-suited for complex likelihood surfaces;
iterations = 1000: Sets the maximum number of optimization iterations;
burnin = 100: Specifies initial iterations for algorithm stabilization before convergence assessment;
n_parallel_chain = 4: Runs 4 parallel optimization chains to improve convergence reliability and parameter estimation robustness;
std_lim = 0.01: Sets the convergence criterion based on parameter standard deviation across chains;
stop_points = 10: Number of intermediate convergence checks during optimization;
seed = 3: Ensures reproducible results across runs.
This optimization setup balances computational efficiency with estimation reliability, particularly important for complex models with non-Gaussian latent processes. Let us examine the fitted model results:
# Display model summary
ngme_out
#> *** Ngme object ***
#>
#> Fixed effects:
#> x1 x2 x3
#> -0.928 2.043 0.519
#>
#> Models:
#> $my_ar
#> Model type: AR(1)
#> rho = 0.483
#> Noise type: NIG
#> Noise parameters:
#> mu = -2.87
#> sigma = 3.89
#> nu = 0.407
#>
#> Measurement noise:
#> Noise type: NORMAL
#> Noise parameters:
#> sigma = 2.03
Assessing convergence is crucial for reliable parameter estimates. We
will use the built-in traceplot
function from
ngme2
to examine the convergence of our parameters:
# Trace plot for sigma and fixed effects (blue lines indicate true value)
traceplot(ngme_out, hline = c(2,-1, 2, 0.5), moving_window = 20)
#> Last estimates:
#> $sigma
#> [1] 2.016504
#>
#> $`fixed effect 1`
#> [1] -0.9287274
#>
#> $`fixed effect 2`
#> [1] 2.042297
#>
#> $`fixed effect 3`
#> [1] 0.5200857
Now let us examine the convergence of the AR(1) process parameters:
# Trace plots for AR(1) parameters (blue lines indicate true value)
traceplot(ngme_out, "my_ar", hline = c(0.5, -3, 4, 0.4))
#> Last estimates:
#> $rho
#> [1] 0.4833267
#>
#> $mu
#> [1] -2.871614
#>
#> $sigma
#> [1] 3.893367
#>
#> $nu
#> [1] 0.4076381
Let us examine the final parameter estimates and compare them with
the true values using the ngme_result()
function:
# Parameter Estimates for Fixed Effects and Measurement Noise
params_data <- ngme_result(ngme_out, "data")
# Fixed Effects
print(data.frame(
Truth = c(-1, 2, 0.5),
Estimates = as.numeric(params_data$fixed_effects),
row.names = names(params_data$fixed_effects)
))
#> Truth Estimates
#> x1 -1.0 -0.9276886
#> x2 2.0 2.0425700
#> x3 0.5 0.5188674
# Measurement Noise
print(data.frame(Truth = 2.0, Estimate = params_data$sigma, row.names = "sigma"))
#> Truth Estimate
#> sigma 2 2.027015
cat("\n")
# AR(1) Process Parameters
params_ar1 <- ngme_result(ngme_out, "my_ar")
print(data.frame(
Truth = c(rho = 0.5, mu = -3.0, sigma = 4.0, nu = 0.4),
Estimates = unlist(params_ar1)
))
#> Truth Estimates
#> rho 0.5 0.4832092
#> mu -3.0 -2.8699542
#> sigma 4.0 3.8940500
#> nu 0.4 0.4069202
Finally, let us validate our model by plotting the estimated noise distribution against the true noise distribution:
# Distribution Validation: Fitted vs. True NIG Noise Model
with(params_ar1, {
plot(
noise_nig(mu = mu, sigma = sigma, nu = nu),
noise_nig(mu = -3, sigma = 4, nu = 0.4)
)
})
The estimation results demonstrate the effectiveness of the
ngme2
framework for fitting complex AR(1) models with
non-Gaussian innovations. Key findings include: the fixed-effect estimates (-0.93, 2.04, 0.52) are close to the true values (-1, 2, 0.5); the AR(1) parameters (ρ = 0.48, μ = -2.87, σ = 3.89, ν = 0.41) closely match the values used in the simulation (0.5, -3, 4, 0.4); and the measurement noise level (σ ≈ 2.03) is recovered accurately.
This validation confirms that the estimation procedure successfully recovered the underlying model parameters, providing confidence in the methodology for real-world applications where the true parameters are unknown.