Overview

This vignette introduces the SGLD workflow in ngme2:

  1. fit a model with optimizer = sgld(...) and trajectory storage;
  2. discard early iterations (burnin_iter);
  3. thin the remaining samples (thinning);
  4. extract posterior-like parameter samples as a data frame via ngme_sgld_samples();
  5. compute quantile credible intervals via ngme_sgld_ci().
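
Steps 2, 3, and 5 can be illustrated in base R without fitting any model. The sketch below is a minimal illustration on a mock trajectory: `traj` is simulated, not an ngme2 trajectory, and the exact indexing that ngme_sgld_samples() uses to select retained iterations may differ from the `seq()` call assumed here.

```r
set.seed(123)
n_iter <- 4000

# Mock stored trajectory for one parameter (an assumption for illustration):
# it drifts toward 0.6 during the early iterations, then fluctuates around it.
traj <- 0.6 + 0.4 * exp(-seq_len(n_iter) / 200) + rnorm(n_iter, sd = 0.05)

burnin_iter <- 1000  # step 2: discard early iterations
thinning <- 10       # step 3: keep every 10th remaining iteration

keep <- seq(from = burnin_iter + 1, to = n_iter, by = thinning)
draws <- traj[keep]

# Step 5: equal-tailed 95% interval from empirical quantiles
ci_mock <- quantile(draws, probs = c(0.025, 0.975))
ci_mock
length(draws)
```

Discarding the early iterations removes the transient drift, so the interval reflects only the part of the trajectory that fluctuates around its long-run level.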

SGLD update used in ngme2

SGLD extends SGD with Gaussian noise:

$$\theta_{t+1} = \theta_t - \eta_t \nabla U(\theta_t) + \sqrt{2 T \eta_t}\,\xi_t, \quad \xi_t \sim \mathcal{N}(0, I).$$

  • stepsize in sgld(stepsize = ...) is the base stepsize $\eta_0$.
  • temperature in sgld(temperature = ...) is $T$.
  • stepsize_control determines how $\eta_t$ decays.
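
To make the update concrete, here is a minimal base-R sketch of SGLD on a toy potential U(theta) = theta^2 / 2, whose target at temperature 1 is a standard normal. It does not call ngme2. The polynomial schedule `eta0 * (1 + t / t0)^(-alpha)` is an assumed form chosen for illustration and is not necessarily what poly_decay() computes; the gradient here is exact rather than stochastic.

```r
set.seed(1)

grad_U <- function(theta) theta  # gradient of U(theta) = theta^2 / 2

eta0 <- 0.1    # base stepsize
temp <- 1      # temperature T
alpha <- 0.6   # decay exponent (assumed schedule, see below)
t0 <- 10       # decay offset (assumed schedule, see below)
n_iter <- 5000

theta <- numeric(n_iter + 1)
theta[1] <- 3  # deliberately poor starting value

for (t in seq_len(n_iter)) {
  eta_t <- eta0 * (1 + t / t0)^(-alpha)  # assumed polynomial decay
  xi <- rnorm(1)                         # standard normal noise
  theta[t + 1] <- theta[t] - eta_t * grad_U(theta[t]) +
    sqrt(2 * temp * eta_t) * xi
}

# Drop the starting value and the first 1000 iterations before summarising
draws <- theta[-seq_len(1001)]
c(mean = mean(draws), sd = sd(draws))
```

Because the stepsize shrinks over time, later iterates move more slowly and are more strongly autocorrelated, which is one reason burn-in and thinning matter in the workflow.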

The asymptotic theory for SGLD assumes a diminishing stepsize, so prefer a decaying schedule (for example poly_decay(alpha, t0)). If you keep a constant stepsize, control_opt() emits a warning.

# Load ngme2 first; otherwise control_opt() cannot be found
library(ngme2)

# Triggers a warning: constant stepsize without decay
try(control_opt(optimizer = sgld(stepsize = 0.01, temperature = 1)))

Basic configuration

library(ngme2)
#> This is ngme2 of version 0.9.1
#> - See our homepage: https://davidbolin.github.io/ngme2 for more details.
#> 
#> Attaching package: 'ngme2'
#> The following object is masked from 'package:stats':
#> 
#>     ar

opt <- sgld(stepsize = 0.01, temperature = 1)
sc <- poly_decay(alpha = 0.6, t0 = 10)

ctl <- control_opt(
  optimizer = opt,
  stepsize_control = sc,
  burnin = 100,
  iterations = 4000,
  n_batch = 20,
  n_parallel_chain = 2,
  store_traj = TRUE,
  # fixed-iteration SGLD runs are usually preferred for sampling workflows
  trend_std_conv_check = FALSE,
  R_hat_conv_check = FALSE,
  pflug_conv_check = FALSE
)

End-to-end example

library(ngme2)
set.seed(2026)

n <- 200
idx <- 1:n
y <- as.numeric(arima.sim(model = list(ar = 0.6), n = n)) + rnorm(n, sd = 0.4)
dat <- data.frame(y = y)

fit_sgld <- ngme(
  y ~ 0 + f(idx, name = "ar1_field", model = ar1(), noise = noise_normal()),
  data = dat,
  family = noise_normal(),
  control_opt = control_opt(
    optimizer = sgld(stepsize = 0.01, temperature = 1),
    stepsize_control = poly_decay(alpha = 0.6, t0 = 10),
    burnin = 100,
    iterations = 60000,
    n_batch = 20,
    n_parallel_chain = 2,
    store_traj = TRUE,
    trend_std_conv_check = FALSE,
    R_hat_conv_check = FALSE,
    pflug_conv_check = FALSE
  ),
  control_ngme = control_ngme(n_gibbs_samples = 3)
)

# Extract posterior-like samples for all parameters
samp <- ngme_sgld_samples(
  ngme = fit_sgld,
  name = "all",
  burnin_iter = 1000,
  thinning = 10,
  apply_transform = TRUE,
  combine_chains = TRUE
)

head(samp)

# Quantile CI directly from SGLD samples
ci <- ngme_sgld_ci(
  samples = samp,
  lower = 0.025,
  upper = 0.975
)

ci$ci

The returned data frame contains:

  • .chain: chain index
  • .draw: draw index within the chain after burn-in removal and thinning
  • .iter: original iteration index in optimizer trajectory
  • one column per parameter (transformed scale if apply_transform = TRUE)
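
Because the chain index is kept as a column, chains can be compared with ordinary data-frame tools. The sketch below uses a mock data frame with the layout described above; the parameter column `rho`, its values, and the `.iter` numbering are made up for illustration and do not come from a fitted model.

```r
set.seed(42)

# Mock samples in the documented layout (all values are simulated)
mock <- data.frame(
  .chain = rep(1:2, each = 500),
  .draw  = rep(1:500, times = 2),
  .iter  = rep(seq(1010, by = 10, length.out = 500), times = 2),
  rho    = c(rnorm(500, mean = 0.60, sd = 0.05),
             rnorm(500, mean = 0.62, sd = 0.05))
)

# Per-chain mean and standard deviation of one parameter
chain_means <- tapply(mock$rho, mock$.chain, mean)
chain_sds <- tapply(mock$rho, mock$.chain, sd)

# Gap between chain means, in units of the pooled standard deviation
mean_gap <- abs(diff(chain_means)) / sd(mock$rho)

chain_means
chain_sds
mean_gap
```

A large gap between chain means relative to the spread within chains suggests that the chains have not settled on the same region, in which case more iterations or a longer burn-in may be worth trying.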

Parameter block selection

name controls which parameter block to extract:

  • name = "all": all latent + measurement-noise + fixed-effect parameters
  • name = "general": measurement-noise + fixed effects only
  • name = "<latent_name>": one latent model block

Example:

# Measurement-noise + fixed effects only
s_general <- ngme_sgld_samples(
  fit_sgld,
  name = "general",
  burnin_iter = 1000,
  thinning = 10
)

# One latent block only
s_lat <- ngme_sgld_samples(
  fit_sgld,
  name = "ar1_field",
  burnin_iter = 1000,
  thinning = 10
)

One-step wrapper from existing fit

If you already have a fitted ngme object from another optimizer (for example adam), called fit_prev below, you can warm-start one SGLD stage from it and extract samples directly:

samp2 <- compute_ngme_sgld_samples(
  fit = fit_prev,
  iterations = 6000,
  optimizer = sgld(stepsize = 0.01, temperature = 1),
  burnin = 100,
  n_batch = 20,
  n_parallel_chain = 2,
  alpha = 0.6,
  t0 = 10,
  burnin_iter = 1000,
  thinning = 10,
  name = "all",
  apply_transform = TRUE
)

Quick posterior summaries

Once you have the sample data frame, posterior summaries are standard data-frame operations:

# Example: posterior mean and 95% credible interval for each parameter
param_cols <- setdiff(colnames(samp), c(".chain", ".draw", ".iter"))

summary_df <- do.call(rbind, lapply(param_cols, function(p) {
  x <- samp[[p]]
  data.frame(
    parameter = p,
    mean = mean(x),
    lower = as.numeric(stats::quantile(x, 0.025)),
    upper = as.numeric(stats::quantile(x, 0.975)),
    row.names = NULL
  )
}))

summary_df

You can also use the built-in helper directly:

ci <- ngme_sgld_ci(
  samples = s_general,
  lower = 0.025,
  upper = 0.975
)

ci$estimates
ci$ci
ci$covariance

Notes

  • SGLD samples are read from the stored optimizer trajectories in parameter space; suitable values of stepsize, alpha, t0, burnin_iter, and thinning are model-dependent and usually need tuning.
  • Use apply_transform = FALSE if you need raw (unconstrained) parameter-space samples for diagnostics.
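
One rough way to choose thinning is to look at the autocorrelation of a stored trajectory. The sketch below is a heuristic under stated assumptions, not an ngme2 feature: `chain` is a simulated AR(1) series standing in for one parameter's post-burn-in trajectory, and the 0.1 cutoff is an arbitrary illustrative threshold.

```r
set.seed(7)

# Stand-in for one parameter's trajectory: a strongly autocorrelated series
chain <- as.numeric(arima.sim(model = list(ar = 0.9), n = 5000))

# Sample autocorrelations at lags 0..100 (element 1 is lag 0)
rho <- as.numeric(acf(chain, lag.max = 100, plot = FALSE)$acf)

# Smallest lag at which the autocorrelation falls below the cutoff
below <- which(rho < 0.1)
suggested_thinning <- if (length(below) > 0) below[1] - 1 else NA_integer_
suggested_thinning
```

With real output, the same calculation could be applied to a parameter column extracted with thinning = 1, and repeated per parameter, since different parameters often mix at different speeds.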