These are the configuration options for the ngme optimization process.

control_opt(
  seed = Sys.time(),
  burnin = 100,
  iterations = 500,
  estimation = TRUE,
  standardize_fixed = TRUE,
  stop_points = 10,
  iters_per_check = iterations/stop_points,
  optimizer = adam(),
  n_parallel_chain = 4,
  max_num_threads = n_parallel_chain,
  exchange_VW = TRUE,
  n_slope_check = 3,
  std_lim = 0.01,
  trend_lim = 0.01,
  print_check_info = FALSE,
  max_relative_step = 0.5,
  max_absolute_step = 0.5,
  converge_eps = 1e-05,
  rao_blackwellization = FALSE,
  n_trace_iter = 10,
  sampling_strategy = "all",
  verbose = FALSE
)
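
As a minimal sketch, the call below overrides a few defaults and leaves the rest untouched. It assumes the package that provides control_opt() is attached under the name ngme2; the values are illustrative only, not recommendations.

```r
library(ngme2)  # assumption: the package exporting control_opt()

# Illustrative settings: longer run, more frequent convergence checks.
ctrl <- control_opt(
  seed = 42,                # fixed seed for reproducibility
  burnin = 100,
  iterations = 1000,
  stop_points = 20,         # default iters_per_check becomes 1000 / 20 = 50
  n_parallel_chain = 4,     # multiple chains are needed for the convergence check
  std_lim = 0.01,
  trend_lim = 0.01,
  print_check_info = TRUE
)
```

The returned list of control variables is then passed on to the model-fitting call.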

Arguments

seed

seed for the pseudo-random number generator

burnin

number of iterations for the burn-in period (before optimization)

iterations

number of optimization iterations

estimation

whether to run the estimation process (calls the C++ backend)

standardize_fixed

whether to standardize the fixed effects

stop_points

number of stop points at which convergence is checked (alternatively, specify iters_per_check)

iters_per_check

number of iterations to run between consecutive check points (alternatively, specify stop_points)

optimizer

SGD optimization method to use; currently supported: "precond_sgd", "momentum", "adagrad", "rmsprop", "adam", "adamW". See ?precond_sgd, ?momentum, ?adagrad, ?rmsprop, ?adam, ?adamW.

n_parallel_chain

number of parallel chains

max_num_threads

maximum number of threads used for parallel computing; by default it is set equal to n_parallel_chain. If it is larger than n_parallel_chain, the remaining threads are used to parallelize over different replicates of the model.

exchange_VW

whether to exchange the last V and W in each chain

n_slope_check

number of stop points used in the linear regression of the convergence check

std_lim

maximum allowed standard deviation of the estimated parameters across chains

trend_lim

maximum allowed slope of the trend in the estimated parameters

print_check_info

whether to print the convergence information

max_relative_step

maximum relative step allowed in one iteration

max_absolute_step

maximum absolute step allowed in one iteration

converge_eps

convergence threshold; the test is grad.norm() < converge_eps

rao_blackwellization

whether to use Rao-Blackwellization

n_trace_iter

number of iterations used to approximate the trace (Hutchinson’s trick)

sampling_strategy

subsampling method for the replicates of the model, c("all", "is"). "all" means using all replicates in each iteration; "ws" means weighted sampling (each iteration uses one replicate to compute the gradient, and the probability of sampling a replicate is proportional to its number of observations)

verbose

whether to print estimation information
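
To illustrate what n_trace_iter controls, here is a minimal base-R sketch of Hutchinson's trace estimator. The function hutchinson_trace() is hypothetical and written for illustration only; it is not the routine used in the C++ backend, and the choice of Rademacher probe vectors is an assumption.

```r
# Hypothetical helper: estimate tr(A) by averaging z' A z over random probes.
hutchinson_trace <- function(A, n_trace_iter = 10) {
  n <- nrow(A)
  estimates <- vapply(seq_len(n_trace_iter), function(i) {
    z <- sample(c(-1, 1), n, replace = TRUE)  # Rademacher probe vector
    as.numeric(crossprod(z, A %*% z))         # z' A z, with expectation tr(A)
  }, numeric(1))
  mean(estimates)
}

set.seed(1)
A <- matrix(c(2, 1, 1, 3), nrow = 2)  # true trace is 5
est <- hutchinson_trace(A, n_trace_iter = 1000)
```

More probes reduce the variance of the estimate at the cost of more matrix-vector products, which is the trade-off n_trace_iter exposes.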
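
The weighted sampling described under sampling_strategy can be sketched in base R as follows. The observation counts are made up for illustration, and this is not the package's internal code.

```r
# Hypothetical numbers of observations in three replicates.
n_obs <- c(50, 30, 20)

# Sampling probability of each replicate, proportional to its size.
prob <- n_obs / sum(n_obs)

# Draw the single replicate used for the gradient in one iteration.
set.seed(1)
picked <- sample(seq_along(n_obs), size = 1, prob = prob)
```

Under this scheme larger replicates are visited more often, while each iteration only pays for one replicate's gradient.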

Value

list of control variables

Details

The convergence check requires multiple chains to be running. For each estimated parameter, the trend over the last n_slope_check stop points (the slope of a linear regression) is compared with trend_lim, and the standard deviation of the estimates across the different chains is compared with std_lim.
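
The check described above can be sketched in base R for a single parameter. The helper check_convergence() is hypothetical: averaging over chains before the regression and using absolute (unscaled) thresholds are assumptions, and the package may scale these quantities differently.

```r
# Hypothetical helper, not part of the package.
# `trace` has one row per stop point and one column per chain.
check_convergence <- function(trace, n_slope_check = 3,
                              std_lim = 0.01, trend_lim = 0.01) {
  stopifnot(nrow(trace) >= n_slope_check, ncol(trace) >= 2)
  # Keep only the last n_slope_check stop points.
  rows <- seq(nrow(trace) - n_slope_check + 1, nrow(trace))
  recent <- trace[rows, , drop = FALSE]
  # Trend: slope of a linear regression of the chain average on the index.
  idx <- seq_len(n_slope_check)
  slope <- unname(coef(lm(rowMeans(recent) ~ idx))[2])
  # Spread: standard deviation across chains at the latest stop point.
  spread <- sd(recent[n_slope_check, ])
  list(slope = slope, spread = spread,
       converged = abs(slope) < trend_lim && spread < std_lim)
}

# Four chains hovering around 1: flat trend, small spread.
flat <- matrix(1 + c(0.001, -0.001, 0.0005, -0.0005),
               nrow = 4, ncol = 4, byrow = TRUE)
# Four identical chains still increasing by 1 per stop point.
drifting <- matrix(rep(1:4, times = 4), nrow = 4)

res_flat <- check_convergence(flat)
res_drift <- check_convergence(drifting)
```

The first case passes both tests, while the second fails the trend test even though the chains agree with each other, which is why both criteria are needed.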