Using SMC · SMC.jl

SMC Main Function

You can use SMC to estimate any Bayesian model. This requires (1) parameters and their associated prior distributions (2) data (3) a log-likelihood function. These three ingredients are the only inputs into the smc driver.

smc(loglikelihood::Function, parameters::ParameterVector, data::Matrix) function.

Let's look at an example. First, we'll make a model. We need the ModelConstructors package to do that.

using ModelConstructors
ols_model = GenericModel()

Model 
no. parameters:             0
description:
 A generic model: You can put parameters, states, shocks, observables, pseudo-observables, spec, subpsec, settings inside of me.

Next, assign our intercept and coefficient parameters to the model.

reg <= parameter(:α1, 0., (-1e5, 1e5), (-1e5, 1e5), Untransformed(), Normal(0, 10), fixed = false)
reg <= parameter(:β1, 0., (-1e5, 1e5), (-1e5, 1e5), Untransformed(), Normal(0, 10), fixed = false)

The first argument is the name of the parameter, the second is its initial value, the third and fourther are bounds on the parameter (they should always be the same). If the sampler draws a value outside of these bounds, it'll try again. The fifth argument is whether you would like to perform a transformation on the parameter when using it (ignore this for now–it's for more advanced users). The sixth argument is the prior, $N(0, 10)$, and last argument says the parameters aren't fixed parameters (i.e. we're trying to estimate them!)

We'll make some artifical data to test our model

X = rand(100) #hide
β = 1.
α = 1.
y = β*X + α

We need a log-likelihood function! Note that it's important you define a log-likelihood function rather than a likelihood function.

function likelihood_fnct(p, d)
    α = p[1]
    β = p[2]
    Σ = 1
    det_Σ = det(Σ)
    inv_Σ = inv(Σ)
    term1 = -N / 2 * log(2 * π) - 1 /2 * log(det_Σ)
    logprob = 0.
    errors = d[:, 1] .- α .- β .* d[:, 2]
    for t in 1:size(d,1)
        logprob += term1 - 1/2 * dot(errors, inv_Σ * errors)
    end
    return logprob
end

likelihood_fnct (generic function with 1 method)

And that's it! Now let's run SMC.

smc(likelihood_fnct, reg.parameters, data, n_parts = 10, use_fixed_schedule = false, tempering_target = 0.97)

SMC.smc — Method

function smc(loglikelihood::Function, parameters::ParameterVector{U}, data::Matrix{S};
             kwargs...) where {S<:AbstractFloat, U<:Number}

Arguments:

loglikelihood::Function: Log-likelihood function of model being estimated. Takes parameters and data as arguments.
parameters::ParameterVector{U}: Model parameter vector, which stores parameter values, prior dists, and bounds.
data: A matrix or dataframe containing the time series of the observables used in the calculation of the posterior/loglikelihood

Keyword Arguments:

verbose::Symbol: Desired frequency of function progress messages printed to standard out.

- `:none`: No status updates will be reported.
- `:low`: Status updates for SMC initialization and recursion will be included (default).
- `:high`: Status updates for every iteration of SMC is output, which includes
the mean and standard deviation of each parameter draw after each iteration,
as well as calculated acceptance rate, ESS, and number of times resampled.

parallel::Bool = false: Flag for running algorithm in parallel.
n_parts::Int = 5_000: Number of particles.
n_blocks::Int = 1: Number of parameter blocks in mutation step.
n_mh_steps::Int = 1: Number of Metropolis Hastings steps to attempt during the mutation step.
λ::S = 2.1: The 'bending coefficient' λ in Φ(n) = (n/N(Φ))^λ
n_Φ::Int = 300: Number of stages in the tempering schedule.
resampling_method::Symbol: Which resampling method to use.
- :systematic: Will use systematic resampling (default).
- :multinomial: Will use multinomial resampling.
- :polyalgo: Samples using a polyalgorithm.
threshold_ratio::S = 0.5: Threshold s.t. particles will be resampled when the population drops below threshold * N.
c::S = 0.5: Initial scaling factor for covariance of the particles. Controls size of steps in mutation step. This value will be adaptively set afterward to reach an accept rate of target (see kwarg below).
α::S = 1.0: The mixture proportion for the mutation step's proposal distribution. See ?mvnormal_mixture_draw for details. Note that a value of 0.9 has commonly been used in applications to DSGE models (see citations below).
target::S = 0.25: The target acceptance rate for new particles during mutation.
use_fixed_schedule::Bool = true: Flag for whether or not to use a fixed tempering (ϕ) schedule.
tempering_target::S = 0.97: Coefficient of the sample size metric to be targeted when solving for an endogenous ϕ.
old_data::Matrix{S} = []: A matrix containing the time series of observables of previous data (with data being the new data) for the purposes of a time tempered estimation (that is, using the posterior draws from a previous estimation as the initial set of draws for an estimation with new data). This matrix is used to compute the log-likelihood according to the old_loglikelihood function when particles change during tempering.
old_cloud::Cloud = Cloud(0, 0): associated cloud borne of old data in previous SMC estimation. Running a bridge estimation requires old_data and old_cloud. If no old_cloud is provided, then we will attempt to load one using loadpath.
old_loglikelihood::Function = loglikelihood: the old log-likelihood function when running a time tempered estimation. This function should be able to evaluate the log-likelihood of old_data given the current parameters, which may be nontrivial if the current parameters include additional parameters relative to the old estimation. By default, we assume the log-likelihood function has not changed and therefore coincides with the current one, so that the log-likelihood of old_data can be computed as loglikelihood(parameters, old_data)
old_vintage::String = "": String for vintage date of old data
smc_iteration::Int = 1: The iteration index for the number of times SMC has been run on the same data vintage. Primarily for numerical accuracy/testing purposes.
run_test::Bool = false: Flag for when testing accuracy of program
filestring_addl::Vector{String} = []: Additional file string extension for loading old cloud.
save_intermediate::Bool = false: Flag for whether one wants to save intermediate Cloud objects
intermediate_stage_increment::Int: Save Clouds at every increment (1 = each stage, 10 = every 10th stage, etc.). Useful if you are using a cluster with time limits because if you hit the time limit, then you can just start from an intermediate stage rather than start over.
continue_intermediate::Bool = false: Flag to indicate whether one is continuing SMC from an intermediate stage.
intermediate_stage_start::Int = 10: Intermediate stage at which one wishes to begin the estimation.
tempered_update_prior_weight::Float64 = 0.0: Weight placed on the current priors of parameters to construct a convex combination of draws from current priors and the previous estimation's cloud. The convex combination serves as the bridge distribution for a time tempered estimation.
run_csminwel::Bool = true: Flag to run the csminwel algorithm to identify the true posterior mode (which may not exist) after completing an estimation. The mode identified by SMC is just the particle with the highest posterior value, but we do not check it is actually a mode (i.e. the Hessian is negative definite).
regime_switching::Bool = false: Flag if there are regime-switching parameters. Otherwise, not all the values of the regimes will be used or saved.
toggle::Bool = true: Flag for resetting the fields of parameter values to regime 1 anytime the loglikelihood is computed. The regime-switching version of SMC assumes at various points that this resetting occurs. If speed is important, then ensure that the fields of parameters take their regime 1 values at the end of the loglikelihood computation and set toggle = false.
debug_assertion::Bool = false: if true, then when an assertion error is thrown during the estimation, output is created in a JLD2 file to help the user debug the problem.
log_prob_old_data::Float64 = 0.0: Log MDD of old data for correct incremental weights when bridging

Outputs

cloud: The Cloud object containing all of the information about the parameter values from the sample, their respective log-likelihoods, the ESS schedule, tempering schedule etc., which is saved in the saveroot.

Overview

Sequential Monte Carlo can be used in lieu of Random Walk Metropolis Hastings to generate parameter samples from high-dimensional parameter spaces using sequentially constructed proposal densities to be used in iterative importance sampling.

This implementation is based on Edward Herbst and Frank Schorfheide's 2014 paper 'Sequential Monte Carlo Sampling for DSGE Models' and the code accompanying their book 'Bayesian Estimation of DSGE Models'. Cai et al. (2021) 'Online Estimation of DSGE Models' extend the algorithm in Herbst and Schorfheide (2014) to include an adaptive schedule and generalized tempering, which are both features of this package.

SMC is broken up into three main steps:

Correction: Reweight the particles from stage n-1 by defining incremental weights, which gradually "temper in" the loglikelihood function $p(Y ert \theta)^(\phi_n - \phi_n-1)$ into the normalized particle weights.
Selection: Resample the particles if the distribution of particles begins to degenerate, according to a tolerance level for the ESS.
Mutation: Propagate particles ${\theta(i), W(n)}$ via a Metropolis Hastings algorithm (the number of steps are specified by n_mh_steps).

source