# Estimation

## Procedure

The goal of the estimation step is to sample from the posterior distribution of the model parameters. DSGE.jl provides two estimation routines. The default is a Metropolis-Hastings sampler to do this, which requires as a proposal covariance matrix and the Hessian matrix corresponding to the posterior mode. The second routine is a Sequential Monte Carlo sampler, which is called from the SMC.jl package. Both routines implement adaptive proposal densities and parameter blocking.^{[1]}

The function `estimate`

implements the entire procedure for either routine. Below, we explain the MH algorithm. For documentation of the SMC algorithm, see here.

**Main Steps**:

*Initialization*: Read in and transform raw data from`save/input_data/`

. See Input Data for more details.*Reoptimize parameter vector*: The main program will call the`csminwel`

optimization routine (located in`csminwel.jl`

) to find modal parameter estimates.*Compute Hessian matrix*: Computing the Hessian matrix to scale the proposal distribution in the Metropolis-Hastings algorithm.*Sample from Posterior*: Posterior sampling is performed using the Metropolis-Hastings algorithm. A proposal distribution is constructed centered at the posterior mode and with proposal covariance scaled by the inverse of the Hessian matrix. Settings for the number of sampling blocks and the size of those blocks can be altered as described in Editing or Extending a Model.

**Remark**: For the MH sampler, in addition to saving each `mh_thin`

-th draw of the parameter vector, the estimation program also saves the resulting posterior value and transition equation matrices implied by each draw of the parameter vector. This is to save time in the forecasting step since that code can avoid recomputing those matrices.

For the SMC sampler, we save a `jld2`

file containing a `Cloud`

object which holds any relevant information about the particles approximating the posterior, the normalized weights of the particles, and the unnormalized weights of particles. We also save a draw of parameters to an `h5`

file.

To run the entire procedure, the user simply calls the `estimate`

routine:

`DSGE.estimate`

— Function`estimate(m, data; verbose=:low, proposal_covariance=Matrix())`

Estimate the DSGE parameter posterior distribution.

**Arguments:**

`m::AbstractDSGEModel`

or`m::AbstractVARModel`

: model object

**Estimation Settings**

Please see the section on 'Estimation Settings' on the 'Advanced Usage' page of the online documentation or src/defaults.jl for a full description of all the estimation settings for both the Metropolis-Hastings (MH) and Sequential Monte Carlo (SMC) algorithms. For the latter, also see `?DSGE.smc2`

. Most of the optional and keyword arguments described below are not directly related to the behavior of the sampling algorithms (e.g. tuning, number of samples).

**Optional Arguments:**

`data`

: well-formed data as`Matrix`

or`DataFrame`

. If this is not provided, the`load_data`

routine will be executed.

**Keyword Arguments:**

`verbose::Symbol`

: The desired frequency of function progress messages printed to standard out.`:none`

: No status updates will be reported.`:low`

(default): Status updates will be provided in csminwel and at each block in Metropolis-Hastings. For SMC's verbosity settings, see`?smc2`

.`:high`

: Status updates provided at each iteration in Metropolis-Hastings.

`proposal_covariance::Matrix = []`

: Used to test the metropolis_hastings algorithm with a precomputed covariance matrix for the proposal distribution. When the Hessian is singular, eigenvectors corresponding to zero eigenvectors are not well defined, so eigenvalue decomposition can cause problems. Passing a precomputed matrix allows us to ensure that the rest of the routine has not broken.`method::Symbol`

: The method to use when sampling from the posterior distribution. Can be either`:MH`

for standard Metropolis Hastings Markov Chain Monte Carlo, or`:SMC`

for Sequential Monte Carlo. This should be specified by the setting`sampling_method`

in`m`

.`mle = false`

: Set to true if parameters should be estimated by maximum likelihood directly. If this is set to true, this function will return after estimating parameters.`sampling = true`

: Set to false to disable sampling from the posterior.`old_data::Matrix{Float64}`

= []: A matrix containing the time series of observables of previous data (with`data`

being the new data) for the purposes of a time tempered estimation (that is, using the posterior draws from a previous estimation as the initial set of draws for an estimation with new data). Running a bridge estimation requires both`old_data`

and`old_cloud`

.`old_cloud::Union{DSGE.ParticleCloud, DSGE.Cloud, SMC.Cloud} = DSGE.Cloud(m, 0)`

: old Cloud object used to describe particles from a previous estimation with old data. Running a bridge estimation requires both`old_data`

and`old_cloud`

. If running a bridge estimation and no`old_cloud`

is provided, then it will be loaded using the filepaths in`m`

. If no`old_cloud`

exists, then the bridge estimation will not run.`old_model::Union{AbstractDSGEModel, AbstractVARModel} = m`

: model object from which we can build the old log-likelihood function for a time tempered SMC estimation. It should be possible to evaluate the old log-likelihood given`old_data`

and the current draw of parameters. This may be nontrivial if, for example,*new*parameters have been added to`m`

since the old estimation. In this case,`old_model`

should include the*new*parameters but still return the old log-likelihood as the original estimation if given the same*old*parameters. By default, we assume the log-likelihood function has not changed and therefore coincides with the current one.`filestring_addl::Vector{String} = []`

: Additional strings to add to the file name of estimation output as a way to distinguish output from each other.`continue_intermediate::Bool = false`

: set to true if the estimation is starting from an intermediate stage that has been previously saved.`intermediate_stage_start::Int = 0`

: number of the stage from which the user wants to continue the estimation (see`continue_intermediate`

)`save_intermediate::Bool = true`

: set to true to save intermediate stages when using SMC`intermediate_stage_increment::Int = 10`

: number of stages that must pass before saving another intermediate stage.`run_csminwel::Bool = true`

: by default, csminwel is run after a SMC estimation finishes to recover the true mode of the posterior. Set to false to avoid this step (csminwel can take hours for medium-scale DSGE models).`toggle::Bool = true`

: when regime-switching, several functions assume regimes are toggled to regime 1. If the likelihood function is not written to toggle to regime 1 when done, then regime-switching estimation will not work properly. Set to`false`

to reduce computation time if the user is certain that the likelihood is written properly.`log_prob_old_data::Float64 = 0.0`

:Log p( ilde y) which is the log marginal data density of the bridge estimation.

## Metropolis-Hastings Sampler

### Computing the Posterior

In DSGE.jl, the function `posterior`

computes the value of the posterior distribution at a given parameter vector. It calls the `likelihood`

function, which in turn calls the `filter`

routine. See Estimation routines for more details on these functions.

We implement the Kalman Filter via the `filter`

function to compute the log-likelihood, and add this to the log prior to obtain the log posterior. See StateSpaceRoutines.jl for a model-independent implementation of the Kalman filter.

### Optimizing or Reoptimizing

Generally, the user will want to reoptimize the parameter vector (and consequently, calculate the Hessian at this new mode) every time they conduct posterior sampling; that is, when:

- the input data are updated with a new quarter of observations or revised
- the model sub-specification is changed
- the model is derived from an existing model with different equilibrium conditions or measurement equation.

This behavior can be controlled more finely.

#### Reoptimize from a Specified Starting Vector

Reoptimize the model starting from the parameter values supplied in a specified file. Ensure that you supply an HDF5 file with a variable named `params`

that is the correct dimension and data type.

```
m = Model990()
params = load_parameters_from_file(m, "path/to/parameter/file.h5")
update!(m, params)
estimate(m)
```

#### Skip Reoptimization Entirely

You can provide a modal parameter vector and optionally a Hessian matrix calculated at that mode to skip the reoptimization entirely. These values are usually computed by the user previously.

You can skip reoptimization of the parameter vector entirely.

```
m = Model990()
specify_mode!(m, "path/to/parameter/mode/file.h5")
estimate(m)
```

The `specify_mode!`

function will update the parameter vector to the mode and skip reoptimization by setting the `reoptimize`

model setting. Ensure that you supply an HDF5 file with a variable named `params`

that is the correct dimension and data type. (See also the utility function `load_parameters_from_file`

.)

#### Random Walk Metropolis-Hastings

For relatively simple problems, a random walk MH sampler is sufficient and avoids unnecessary computations like calculating the Hessian. Suppose we want to estimate all the parameters of a DSGE model `m`

. The following code implements RWMH for `m`

.

```
m = Model990()
m <= Setting(:hessian_path, "path/to//matrix/with/right/dimensions/saved/as/mh_hessian.h5")
estimate(m; proposal_covariance = Matrix{Float64}(I,size(m.parameters)))
```

The saved Hessian needs to have the same dimensions as the number of parameters. The simplest option is save an identity matrix. The proposal covariance also needs to have the same dimensions as the number of parameters. If the user does not want to estimate every parameter (i.e. some parameters are fixed), then the user needs to zero out the rows of the proposal covariance that correspond to fixed parameters.

### Calculating the Hessian

By default, `estimate`

will recompute the Hessian matrix. You can skip calculation of the Hessian matrix entirely if you provide a file with a Hessian that has been pre-computed.

```
m = Model990()
specify_mode!(m, "path/to/parameter/mode/file.h5")
specify_hessian(m, "path/to/Hessian/matrix/file.h5")
estimate(m)
```

The `specify_hessian`

function will cause `estimate`

to read in the Hessian matrix rather than calculating it directly. Ensure that you supply an HDF5 file with a variable named `hessian`

that is the correct dimension and data type. Specifying the Hessian matrix but *not* the parameter mode results in undefined behavior.

See [Hessian Approximation] for more details on the Hessian computation.

### Estimation routines

#### Prior, Likelihood and Posterior calculations

`ModelConstructors.prior`

— Function`prior(m::AbstractDSGEModel{T})`

Calculates log joint prior density of m.parameters.

`DSGE.likelihood`

— Function```
likelihood(m::AbstractDSGEModel, data::Matrix{T};
sampler::Bool = false, catch_errors::Bool = false) where {T<:AbstractFloat}
```

Evaluate the DSGE log-likelihood function. Can handle two-part estimation where the observed sample contains both a normal stretch of time (in which interest rates are positive) and a stretch of time in which interest rates reach the zero lower bound. If there is a zero-lower-bound period, then we filter over the 2 periods separately. Otherwise, we filter over the main sample all at once.

**Arguments**

`m`

: The model object`data`

: matrix of data for observables

**Optional Arguments**

`sampler`

: Whether metropolis_hastings or smc is the caller. If`sampler=true`

, the transition matrices for the zero-lower-bound period are returned in a dictionary.`catch_errors`

: If`sampler = true`

,`GensysErrors`

should always be caught.

```
likelihood(m::AbstractVARModel, data::Matrix{T};
sampler::Bool = false, catch_errors::Bool = false,
verbose::Symbol = :high) where {T<:AbstractFloat}
```

Evaluate a VAR likelihood function.

**Arguments**

`m`

: The model object`data`

: matrix of data for observables

**Optional Arguments**

`sampler`

: Whether metropolis_hastings or smc is the caller. If`sampler=true`

, the transition matrices for the zero-lower-bound period are returned in a dictionary.`catch_errors`

: If`sampler = true`

,`GensysErrors`

should always be caught.

`ModelConstructors.posterior`

— Function```
posterior(m::Union{AbstractDSGEModel{T},AbstractVARModel{T}}, data::Matrix{T};
sampler::Bool = false, catch_errors::Bool = false,
φ_smc::Float64 = 1) where {T<:AbstractFloat}
```

Calculates and returns the log of the posterior distribution for `m.parameters`

:

```
log posterior = log likelihood + log prior + const
log Pr(Θ|data) = log Pr(data|Θ) + log Pr(Θ) + const
```

**Arguments**

`m`

: the model object`data`

: matrix of data for observables

**Optional Arguments**

-`sampler`

: Whether metropolis*hastings or smc is the caller. If sampler=true, the log likelihood and the transition matrices for the zero-lower-bound period are also returned. -`catch*errors

`: Whether to catch errors of type`

GensysError`or`

ParamBoundsError``φ_smc`

: a tempering factor to change the relative weighting of the prior and the likelihood when calculating the posterior. It is used primarily in SMC.

`ModelConstructors.posterior!`

— Function```
posterior!(m::Union{AbstractDSGEModel{T},AbstractVARModel{T}},
parameters::Vector{T}, data::Matrix{T};
sampler::Bool = false, catch_errors::Bool = false,
ϕ_smc::Float64 = 1., toggle::Bool = true) where {T<:AbstractFloat}
```

Evaluates the log posterior density at `parameters`

.

**Arguments**

`m`

: The model object`parameters`

: New values for the model parameters`data`

: Matrix of input data for observables

**Optional Arguments**

`sampler`

: Whether metropolis_hastings or smc is the caller. If`sampler=true`

, the log likelihood and the transition matrices for the zero-lower-bound period are also returned.`catch_errors`

: Whether to catch errors of type`GensysError`

or`ParamBoundsError`

If`sampler = true`

, both should always be caught.`ϕ_smc`

: a tempering factor to change the relative weighting of the prior and the likelihood when calculating the posterior. It is used primarily in SMC.`toggle`

: if true, we call`ModelConstructors.toggle_regime!(values)`

before updating any values to ensure the`value`

field of the parameters in`values`

correspond to regime 1 values.

#### Optimization

See Optimization

#### Full Estimation Routine

See `estimate`

#### Output Analysis

`DSGE.compute_Eλ`

— Method```
compute_Eλ(m, h, λvec, θmat = [], weights = [];
current_period = true, parallel = true) where T<:AbstractFloat
```

Computes and samples from the conditional density p(λ*t|θ, I*t, P) for particle in `θs`

, which represents the posterior distribution.

**Inputs**

`m::PoolModel{T}`

:`PoolModel`

object`h::Int64`

: forecast horizon`λvec::Vector{T}`

: vector of particles of λ samples from (θ,λ) joint distribution- `θmat::Matrix{T}': matrix of posterior parameter samples
`weights::Vector{T}`

: weights of λ particles, defaults to equal weights

**Keyword Argument**

`current_period::Bool`

: compute Eλ for current period t`parallel::Bool`

: use parallel computing to compute and sample λ`get_dpp_pred_dens::Bool`

: compute predictive densities according to dynamic prediction pools

**Outputs**

`λhat_tplush::Float64`

: E[λ*{t+h|t} | I*t^P, P]`λhat_t::Float64`

: E[λ*{t|t} | I*t^P, P]

```

`DSGE.find_density_bands`

— Method`find_density_bands(draws::Matrix, percents::Vector{T}; minimize::Bool=true) where T<:AbstractFloat`

Returns a `2`

x `cols(draws)`

matrix `bands`

such that `percent`

of the mass of `draws[:,i]`

is above `bands[1,i]`

and below `bands[2,i]`

.

**Arguments**

`draws`

: Matrix of parameter draws (from Metropolis-Hastings, for example)`percent`

: percent of data within bands (e.g. .9 to get 90% of mass within bands)

**Optional Arguments**

`minimize`

: if`true`

, choose shortest interval, otherwise just chop off lowest and highest (percent/2)

`DSGE.find_density_bands`

— Method`find_density_bands(draws::Matrix, percent::AbstractFloat; minimize::Bool=true)`

Returns a `2`

x `cols(draws)`

matrix `bands`

such that `percent`

of the mass of `draws[:,i]`

is above `bands[1,i]`

and below `bands[2,i]`

.

**Arguments**

`draws`

:`ndraws`

by`nperiods`

matrix of parameter draws (from Metropolis-Hastings, for example)`percent`

: percent of data within bands (e.g. .9 to get 90% of mass within bands)

**Optional Arguments**

`minimize`

: if`true`

, choose shortest interval, otherwise just chop off lowest and highest (percent/2)

`DSGE.load_posterior_moments`

— Method`function load_posterior_moments(m; load_bands = true, include_fixed = false)`

Load posterior moments (mean, std) of parameters for a particular sample, and optionally also load 5% and 95% lower and upper bands.

**Keyword Arguments**

`cloud::ParticleCloud`

: Optionally pass in a cloud that you want to load the sample from. If the cloud is non-empty then the model object will only be used to find fixed indices and parameter tex labels`load_bands::Bool`

: Optionally include the 5% and 95% percentiles for the sample of parameters in the returned df`include_fixed::Bool`

: Optionally include the fixed parameters in the returned df`excl_list::Vector{Symbol}`

: List parameters by their key that you want to exclude from

loading

**Outputs**

`df`

: A dataframe containing the aforementioned moments/bands

`DSGE.moment_tables`

— Method```
moment_tables(m; percent = 0.90, subset_inds = 1:0, subset_string = "",
groupings = Dict{String, Vector{Parameter}}(), use_mode = false,
tables = [:prior_posterior_means, :moments, :prior, :posterior],
caption = true, outdir = "", verbose = :none)
```

Computes prior and posterior parameter moments. Tabulates prior mean, posterior mean, and bands in various LaTeX tables. These tables will be saved in `outdir`

if it is nonempty, or else in `tablespath(m, "estimate")`

.

**Inputs**

`m::AbstractDSGEModel`

: model object

**Keyword Arguments**

`percent::AbstractFloat`

: the percentage of the mass of draws from Metropolis-Hastings included between the bands displayed in output tables.`subset_inds::AbstractRange{Int64}`

: indices specifying the draws we want to use`subset_string::String`

: short string identifying the subset to be appended to the output filenames. If`subset_inds`

is nonempty but`subset_string`

is empty, an error is thrown`groupings::Dict{String, Vector{Parameter}}`

: see`?parameter_groupings`

`use_mode::Bool`

: use the modal parameters instead of the mean in the prior*posterior*means table`tables::Vector{Symbol}`

: which tables to produce`caption::Bool`

: whether to include table captions`outdir::String`

: where to save output tables`verbose::Symbol`

: desired frequency of function progress messages printed to standard out. One of`:none`

,`:low`

, or`:high`

`DSGE.sample_λ`

— Method```
sample_λ(m, pred_dens, θs, T = -1; parallel = true) where S<:AbstractFloat
sample_λ(m, pred_dens, T = -1; parallel = true) where S<:AbstractFloat
```

Computes and samples from the conditional density p(λ*t|θ, I*t, P) for particle in `θs`

, which represents the posterior distribution. The sampled λ particles represent the posterior distribution p(λ*{t|t} | I*t, P).

If no posterior distribution is passed in, then the function computes the distribution of λ_{t|t} for a static pool.

**Inputs**

`m::PoolModel{S}`

:`PoolModel`

object`pred_dens::Matrix{S}`

: matrix of predictive densities`θs::Matrix{S}`

: matrix of particles representing posterior distribution of θ`T::Int64`

: final period for tempered particle filter

where `S<:AbstractFloat`

.

**Keyword Argument**

`parallel::Bool`

: use parallel computing to compute and sample draws of λ

**Outputs**

`λ_sample::Vector{Float64}`

: sample of draws of λs; together with (θ,λ) represents a joint density

```

`DSGE.moments`

— Method`moments(θ::Parameter)`

If θ's prior is a `RootInverseGamma`

, τ and ν. Otherwise, returns the mean and standard deviation of the prior. If θ is fixed, returns `(θ.value, 0.0)`

.

`DSGE.posterior_table`

— Method```
posterior_table(m, post_means, post_bands; percent = 0.9, subset_string = "",
groupings = Dict{String, Vector{Parameter}}(), caption = true, outdir = "")
```

`DSGE.prior_posterior_moments_table`

— Method```
prior_posterior_moments_table(m, post_means, post_bands; percent = 0.9,
subset_string = "", groupings = Dict{String, Vector{Parameter}}(),
caption = true, outdir = "")
```

Produces a table of prior means, prior standard deviations, posterior means, and 90% bands for posterior draws.

`DSGE.prior_posterior_table`

— Method```
prior_posterior_table(m, post_values; subset_string = "",
groupings = Dict{String, Vector{Parameter}}(), use_mode = false,
caption = true, outdir = "")
```

Produce a table of prior means and posterior means or mode.

`DSGE.prior_table`

— Method```
prior_table(m; subset_string = "", groupings = Dict{String, Vector{Parameter}}(),
caption = true, outdir = "")
```

`DSGE.propagate_λ`

— Method`propgate_λ(λvec, h, m, θvec) where T<:AbstractFloat`

Propagates a λ particle h periods forward.

**Inputs**

`λ::T`

: λ sample from (θ,λ) joint distribution`h::Int64`

: forecast horizon`m::PoolModel`

: PoolModel object`θvec::Vector{T}`

: optional vector of parameters to update PoolModel

```

## SMC Sampler

See here for the settings adjusting the SMC algorithm. To use these with a DSGE model object, either add them after the definition of a model object as a `Setting`

or add them directly in the definition of the model type. For example, the following code sets the sampler as SMC and the number of particles used by SMC as 10,000:

```
m = Model990()
m <= Setting(:sampling_method, :SMC)
m <= Setting(:n_particles, 10000)
```