Input Data

Given all of the hard work put into specifying the model, one should be able to maintain the input data painlessly. To that end, DSGE.jl provides facilities to download appropriate vintages of data series from FRED (Federal Reserve Economic Data).

Note that a sample input dataset for use with model m990 is provided; see New York Fed Model 990 Data for more details. To update this sample dataset for use with model m990, see Update sample input data.

Setup

To take advantage of the ability to automatically download data series from FRED via the FredData.jl package, set up your FRED API access by following the directions here.

Loading data

At its most basic, loading data looks like this:

m = Model990()
df = load_data(m)

By default, load_data will look on the disk first to see if an appropriate vintage of data is already present. If data on disk are not present, or if the data are invalid for any reason, a fresh vintage will be downloaded from FRED and merged with the other data sources specified. See load_data for more details.

The resulting DataFrame df contains all the required data series for this model, fully transformed. The first row is given by the Setting date_presample_start and the last row is given by date_mainsample_end. The first n_presample_periods rows of df are the presample.

Driver functions including estimate accept this df as an argument and convert it into a Matrix suitable for computations using df_to_matrix, which sorts the data, ensures the full sample is present, discards the date column, and sorts the observable columns according to the observables field of the model object.

Non-FRED data sources

Some data series may not be available from FRED or one may simply wish to use a different data source, for whatever reason. The data sources and series are specified in the input_series field of an Observable object (see ModelConstructors.jl). For each data source that is not :fred, a well-formed CSV of the form <source>_<yymmdd>.csv is expected in the directory indicated by inpath(m, "raw"). For example, the following might be the contents of a data source for two series :series1 and :series2:

date,series1,series2
1959-06-30,1.0,NaN
1959-09-30,1.1,0.5
# etc.

Note that quarters are represented by the date of the last day of the quarter and missing values are specified by NaN.
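
As a minimal sketch of producing a file in this layout with only Julia's standard library: the helper name write_raw_source, the source name "mysource", and the temporary directory standing in for inpath(m, "raw") are all hypothetical and are not part of DSGE.jl.

```julia
using Dates

# Hypothetical helper: write `<source>_<yymmdd>.csv` into `dir`.
# `rows` is a vector of (quarter-end date, series1, series2) tuples.
function write_raw_source(dir, source, vintage::Date, rows)
    path = joinpath(dir, string(source, "_", Dates.format(vintage, "yymmdd"), ".csv"))
    open(path, "w") do io
        println(io, "date,series1,series2")
        for (date, s1, s2) in rows
            # Quarters are dated by their last day; fail early otherwise.
            date == lastdayofquarter(date) || error("not a quarter-end date: $date")
            println(io, join((date, s1, s2), ","))  # missing values stay as NaN
        end
    end
    return path
end

rows = [(Date(1959, 6, 30), 1.0, NaN), (Date(1959, 9, 30), 1.1, 0.5)]
path = write_raw_source(mktempdir(), "mysource", Date(2015, 11, 27), rows)
```

The check on quarter-end dates is a design choice of this sketch: it surfaces a mislabeled row when the file is written rather than when the data are merged.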

Example

Let's consider an example dataset made up of 10 macro series sourced from FRED and one survey-based series sourced from, say, the Philadelphia Fed's Survey of Professional Forecasters via Haver Analytics. The Observable for that data series might look like this:

Observable(:obs_longcpi, [:ASACX10__SPF], annualtoquarter, quartertoannual,
           "Median 10Y CPI Expectations", "Median 10Y CPI Expectations")

If the data vintage specified for the model is 151127 (Nov. 27, 2015), then the following files are expected in inpath(m, "raw"):

spf_151127.csv
fred_151127.csv

The FRED series will be downloaded and the fred_151127.csv file will be automatically generated, but the spf_151127.csv file must be manually compiled as shown above:

date,ASACX10
1991-12-31,4.0
# etc.

Now, suppose that we set the data vintage to 151222, to incorporate the BEA's third estimate of GDP. The fred_151222.csv file will be downloaded, but there are no updates to the SPF dataset during this period. Regardless, the file spf_151222.csv must be present to match the data vintage. The solution in this case is to manually copy and rename the older SPF dataset. Although this is not an elegant approach, it is consistent with the concept of a vintage as the data available at a certain point in time; in this example, it just so happens that the SPF data available on Nov. 27 and Dec. 22 are the same.
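
The copy-and-rename step can be scripted. The following is a sketch only: a temporary directory stands in for inpath(m, "raw"), and the one-row file is created here purely so that the example runs on its own.

```julia
# A temporary directory stands in for inpath(m, "raw") in this sketch.
raw = mktempdir()
old_file = joinpath(raw, "spf_151127.csv")
write(old_file, "date,ASACX10\n1991-12-31,4.0\n")

# Reuse the unchanged SPF data under the new vintage's file name.
new_file = joinpath(raw, "spf_151222.csv")
cp(old_file, new_file; force = false)  # errors rather than overwriting an existing file
```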

Incorporate population forecasts

Many variables enter the model in per-capita terms. To that end, we use data on population levels to convert aggregate variables into per-capita variables. Furthermore, we apply the Hodrick-Prescott filter ("H-P filter") to the population levels to smooth out cyclical components.

The user will ultimately want to produce forecasts of key variables such as GDP and then represent these forecasts in standard terms. That is, one wants to report GDP forecasts in aggregate terms, which is standard, rather than per-capita terms. To do this, we either extrapolate from the last periods of population growth in the data, or use external population forecasts.

Note that if external population forecasts are provided, non-forecast procedures, such as model estimation, are also affected because the H-P filter smooths back from the latest observation.

To incorporate population forecasts,

  1. Set the model setting use_population_forecast to true.
  2. Provide a file population_forecast_<yymmdd>.csv to inpath(m, "raw"). Population forecasts should be in levels, and represent the same series as given by the population_mnemonic setting (defaults to :CNP16OV, or "Civilian Noninstitutional Population, Thousands"). If your population forecast is in growth rates, convert it to levels yourself. The first row of data should correspond to the last period of the main sample, such that growth rates can be computed. As many additional rows of forecasts as desired can be provided.

The file should look like this:

date,POPULATION
2015-12-31,250000
2016-03-31,251000
# etc.
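
If a forecast is only available as growth rates, the conversion to levels might look like the sketch below. It assumes quarterly (not annualized) log growth rates; the helper name growth_to_levels and all numbers are illustrative, not data.

```julia
using Dates

# Convert quarterly log growth rates into levels, starting from the last
# observed level. The first output row is that observed level itself.
function growth_to_levels(last_level::Real, log_growth::AbstractVector{<:Real})
    return last_level .* exp.(cumsum(vcat(0.0, log_growth)))
end

last_date = Date(2015, 12, 31)   # last period of the main sample
levels    = growth_to_levels(250_000.0, [0.004, 0.004])
dates     = [lastdayofquarter(last_date + Month(3k)) for k in 0:length(levels)-1]

# Print rows in the `date,POPULATION` layout shown above.
for (d, x) in zip(dates, levels)
    println(d, ",", round(x; digits = 1))
end
```

Keeping the last observed level as the first row matches the requirement that the first row correspond to the last period of the main sample.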

Dataset creation implementation details

Let's quickly walk through the steps DSGE.jl takes to create a suitable dataset.

First, a user provides a detailed specification of the data series and transformations used for their model.

  • The user specifies m.observables; the keys of this dictionary name the series to be used in estimating the model.

  • The user specifies m.observable_mappings; the keys of this dictionary name observed variables, and the values are the corresponding Observable objects, each of which contains the forward and reverse transforms as well as the input data series from which the observable is constructed.

  • For a given observable, an input series, e.g. m.observable_mappings[:obs_gdp].input_series, is an array of mnemonics to be accessed from the data source listed after the mnemonic (separated by the double underscore). Note that these mnemonics do not correspond to observables one-to-one, but rather are usually series in levels that will be further transformed.

  • There are also both forward and reverse transforms for a given observable, e.g. m.observable_mappings[:obs_gdp].fwd_transform and m.observable_mappings[:obs_gdp].rev_transform. The forward transform operates on a single argument, levels, which is a DataFrame of the data in levels returned by the function load_data_levels. The reverse transform operates on a forward-transformed series (which is in model units), transforming it into human-readable units, such as one-quarter percent changes or per-capita adjustments. Both transforms return a DataArray for a single series. These functions could do nothing, or they could perform a more complex transformation. See Data Transforms and Utilities for more information about series-specific transformations.

  • The user adjusts data-related settings, such as data_vintage, data_id, dataroot, date_presample_start, date_zlb_start, date_forecast_start, and use_population_forecast. See Working with Settings for details.

Second, DSGE.jl attempts to construct the dataset given this setup through a call to load_data. See load_data for more details.

  • Intermediate data in levels are loaded. See load_data_levels for more details.
  • Transformations are applied to the data in levels. See transform_data for more details.
  • The data are saved to disk. See save_data for more details.

Conditional data

The user can easily add conditional data for any observables. By "conditional data", we mean that, in reality, some data has not become available yet, but we believe that a certain number is a decent guess, so we want to forecast conditional on our guessed data. For example, suppose we are in 2019:Q4, in which case we have not observed 2019:Q4 GDP growth yet. However, we might have some idea of the number, so we want our forecasts to be conditional on that guess.

To load such data, the user needs to include a "cond" folder within the input data folder, i.e. the folder joinpath(get_setting(m, :input_data), "cond") should exist. Within this folder, the user can create a CSV file named cond_cdid=<xx>_cdvt=<yymmdd>.csv. The user should then make sure that the model object being used has the following settings:

  • cond_id::Int64: the conditional data's equivalent of data_id; its value is inserted after cdid= in the file name. Note that the ID must be less than 100.

  • cond_vintage::String: the conditional data's equivalent of data_vintage; its value is inserted after cdvt= in the file name.

The contents of cond_cdid=<xx>_cdvt=<yymmdd>.csv should have columns for each raw data series that is then used to construct a given conditional observable. The first column should be date for the quarters of the conditional horizon, and the following columns should be for the raw data series. For example, to obtain real GDP growth, we need to have a population forecast file with both CNP16OV and CE16OV, the forecasted value of nominal GDP (under mnemonic GDP), and the forecasted value of the GDP deflator (under mnemonic GDPDEF), since these series are all required to compute obs_gdp, which is per-capita real GDP growth. For core inflation, we just need the index level for core PCE (under mnemonic PCEPILFE).

Note that the CSV should contain only conditional horizon data. If it contains data for any historical quarters, then the DataFrame with both historical and conditional data will not be created correctly in the REPL. For example, if you are forecasting 2019:Q4 with a conditional forecast of 2019:Q4 values, then the conditional data CSV should have values only for 2019:Q4 (and onward). No values for 2019:Q3 or before should be in the conditional data CSV.
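
The naming rule and the "conditional horizon only" rule can be sketched as follows. Both helpers are hypothetical and not part of DSGE.jl; zero-padding the ID to two digits is an assumption based on the <xx> placeholder, and the GDP values are illustrative, not data.

```julia
using Dates

# Hypothetical helper. Assumption: the ID is zero-padded to two digits,
# matching the <xx> placeholder in the file name.
function cond_filename(cond_id::Integer, cond_vintage::AbstractString)
    0 <= cond_id < 100 || throw(ArgumentError("cond_id must be less than 100"))
    return string("cond_cdid=", lpad(cond_id, 2, '0'), "_cdvt=", cond_vintage, ".csv")
end

# Keep only rows dated on or after the first conditional quarter.
conditional_rows(rows, first_cond_date::Date) =
    filter(r -> r.date >= first_cond_date, rows)

rows = [(date = Date(2019, 9, 30),  GDP = 21_500.0),   # historical: must be dropped
        (date = Date(2019, 12, 31), GDP = 21_700.0)]   # conditional horizon
kept = conditional_rows(rows, Date(2019, 12, 31))
name = cond_filename(2, "191220")
```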

Finally, to specify which variables should have conditional observations, make sure to set

  • cond_full_names::Vector{Symbol}: variables when running a "full" conditional forecast. For Model 1002, this means averages of the current quarter's daily financial data as well as nowcasts of real GDP growth and core PCE inflation.

  • cond_semi_names::Vector{Symbol}: variables when running a "semi" conditional forecast. For Model 1002, this means averages of the current quarter's daily financial data.

See the default settings for an example of how these cond_full_names and cond_semi_names are initialized.

Common pitfalls

Given the complexity of the data download, you may find that the dataset generated by load_data is not exactly as you expect. It is a good idea to compare the observables.jl file for your model with the one used by Model1002, which uses all the features provided by the package for handling data. Be certain that any significant differences are intentional. Here are also some common pitfalls to look out for:

  • Ensure that the data_vintage and cond_vintage model settings are as you expect. (Try checking data_vintage(m) and cond_vintage(m).)
  • Ensure that the data_id and cond_id model settings are correct for the given model.
  • Ensure that the date_forecast_start model setting is as you expect, and that it is not logically incompatible with data_vintage.
  • Ensure that the date_conditional_end model setting is as you expect, and that it is not logically incompatible with cond_vintage.
  • Double check the transformations specified in the data_transforms field of the model object.
  • Ensure that the keys of the observables and data_transforms fields of the model object match.
  • Check the input files for Non-FRED data sources. They should be in the directory indicated by inpath(m, "raw"), be named appropriately given the vintage of data expected, and be formatted appropriately. One may have to copy and rename files of non-FRED data sources to match the specified vintage, even if the contents of the files would be identical.
  • Look for any immediate issues in the final dataset saved (data_dsid=<xx>_vint=<yymmdd>.csv). If a data series in this file is all NaN values, then likely a non-FRED data source was not provided correctly.
  • Ensure that the column names of the data CSV match the keys of the observables field of the model object.
  • You may receive a warning that an input data file "does not contain the entire date range specified". This means that observations are not provided for some periods in which the model requires data. This is perfectly okay if your data series starts after date_presample_start.
  • If you successfully created a data set but it is missing observations that you want to add, you may need to recreate the data set. By default, load_data checks if a data set with the correct vintage already exists. If it does, then load_data loads the saved data rather than recreating the data set from scratch. However, if the saved data set is missing observations, then you should recreate it by calling load_data(m; try_disk = false).
  • If you have a column that is completely empty (all missing/NaN data), but you still want to load the data, then use the keyword check_empty_columns = false.
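
To spot a series that is entirely NaN in a saved dataset, a quick check along the following lines may help. This is a sketch with a hypothetical helper name; it assumes the first column is date and that no field contains a quoted comma. The sample text is illustrative, not real output.

```julia
# Return the names of columns (other than the first, `date`) whose entries
# are all NaN, missing, or empty.
function all_nan_columns(csv_text::AbstractString)
    lines  = filter(!isempty, split(strip(csv_text), '\n'))
    header = split(lines[1], ',')
    rows   = [split(l, ',') for l in lines[2:end]]
    isnanlike(s) = isempty(strip(s)) || lowercase(strip(s)) in ("nan", "missing")
    return [String(strip(header[j])) for j in 2:length(header)
            if all(isnanlike(r[j]) for r in rows)]
end

sample = "date,obs_gdp,obs_longcpi\n2015-06-30,0.5,NaN\n2015-09-30,0.4,NaN\n"
empty_cols = all_nan_columns(sample)
```

For a file on disk, the text can be obtained with read(path, String).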

If you experience any problems using FredData.jl, ensure your API key is provided correctly and that there are no issues with your firewall, etc. Any issues with FredData.jl proper should be reported on that project's page.

Update sample input data

A sample dataset is provided for the 2015 Nov 27 vintage. To update this dataset:

Step 1. See Setup to set up automatic data pulls using FredData.jl.

Step 2. Specify the exact data vintage desired:

julia> m <= Setting(:data_vintage, "yymmdd")

Step 3. Create data files for the non-FRED data sources. For model m990, the required data files include spf_<yymmdd>.csv (with column ASACX10), longrate_<yymmdd>.csv (with column FYCCZA), and fernald_<yymmdd>.csv (with columns TFPJQ and TFPKQ). To include data on expected interest rates, the file ois_<yymmdd>.csv is also required. To include data on population forecasts, the file population_forecast_<yymmdd>.csv is also required (see Incorporate population forecasts). See New York Fed Model Input Data for details on the series used and links to data sources.

Step 4. Run load_data(m); series from FRED will be downloaded and merged with the series from non-FRED data sources that you have already created. See Common pitfalls for some potential issues.

Data Transforms and Utilities

DSGE.df_to_matrix - Method
df_to_matrix(m, df; cond_type = :none, in_sample = true)

Return df, converted to a matrix of floats, with the date column discarded. Also ensure that rows are sorted by date and columns by m.observables, with the option to specify whether or not the out-of-sample rows are discarded. The output of this function is suitable for direct use in estimate, posterior, etc.

Keyword Arguments:

  • include_presample::Bool: indicates whether or not there are presample periods.
  • in_sample::Bool: indicates whether or not to discard rows that are out of sample. Set this flag to false in the case that you are calling filter_shocks! in the scenarios codebase.

DSGE.load_cond_data_levels - Method
load_cond_data_levels(m::AbstractDSGEModel; verbose::Symbol=:low)

Check on disk in inpath(m, "cond") for a conditional dataset (in levels) of the correct vintage and load it.

The following series are also loaded from inpath(m, "raw") and either appended or merged into the conditional data:

  • The last period of (unconditional) data in levels (data_levels_<yymmdd>.csv), used to calculate growth rates
  • The first period of forecasted population (population_forecast_<yymmdd>.csv), used for per-capita calculations

DSGE.load_data - Method
load_data(m::AbstractDSGEModel; try_disk::Bool = true, verbose::Symbol = :low,
          check_empty_columns::Bool = true, summary_statistics::Symbol = :low)

Create a DataFrame with all data series for this model, fully transformed.

First, check the disk to see if a valid dataset is already stored in inpath(m, "data"). A dataset is valid if every series in m.observable_mappings is present and the entire sample is contained (from date_presample_start to date_mainsample_end). If no valid dataset is already stored, the dataset will be recreated. This check can be skipped by passing try_disk=false.

If the dataset is to be recreated, in a preliminary stage, intermediate data series as specified in m.observable_mappings are loaded in levels using load_data_levels. See ?load_data_levels for more details.

Then, the series in levels are transformed as specified in m.observable_mappings. See ?transform_data for more details.

If m.testing is false, then the resulting DataFrame is saved to disk as data_<yymmdd>.csv. The data are then returned to the caller.

The keyword check_empty_columns throws an error whenever a column is completely empty in the loaded data set if it is set to true.

The keyword summary_statistics prints out a variety of summary statistics on the loaded data. When set to :low, we print only the number of missing/NaNs for each data series. When set to :high, we also print means, standard deviations,

DSGE.load_data_levels - Method
load_data_levels(m::AbstractDSGEModel; verbose::Symbol=:low)

Load data in levels by appealing to the data sources specified for the model. Data from FRED is loaded first, by default; then, merge other custom data sources.

Check on disk in inpath(m, "data") for datasets of the correct vintage corresponding to the ones required by the entries in m.observable_mappings. Load the appropriate data series (specified in m.observable_mappings[key].input_series) for each data source.

To accommodate growth rates and other similar transformations, more rows of data may be downloaded than otherwise specified by the date model settings. (By the end of the process, these rows will have been dropped.)

Data from FRED (i.e. the :fred data source) are treated separately. These are downloaded using load_fred_data. See ?load_fred_data for more details.

Data from non-FRED data sources are read from disk, verified, and merged.

DSGE.parse_data_series - Method
parse_data_series(m::AbstractDSGEModel)

Parse m.observable_mappings for the data sources and mnemonics to read in.

Returns a Dict{Symbol, Vector{Symbol}} mapping sources => mnemonics found in that data file.
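
The grouping can be pictured with the sketch below, which works on a plain vector of MNEMONIC__SOURCE symbols rather than on a model object. The helper name group_by_source is hypothetical, and lowercasing the source (so that :SPF maps to :spf) is an assumption of this sketch.

```julia
# Split each `MNEMONIC__SOURCE` symbol at the double underscore and group by source.
function group_by_source(input_series::Vector{Symbol})
    out = Dict{Symbol, Vector{Symbol}}()
    for s in input_series
        parts = split(string(s), "__")
        length(parts) == 2 || error("expected MNEMONIC__SOURCE, got $s")
        mnemonic, source = Symbol(parts[1]), Symbol(lowercase(parts[2]))
        push!(get!(out, source, Symbol[]), mnemonic)
    end
    return out
end

sources = group_by_source([:GDP__FRED, :ASACX10__SPF, :GDPDEF__FRED])
```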

DSGE.save_data - Method
save_data(m::AbstractDSGEModel, df::DataFrame; cond_type::Symbol = :none)

Save df to disk as CSV. File is located in inpath(m, "data").

DSGE.load_fred_data - Method
load_fred_data(m::AbstractDSGEModel; start_date="1959-03-31", end_date=prev_quarter())

Checks in inpath(m, "raw") for a FRED dataset corresponding to data_vintage(m). If a FRED vintage exists on disk, any required FRED series that is contained therein will be imported. All missing series will be downloaded directly from FRED using the FredData package. The full dataset is written to the appropriate data vintage file and returned.

Arguments

  • m::AbstractDSGEModel: the model object
  • start_date: starting date.
  • end_date: ending date.

Notes

The FRED API reports observations according to the quarter-start date. load_fred_data returns data indexed by quarter-end date for compatibility with other datasets.
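
The date relabeling amounts to the following, shown here with the standard library only (the two dates are arbitrary examples):

```julia
using Dates

# FRED-style quarter-start dates, relabeled by the last day of the same quarter.
quarter_start = [Date(2015, 7, 1), Date(2015, 10, 1)]
quarter_end   = lastdayofquarter.(quarter_start)
```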

DSGE.transform_data - Method
transform_data(m::AbstractDSGEModel, levels::DataFrame; cond_type::Symbol = :none,
    verbose::Symbol = :low)

Transform data loaded in levels and order columns appropriately for the DSGE model. Returns DataFrame of transformed data.

The DataFrame levels is output from load_data_levels. The series in levels are transformed as specified in m.observable_mappings.

  • To prepare for per-capita transformations, population data are filtered using hpfilter. The series in levels to use as the population series is given by the population_mnemonic setting. If use_population_forecast(m), a population forecast is appended to the recorded population levels before the filtering. Both filtered and unfiltered population levels and growth rates are added to the levels data frame.
  • The transformations are applied for each series using the levels DataFrame as input.

Conditional data (identified by cond_type in [:semi, :full]) are handled slightly differently: If use_population_forecast(m), we drop the first period of the population forecast because we treat the first forecast period date_forecast_start(m) as if it were data. We also only apply transformations for the observables given in cond_full_names(m) or cond_semi_names(m).

DSGE.hpfilter - Method
yt, yf = hpfilter(y, λ)

Applies the Hodrick-Prescott filter ("H-P filter"). The smoothing parameter λ is applied to the columns of y, returning the trend component yt and the cyclical component yf. For quarterly data, one can use λ=1600.

Consecutive missing values at the beginning or end of the time series are excluded from the filtering. If there are missing values within the series, the filtered values are all missing.

See also:

Hodrick, Robert; Prescott, Edward C. (1997). "Postwar U.S. Business Cycles: An Empirical
Investigation". Journal of Money, Credit, and Banking 29 (1): 1–16.
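
For intuition, here is a textbook version of the filter for a single series with no missing values. It is a sketch under those assumptions, not the package's implementation: the name hp_sketch is hypothetical, and the handling of missing values described above is not reproduced.

```julia
using LinearAlgebra

# Textbook H-P filter for one series without missing values:
# the trend solves (I + λ D'D) * trend = y, where D takes second differences.
function hp_sketch(y::AbstractVector{<:Real}, λ::Real)
    n = length(y)
    n >= 3 || throw(ArgumentError("need at least three observations"))
    D = zeros(n - 2, n)
    for i in 1:n-2
        D[i, i], D[i, i+1], D[i, i+2] = 1.0, -2.0, 1.0
    end
    trend = (I + λ * (D' * D)) \ float.(y)
    return trend, y .- trend
end

# Illustrative input: the log of a smoothly growing series, quarterly λ.
trend, cycle = hp_sketch(log.(collect(100.0:1.0:139.0)), 1600)
```

A dense matrix keeps the sketch short; for long series a sparse or banded solve would be the more natural choice.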

DSGE.loggrowthtopct - Method
loggrowthtopct(y)

Transform from annualized quarter-over-quarter log growth rates to annualized quarter-over-quarter percent change.

Note

This should only be used in Model 510, which has the core PCE inflation observable in annualized log growth rates.

DSGE.loggrowthtopct_4q_approx - Function
loggrowthtopct_4q_approx(y, data = fill(NaN, 3))

Transform from log growth rates to approximate 4-quarter percent change.

This method should only be used to transform scenarios forecasts, which are in deviations from baseline.

Inputs

  • y: the data we wish to transform to aggregate 4-quarter percent change from log per-capita growth rates. y is either a vector of length nperiods or an ndraws x nperiods matrix.

  • data: if y = [y_t, y_{t+1}, ..., y_{t+nperiods-1}], then data = [y_{t-3}, y_{t-2}, y_{t-1}]. This is necessary to compute 4-quarter percent changes for the first three periods.
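
The role of data can be illustrated with a rolling four-quarter sum. This is one plausible reading of the approximation, offered as an assumption rather than the package's actual code; the name fourq_sum is hypothetical and only the vector case is covered.

```julia
# Sum each quarter's log growth rate with the three before it.
# `data` supplies the three quarters preceding `y`; NaN history yields NaN output.
function fourq_sum(y::AbstractVector{<:Real}, data::AbstractVector{<:Real} = fill(NaN, 3))
    length(data) == 3 || throw(ArgumentError("data must hold exactly three periods"))
    full = vcat(data, y)
    return [sum(full[t:t+3]) for t in 1:length(y)]
end

with_history    = fourq_sum([0.5, 0.5, 0.5, 0.5], [0.4, 0.4, 0.4])
without_history = fourq_sum([1.0, 1.0, 1.0, 1.0])
```

Without history, the first three outputs are NaN, which is why data is needed for those periods.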

DSGE.loggrowthtopct_annualized_percapita - Method
loggrowthtopct_annualized_percapita(y, pop_growth)

Transform from log per-capita growth rates to annualized aggregate (not per-capita) quarter-over-quarter percent change.

Note

This should only be used for output, consumption, investment and GDP deflator (inflation).

Inputs

  • y: the data we wish to transform to annualized percent change from quarter-over-quarter log growth rates. y is either a vector of length nperiods or an ndraws x nperiods matrix.

  • pop_growth::Vector: the length nperiods vector of log population growth rates.

DSGE.loggrowthtopct_percapita - Method
loggrowthtopct_percapita(y, pop_growth)

Transform from annualized quarter-over-quarter log per-capita growth rates to annualized quarter-over-quarter aggregate percent change.

Note

This should only be used in Model 510, which has the output growth observable in annualized log per-capita growth rates.

Inputs

  • y: the data we wish to transform to annualized percent change from annualized log growth rates. y is either a vector of length nperiods or an ndraws x nperiods matrix.

  • pop_growth::Vector: the length nperiods vector of log population growth rates.

DSGE.logleveltopct_4q_approx - Function
logleveltopct_4q_approx(y, data = fill(NaN, 4))

Transform from log levels to approximate 4-quarter percent change.

This method should only be used to transform scenarios forecasts, which are in deviations from baseline.

Inputs

  • y: the data we wish to transform to 4-quarter percent change from log levels. y is either a vector of length nperiods or an ndraws x nperiods matrix.

  • data: if y = [y_t, y_{t+1}, ..., y_{t+nperiods-1}], then data = [y_{t-4}, y_{t-3}, y_{t-2}, y_{t-1}]. This is necessary to compute 4-quarter percent changes for the first four periods.

DSGE.logleveltopct_annualized - Function
logleveltopct_annualized(y, y0 = NaN)

Transform from log levels to annualized quarter-over-quarter percent change.

Inputs

  • y: the data we wish to transform to annualized quarter-over-quarter percent change from log levels. y is either a vector of length nperiods or an ndraws x nperiods matrix.

  • y0: the last data point in the history (of state or observable) corresponding to the y variable. This is required to compute a percent change for the first period.

DSGE.logleveltopct_annualized_approx - Function
logleveltopct_annualized_approx(y, y0 = NaN)

Transform from log levels to approximate annualized quarter-over-quarter percent change.

This method should only be used to transform scenarios forecasts, which are in deviations from baseline.

Inputs

  • y: the data we wish to transform to annualized quarter-over-quarter percent change from log levels. y is either a vector of length nperiods or an ndraws x nperiods matrix.

  • y0: the last data point in the history (of state or observable) corresponding to the y variable. This is required to compute a percent change for the first period.

DSGE.logleveltopct_annualized_percapita - Function
logleveltopct_annualized_percapita(y, pop_growth, y0 = NaN)

Transform from per-capita log levels to annualized aggregate (not per-capita) quarter-over-quarter percent change.

Note

This is usually applied to labor supply (hours worked per capita), and probably shouldn't be used for any other observables.

Inputs

  • y: the data we wish to transform to annualized aggregate quarter-over-quarter percent change from per-capita log levels. y is either a vector of length nperiods or an ndraws x nperiods matrix.

  • pop_growth::Vector: the length nperiods vector of log population growth rates.

  • y0: The last data point in the history (of state or observable) corresponding to the y variable. This is required to compute a percent change for the first period.

DSGE.nominal_to_real - Method

nominal_to_real(col, df; deflator_mnemonic = :GDPDEF)

Converts nominal to real values using the specified deflator.

Arguments

  • col: Symbol indicating which column of df to transform
  • df: DataFrame containing series for proper population measure and col

Keyword arguments

  • deflator_mnemonic: indicates which deflator to use to calculate real values. Default value is the FRED GDP Deflator mnemonic.

DSGE.percapita - Method
percapita(m, col, df)
percapita(col, df, population_mnemonic)

Converts data column col of DataFrame df to a per-capita value.

The first method checks hpfilter_population(m). If true, then it divides by the filtered population series. Otherwise it divides by the result of parse_population_mnemonic(m)[1].

Arguments

  • col: Symbol indicating which column of data to transform
  • df: DataFrame containing series for proper population measure and col
  • population_mnemonic: a mnemonic found in df for some population measure

DSGE.get_data_filename - Method
get_data_filename(m, cond_type)

Returns the data file for m, which depends on data_vintage(m), and if cond_type in [:semi, :full], also on cond_vintage(m) and cond_id(m).

DSGE.iterate_quarters - Method
iterate_quarters(start::Date, quarters::Int)

Returns the date corresponding to start + quarters quarters.

Inputs

  • start: starting date
  • quarters: number of quarters to iterate forward or backward

DSGE.quartertodate - Method

quartertodate(string::String)

Convert string in the form "YYqX", "YYYYqX", or "YYYY-qX" to a Date of the end of the indicated quarter. "X" is in {1,2,3,4} and the case of "q" is ignored.
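
A sketch of this kind of parsing for the four-digit-year forms is shown below. The function name quarter_end_date is hypothetical; two-digit years are deliberately left out because the century rule is not stated here.

```julia
using Dates

# Parse "YYYYqX" or "YYYY-qX" (case-insensitive) into the quarter-end date.
function quarter_end_date(s::AbstractString)
    m = match(r"^(\d{4})-?q([1-4])$"i, s)
    m === nothing && throw(ArgumentError("cannot parse quarter string: $s"))
    yr, q = parse(Int, m[1]), parse(Int, m[2])
    # The last month of quarter q is month 3q; take its last day.
    return lastdayofmonth(Date(yr, 3 * q, 1))
end

d1 = quarter_end_date("2015q4")
d2 = quarter_end_date("2016-Q1")
```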

DSGE.subtract_quarters - Method

subtract_quarters(t1::Date, t0::Date)

Compute the number of quarters between t1 and t0, including t0 and excluding t1.
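
One way to picture this count, using calendar quarters from the standard library (a sketch with a hypothetical name, not the package's code):

```julia
using Dates

# Count quarters from t0 up to (but not including) t1 using calendar quarters.
quarters_between(t1::Date, t0::Date) =
    4 * (year(t1) - year(t0)) + (quarterofyear(t1) - quarterofyear(t0))

n_forward  = quarters_between(Date(2016, 3, 31), Date(2015, 12, 31))
n_same     = quarters_between(Date(2015, 12, 31), Date(2015, 12, 31))
n_backward = quarters_between(Date(2015, 3, 31), Date(2015, 12, 31))
```

Under this convention, two dates in the same quarter are zero quarters apart, and the result is negative when t1 precedes t0.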

DSGE.data_to_df - Method
data_to_df(m, data, start_date)

Create a DataFrame out of the matrix data, including a :date column beginning in start_date. Variable names and indices are obtained from m.observables.

DSGE.has_saved_data - Method
has_saved_data(m::AbstractDSGEModel; cond_type::Symbol = :none)

Determine if there is a saved dataset on disk for the required vintage and conditional type.

DSGE.isvalid_data - Method
isvalid_data(m::AbstractDSGEModel, df::DataFrame; cond_type::Symbol = :none,
    check_empty_columns::Bool = true)

Return if dataset is valid for this model, ensuring that all observables are contained and that all quarters between the beginning of the presample and the end of the mainsample are contained. Also checks to make sure that expected interest rate data is available if n_mon_anticipated_shocks(m) > 0.

DSGE.read_data - Method
read_data(m::AbstractDSGEModel; cond_type::Symbol = :none)

Read CSV from disk as DataFrame. File is located in inpath(m, "data").

DSGE.read_population_data - Method
read_population_data(m; verbose = :low)

read_population_data(filename; verbose = :low)

Read in population data stored in levels, either from inpath(m, "raw", "population_data_levels_[vint].csv") or filename.

DSGE.read_population_forecast - Method
read_population_forecast(m; verbose = :low)

read_population_forecast(filename, population_mnemonic, last_recorded_date; verbose = :low)

Read in population forecast in levels, either from inpath(m, "raw", "population_forecast_[vint].csv") or filename. If that file does not exist, return an empty DataFrame.

DSGE.transform_population_data - Method
transform_population_data(population_data, population_forecast,
    population_mnemonic; verbose = :low)

Load, HP-filter, and compute growth rates from population data in levels. Optionally do the same for forecasts.

Inputs

  • population_data: pre-loaded DataFrame of historical population data containing the columns :date and population_mnemonic. Assumes this is sorted by date.
  • population_forecast: pre-loaded DataFrame of population forecast containing the columns :date and population_mnemonic
  • population_mnemonic: column name for population series in population_data and population_forecast

Keyword Arguments

  • verbose: one of :none, :low, or :high
  • use_hpfilter: whether to HP filter population data and forecast. See Output below.
  • pad_forecast_start::Bool: whether to re-size population_forecast such that its first index is one quarter ahead of the last index of population_data. Only set to false if you have manually constructed population_forecast to artificially start a quarter earlier, so as to avoid having an unnecessary missing first entry.

Output

Two dictionaries containing the following keys:

  • population_data_out:

    • :filtered_population_recorded: HP-filtered historical population series (levels)
    • :dlfiltered_population_recorded: HP-filtered historical population series (growth rates)
    • :dlpopulation_recorded: Non-filtered historical population series (growth rates)
  • population_forecast_out:

    • :filtered_population_forecast: HP-filtered population forecast series (levels)
    • :dlfiltered_population_forecast: HP-filtered population forecast series (growth rates)
    • :dlpopulation_forecast: Non-filtered population forecast series (growth rates)

If population_forecast_file is not provided, the r"forecast" fields will be empty. If use_hpfilter = false, then the r"filtered*" fields will be empty.

DSGE.get_irf_transform - Method
get_irf_transform(transform::Function)

Returns the IRF-specific transformation, which doesn't add back population growth (since IRFs are given in deviations).

DSGE.get_nopop_transform - Method
get_nopop_transform(transform::Function)

Returns the corresponding transformation which doesn't add back population growth. Used for shock decompositions, deterministic trends, and IRFs, which are given in deviations.

DSGE.get_scenario_transform - Method
get_scenario_transform(transform::Function)

Given a transformation used for usual forecasting, return the transformation used for scenarios, which are forecasted in deviations from baseline.

The 1Q deviation from baseline should really be calculated by 1Q transforming the forecasts (in levels) under the baseline (call this y_b) and alternative scenario (y_s), then subtracting baseline from alternative scenario (since most of our 1Q transformations are nonlinear). Let y_d = y_s - y_b. Then, for example, the most correct loggrowthtopct_annualized transformation is:

y_b_1q = 100*(exp(y_b/100)^4 - 1)
y_s_1q = 100*(exp(y_s/100)^4 - 1)
y_d_1q = y_s_1q - y_b_1q

Instead, we approximate this by transforming the deviation directly:

y_d_1q ≈ 4*(y_s - y_b) = 4*y_d
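
To get a feel for the size of the approximation error, the following minimal sketch compares the exact and approximate annualized deviations, using the sign convention y_d = y_s - y_b stated above. The helper name exact_annualized and the numeric values are illustrative assumptions, not part of DSGE.jl.

```julia
# Hypothetical helper (not a DSGE.jl function): exact annualized percent
# transformation of a quarterly log growth rate expressed in percent.
exact_annualized(y) = 100 * (exp(y / 100)^4 - 1)

y_b = 0.50   # baseline quarterly log growth, percent (illustrative value)
y_s = 0.75   # alternative scenario, percent (illustrative value)

# Exact deviation: transform each path, then subtract baseline from scenario
exact_dev = exact_annualized(y_s) - exact_annualized(y_b)

# Approximate deviation: transform the deviation directly
approx_dev = 4 * (y_s - y_b)

println("exact deviation:  ", exact_dev)
println("approx deviation: ", approx_dev)
```

For growth rates of this magnitude the two numbers are close, which is the motivation for transforming the deviation directly.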
source
DSGE.get_transform4qMethod
get_transform4q(transform::Function)

Returns the 4-quarter transformation associated with the annualizing transformation.

source
DSGE.lagMethod
series_lag_n = lag(series, n)

Returns a particular data series lagged by n periods

source
DSGE.loggrowthtopct_4qFunction
loggrowthtopct_4q(y, data = fill(NaN, 3))

Transform from log growth rates to 4-quarter percent change.

Inputs

  • y: the data we wish to transform to 4-quarter percent change from log growth rates. y is either a vector of length nperiods or an ndraws x nperiods matrix.

  • data: if y = [y_t, y_{t+1}, ..., y_{t+nperiods-1}], then data = [y_{t-3}, y_{t-2}, y_{t-1}]. This is necessary to compute 4-quarter percent changes for the first three periods.
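
The role of data can be illustrated with a stand-alone sketch for the vector case. This is not the DSGE.jl implementation: the function name sketch_loggrowth_4q is hypothetical, and it assumes y holds quarterly log growth rates in percent (that is, 100 times the log difference).

```julia
# Hypothetical sketch, not the DSGE.jl implementation.
# Assumes y and data are quarterly log growth rates in percent.
function sketch_loggrowth_4q(y::Vector{Float64}, data::Vector{Float64} = fill(NaN, 3))
    full = vcat(data, y)   # prepend the three quarters preceding y
    n = length(y)
    # Sum the current and the three previous quarterly log growth rates
    y_4q = [sum(full[i:i+3]) for i in 1:n]
    # Convert the 4-quarter log growth rate to a percent change
    return 100 .* (exp.(y_4q ./ 100) .- 1)
end

y = [0.5, 0.5, 0.5, 0.5]

pct_with_history = sketch_loggrowth_4q(y, [0.5, 0.5, 0.5])
pct_default      = sketch_loggrowth_4q(y)   # history filled with NaN
```

With the default data, the first three entries of the result are NaN because their 4-quarter windows reach into the missing history.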

source
DSGE.loggrowthtopct_4q_percapitaFunction
loggrowthtopct_4q_percapita(y, pop_growth, data = fill(NaN, 3))

Transform from log per-capita growth rates to aggregate 4-quarter percent change.

Note

This should only be used for output, consumption, investment, and GDP deflator (inflation).

Inputs

  • y: the data we wish to transform to aggregate 4-quarter percent change from log per-capita growth rates. y is either a vector of length nperiods or an ndraws x nperiods matrix.

  • pop_growth::Vector: the length nperiods vector of log population growth rates.

  • data: if y = [y_t, y_{t+1}, ..., y_{t+nperiods-1}], then data = [y_{t-3}, y_{t-2}, y_{t-1}]. This is necessary to compute 4-quarter percent changes for the first three periods.

source
DSGE.logleveltopct_4qFunction
logleveltopct_4q(y, data = fill(NaN, 4))

Transform from log levels to 4-quarter percent change.

Inputs

  • y: the data we wish to transform to 4-quarter percent change from log levels. y is either a vector of length nperiods or an ndraws x nperiods matrix.

  • data: if y = [y_t, y_{t+1}, ..., y_{t+nperiods-1}], then data = [y_{t-4}, y_{t-3}, y_{t-2}, y_{t-1}]. This is necessary to compute 4-quarter percent changes for the first four periods.
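
A stand-alone sketch for the vector case shows how the four prior observations enter the calculation. The function name sketch_loglevel_4q is hypothetical and this is not the DSGE.jl implementation; it assumes y holds log levels scaled by 100.

```julia
# Hypothetical sketch, not the DSGE.jl implementation.
# Assumes y and data are log levels multiplied by 100.
function sketch_loglevel_4q(y::Vector{Float64}, data::Vector{Float64} = fill(NaN, 4))
    full = vcat(data, y)   # prepend the four quarters preceding y
    n = length(y)
    # Difference each observation against the one four quarters earlier
    diff4 = [full[i+4] - full[i] for i in 1:n]
    # Convert the 4-quarter log difference to a percent change
    return 100 .* (exp.(diff4 ./ 100) .- 1)
end

y = [101.0, 102.0, 103.0, 104.0, 105.0]

pct_with_history = sketch_loglevel_4q(y, [97.0, 98.0, 99.0, 100.0])
pct_default      = sketch_loglevel_4q(y)   # history filled with NaN
```

Entries whose comparison point four quarters back is missing come out as NaN; only the fifth entry of pct_default is finite here.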

source
DSGE.logleveltopct_4q_percapitaFunction
logleveltopct_4q_percapita(y, pop_growth, data = fill(NaN, 4))

Transform from per-capita log levels to 4-quarter aggregate percent change.

Note

This is usually applied to labor supply (hours worked), and probably shouldn't be used for any other observables.

Inputs

  • y: the data we wish to transform to 4-quarter aggregate percent change from per-capita log levels. y is either a vector of length nperiods or an ndraws x nperiods matrix.

  • pop_growth::Vector: the length nperiods vector of log population growth rates.

  • data: if y = [y_t, y_{t+1}, ..., y_{t+nperiods-1}], then data = [y_{t-4}, y_{t-3}, y_{t-2}, y_{t-1}]. This is necessary to compute 4-quarter percent changes for the first four periods.

source
DSGE.prepend_dataMethod
prepend_data(y, data)

Prepends data necessary for running 4q transformations.

Inputs:

  • y: ndraws x t array representing a timeseries for variable y
  • data: vector representing a timeseries to prepend to y
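
As an illustration of the shapes involved, here is a minimal sketch of prepending a history vector to every draw. The function name sketch_prepend is hypothetical, and the assumption that the same history is attached to each row is ours, not confirmed by the docstring.

```julia
# Hypothetical sketch, not the DSGE.jl implementation.
# Assumes the same history vector is prepended to every draw (row) of y.
function sketch_prepend(y::Matrix{Float64}, data::Vector{Float64})
    ndraws = size(y, 1)
    # Turn the history into a 1 x k row, repeat it once per draw,
    # then attach it to the left of y
    history = repeat(permutedims(data), ndraws, 1)
    return hcat(history, y)
end

y    = [1.0 2.0; 3.0 4.0]   # 2 draws x 2 periods
data = [-1.0, 0.0]          # 2 periods of history

out = sketch_prepend(y, data)   # 2 draws x 4 periods
```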
source
DSGE.datetoquarterMethod

datetoquarter(date::Date)

Given a Date object, return an integer from the set {1,2,3,4} identifying the quarter of the year in which the date falls. Note that the argument is a Date, not a string of the form "YYYYqX".
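
For comparison, the Dates standard library offers quarterofyear, which performs the same kind of mapping from a Date to a quarter number. Whether datetoquarter is implemented in terms of it is not stated here, so treat this only as an illustration of the expected result.

```julia
using Dates

# Dates.quarterofyear maps a Date to its quarter, an integer in 1:4
q = quarterofyear(Date(2019, 8, 15))   # August falls in the third quarter
println(q)
```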

source
DSGE.datetoymdvecMethod
datetoymdvec(dt)

Converts a Date to a vector/matrix holding the year, month, and day.

source
DSGE.format_dates!Method

format_dates!(col, df)

Change column col of dates in df from String to Date, and map any dates given in the interior of a quarter to the last day of the quarter.

source
DSGE.get_quarter_endsMethod

get_quarter_ends(start_date::Date, end_date::Date)

Returns an Array of quarter end dates between start_date and end_date.
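
One way to build such a list with the Dates standard library is sketched below. The function name sketch_quarter_ends is hypothetical, and the treatment of the endpoints (a quarter end is included if it falls on or before end_date) is an assumption rather than documented behavior.

```julia
using Dates

# Hypothetical sketch, not the DSGE.jl implementation.
# Assumes quarter ends on or before end_date are included.
function sketch_quarter_ends(start_date::Date, end_date::Date)
    ends = Date[]
    d = lastdayofquarter(start_date)   # first quarter end on or after start_date
    while d <= end_date
        push!(ends, d)
        # Step one day into the next quarter, then jump to its last day
        d = lastdayofquarter(d + Day(1))
    end
    return ends
end

qends = sketch_quarter_ends(Date(2020, 1, 15), Date(2020, 12, 31))
```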

source
DSGE.missing2nanMethod
missing2nan(a::Array)

Convert an array whose elements are of type Union{X, Missing.Missing} or Missing.Missing to an array of type Float64, with missing entries represented as NaN.
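
The conversion can be sketched in plain Julia as follows. The name sketch_missing2nan is hypothetical, and the sketch assumes, as the function name suggests, that each missing entry becomes NaN.

```julia
# Hypothetical sketch, not the DSGE.jl implementation.
# Assumes each missing entry is replaced with NaN.
sketch_missing2nan(a::Array) = Float64[ismissing(x) ? NaN : x for x in a]

a = [1.0, missing, 3.0]    # Vector{Union{Missing, Float64}}
b = sketch_missing2nan(a)  # Vector{Float64}
```

The result has a concrete Float64 element type, which is what numerical routines operating on matrices generally expect.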

source
DSGE.missing_cond_vars!Method
missing_cond_vars!(m, df; cond_type = :none, check_empty_columns = true)

Make conditional period variables not in cond_semi_names(m) or cond_full_names(m) missing if necessary.

source
DSGE.na2nan!Method
na2nan!(df::DataFrame)

Convert all NAs in a DataFrame to NaNs.

source
DSGE.next_quarterFunction

next_quarter(q::TimeType = now())

Returns Date identifying last day of the next quarter

source
DSGE.prev_quarterFunction

prev_quarter(q::TimeType = now())

Returns Date identifying last day of the previous quarter
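
The behavior of next_quarter and prev_quarter can be approximated with the Dates standard library. The function names below are hypothetical and the sketch is restricted to Date and DateTime inputs; it is not the DSGE.jl implementation.

```julia
using Dates

# Hypothetical sketches, not the DSGE.jl implementations.
# Move three months ahead, then snap to the last day of that quarter
sketch_next_quarter(d::Union{Date, DateTime}) = lastdayofquarter(Date(d) + Month(3))

# The day before the current quarter starts is the previous quarter's end
sketch_prev_quarter(d::Union{Date, DateTime}) = firstdayofquarter(Date(d)) - Day(1)

d = Date(2020, 5, 15)
nq = sketch_next_quarter(d)
pq = sketch_prev_quarter(d)
```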

source
DSGE.reconcile_column_namesMethod
reconcile_column_names(a::DataFrame, b::DataFrame)

Adds columns of missings to a and b so that both have the same set of column names.

source
DSGE.vinttodateMethod
vinttodate(vint)

Convert the string given by data_vintage(m), which is in the format YYYYMMDD, to a Date object.

source