RDP 2017-06: Uncertainty and Monetary Policy in Good and Bad Times Appendix A: Technical Details
October 2017
This appendix documents statistical evidence in favour of a nonlinear relationship between the endogenous variables included in our STVAR. It also provides details on the estimation procedure of our nonlinear VARs, and on the computation of the GIRFs.
A.1 Statistical Evidence in Favour of Nonlinearities
To detect nonlinear dynamics at a multivariate level, we apply the test proposed by Teräsvirta and Yang (2014). Their framework is particularly well suited for our analysis since it proposes testing the null hypothesis of linearity versus a specified nonlinear alternative, that of a STVAR with a single transition variable.
Consider the following p-dimensional 2-regime approximate logistic STVAR model:

Xt = Θ0Yt + Θ1Ytzt + Θ2Ytzt² + … + ΘnYtztⁿ + εt

where Xt is the (p × 1) vector of endogenous variables, Yt = [Xt − 1|…|Xt − k|c] is the ((kp + q) × 1) vector of regressors (including the endogenous variables lagged up to k times and a column vector of constants c), zt is the transition variable, and Θ0 and Θi are matrices of parameters. In our case, the number of endogenous variables is p = 8, the number of exogenous variables is q = 1, and the number of lags is k = 6. Under the null hypothesis of linearity, Θi = 0 ∀i.
The Teräsvirta-Yang test for linearity versus the STVAR model can be performed as follows:
- Estimate the restricted model (Θi = 0 ∀i) by regressing Xt on Yt. Collect the residuals E and compute the matrix of residual sums of squares RSS0 = E′E.
- Run an auxiliary regression of E on (Yt, Zn), where Zn = [Ytzt|Ytzt²|…|Ytztⁿ]. Collect the residuals Ξ and compute the matrix of residual sums of squares RSS1 = Ξ′Ξ.
- Compute the test statistic

LM = T·tr[RSS0⁻¹(RSS0 − RSS1)]

which, under the null hypothesis of linearity, is asymptotically χ²-distributed with degrees of freedom equal to the number of zero restrictions imposed.
As pointed out by Teräsvirta and Yang (2014), however, in small samples the LM-type test might suffer from positive size distortion, that is, the empirical size of the test exceeds its nominal asymptotic size. We therefore also employ the rescaled LM test statistic they propose.
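The test steps above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, its defaults and the simulated inputs in the usage below are ours.

```python
import numpy as np

def ty_linearity_lm(X, z, k=1, n=3):
    """LM-type linearity test in the spirit of Terasvirta and Yang (2014).

    X : (T, p) array of endogenous variables
    z : (T,) transition variable
    k : number of lags in the VAR
    n : order of the Taylor expansion in z

    Returns LM = T * tr(RSS0^{-1}(RSS0 - RSS1)), asymptotically
    chi-squared distributed under the null of linearity.
    """
    T, p = X.shape
    # Y_t = [X_{t-1} | ... | X_{t-k} | 1]
    Y = np.hstack([X[k - j:T - j] for j in range(1, k + 1)]
                  + [np.ones((T - k, 1))])
    Xt, zt = X[k:], z[k:]
    # Step 1: restricted (linear) model; residuals E and RSS0 = E'E
    B0, *_ = np.linalg.lstsq(Y, Xt, rcond=None)
    E = Xt - Y @ B0
    RSS0 = E.T @ E
    # Step 2: auxiliary regression of E on (Y, Z_n),
    # with Z_n = [Y*z | Y*z^2 | ... | Y*z^n]
    Zn = np.hstack([Y * zt[:, None] ** i for i in range(1, n + 1)])
    W = np.hstack([Y, Zn])
    B1, *_ = np.linalg.lstsq(W, E, rcond=None)
    Xi = E - W @ B1
    RSS1 = Xi.T @ Xi
    # Step 3: LM statistic
    return (T - k) * np.trace(np.linalg.solve(RSS0, RSS0 - RSS1))
```

Because the auxiliary regression nests the regressors of the restricted model, RSS0 − RSS1 is positive semidefinite and the statistic is non-negative by construction.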
A.2 Estimation of the Nonlinear VARs
Our model (1)–(4) is estimated via maximum likelihood.[23] Its log-likelihood reads as follows:

log L = const − ½ Σt log|Ωt| − ½ Σt ut′Ωt⁻¹ut

where ut = Xt − F(zt − 1)∏R(L)Xt − 1 − (1 − F(zt − 1))∏E(L)Xt − 1 is the vector of residuals. Our goal is to estimate the parameters {∏R(L), ∏E(L), ΩR, ΩE}. We do so by conditioning on a given value for the smoothness parameter γ, which is calibrated as described in the text. The high nonlinearity of the model and its many parameters make its estimation with standard optimisation routines problematic. Following Auerbach and Gorodnichenko (2012), we employ the procedure described below.
Conditional on {γ, ΩR, ΩE}, the model is linear in {∏R(L), ∏E(L)}. Then, for a given guess of {γ, ΩR, ΩE}, the coefficients {∏R(L), ∏E(L)} can be estimated by minimising Σt ut′Ωt⁻¹ut. This can be seen by re-writing the regressors as follows. Let:

Wt = [F(zt − 1)Xt − 1′, …, F(zt − 1)Xt − k′, (1 − F(zt − 1))Xt − 1′, …, (1 − F(zt − 1))Xt − k′]′

be the extended vector of regressors, and ∏ = [∏R(L), ∏E(L)]. Then, we can write ut = Xt − ∏Wt. Consequently, the objective function becomes:

Σt (Xt − ∏Wt)′Ωt⁻¹(Xt − ∏Wt)

It can be shown that the first-order condition with respect to ∏ is:

Σt Ωt⁻¹(Xt − ∏Wt)Wt′ = 0
This procedure iterates over different sets of values for {ΩR, ΩE}, conditional on a given value for γ. For each set of values, ∏ is obtained and log L is computed.
Given that the model is highly nonlinear in its parameters, several local optima might be present. Hence, it is recommended to try different starting values for {ΩR, ΩE} and then explore the robustness of the estimates to different values of γ. To ensure positive definiteness of the variance-covariance matrices, we focus on an alternative vector of parameters Ψ in which ΩR and ΩE are replaced by their Cholesky factors chol(ΩR) and chol(ΩE), where chol implements a Cholesky decomposition.
The construction of confidence intervals for the parameter estimates is complicated by the nonlinear structure of the problem. We compute them by appealing to a Markov Chain Monte Carlo (MCMC) algorithm developed by Chernozhukov and Hong (2003) (CH hereafter). This method delivers both a global optimum and densities for the parameter estimates.
CH estimation is implemented via a Metropolis-Hastings algorithm. Given a starting value Ψ(0), the procedure constructs chains of length N of the parameters of our model following these steps:
Step 1. Draw a candidate vector of parameter values Θ(n) = Ψ(n) + ψ(n) for the chain's (n + 1)th state, where Ψ(n) is the current state and ψ(n) is a vector of iid shocks drawn from N(0,Ωψ), with Ωψ a diagonal matrix.
Step 2. Set the (n + 1)th state of the chain Ψ(n + 1) = Θ(n) with probability min{1, L(Θ(n))/L(Ψ(n))}, where L(Θ(n)) is the value of the likelihood function conditional on the candidate vector of parameter values, and L(Ψ(n)) the value of the likelihood function conditional on the current state of the chain. Otherwise, set Ψ(n + 1) = Ψ(n).
The starting value Θ(0) is computed by working with a second-order Taylor approximation of the model (1)–(4) (see the main text), so that the model can be written as a regression of Xt on lags of Xt, Xtzt and Xtzt². The residuals from this regression are employed to fit the expression for the reduced-form time-varying variance-covariance matrix of the VAR, using maximum likelihood to estimate ΩR and ΩE. Conditional on these estimates, and given a calibration for γ, we can construct Ωt. Conditional on Ωt, we can obtain starting values for ∏R(L) and ∏E(L).
Given a calibration for the initial (diagonal) matrix ΩΨ, a scale factor is adjusted so as to generate an acceptance rate close to 0.3, a typical choice for this kind of simulation (Canova 2007). We employ N = 50,000 draws for our estimates, and retain the last 20 per cent for inference. Checks performed with N = 200,000 draws delivered very similar results.
As shown by CH, the mean of the generated chain, Ψ̄ = N⁻¹ΣnΨ(n), is a consistent estimate of Ψ under standard regularity assumptions on maximum likelihood estimators. Moreover, the covariance matrix of Ψ is given by N⁻¹Σn(Ψ(n) − Ψ̄)(Ψ(n) − Ψ̄)′, that is, the variance of the estimates in the generated chain.
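Steps 1 and 2 amount to a random walk Metropolis–Hastings sampler on the likelihood surface. A minimal sketch follows, with an illustrative scalar Gaussian log-likelihood and placeholder tuning values rather than the paper's actual model:

```python
import numpy as np

def ch_chain(loglike, psi0, n_draws, scale, rng):
    """Chernozhukov-Hong (2003) style sampler: a random-walk
    Metropolis-Hastings chain on a (quasi-)likelihood. The chain
    mean is the point estimate; the chain variance measures
    parameter uncertainty."""
    psi = np.atleast_1d(np.asarray(psi0, dtype=float))
    ll = loglike(psi)
    chain = np.empty((n_draws, psi.size))
    for i in range(n_draws):
        cand = psi + scale * rng.standard_normal(psi.size)  # Step 1
        ll_cand = loglike(cand)
        # Step 2: accept with probability min{1, L(cand)/L(current)}
        if np.log(rng.uniform()) < ll_cand - ll:
            psi, ll = cand, ll_cand
        chain[i] = psi
    return chain
```

Writing the acceptance step in log-likelihood differences avoids numerical over- or underflow of the ratio L(Θ(n))/L(Ψ(n)).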
A.3 Generalised Impulse Response Functions
We compute the generalised impulse response functions from our STVAR model by following the approach proposed by Koop et al (1996). The algorithm features the following steps.
- Consider the full set of available observations, t = 1962:M7, …, 2008:M6, with sample size T = 552, and construct the set Λ of all possible histories λi of length p = 12. Λ will contain T − p + 1 histories λi.[24]
- Separate the set of all recessionary histories from that of all expansionary histories. For each history λi, calculate the transition variable z̄i. If z̄i ≤ −1.01 per cent, then λi ∈ ΛR, where ΛR is the set of all recessionary histories. If z̄i > −1.01 per cent, then λi ∈ ΛE, where ΛE is the set of all expansionary histories.
- Select at random one history λi ∈ ΛR. For the selected history, take the variance-covariance matrix Ωλi obtained as:

Ωλi = F(zλi)ΩR + (1 − F(zλi))ΩE
- Cholesky-decompose the estimated variance-covariance matrix Ωλi:

Ωλi = CC′
- From the orthogonalised residuals ηt = C⁻¹ut, draw with replacement h eight-dimensional shocks and get the vector of bootstrapped shocks

η(j) = [η(j)1, η(j)2, …, η(j)h] (A1)
- Form another set of bootstrapped shocks that will be equal to Equation (A1) except for the kth shock, which is the shock we want to perturb by an amount equal to δ. Denote the vector of bootstrapped perturbed shocks by η(j)δ.
- Transform back η(j) and η(j)δ into residuals as follows:

u(j) = Cη(j) (A2)

u(j)δ = Cη(j)δ (A3)
- Use Equations (A2) and (A3) to simulate the evolution of X(j) and X(j)δ and construct GIRF(j)(h, δ, λi) as X(j)δ − X(j).
- Conditional on history λi, repeat for j = 1,…,B vectors of bootstrapped residuals and get GIRF(1) (h, δ, λi), GIRF(2) (h, δ, λi),…,GIRF(B) (h, δ, λi). Set B = 500.
- Calculate the GIRF conditional on history λi as the average over the B draws:

GIRF(h, δ, λi) = B⁻¹ΣjGIRF(j)(h, δ, λi)
- Repeat all previous steps for i = 1, …, 500 histories drawn from the set of recessionary histories ΛR.
- Take the average across these histories and get GIRF(h, δ, ΛR), which is the average GIRF under recessions.
- Repeat steps 3 to 12 for 500 histories belonging to the set of all expansionary histories and get GIRF(h, δ, ΛE).
- The computation of the 68 per cent confidence bands for our impulse responses is undertaken by picking, for each horizon in each state, the 16th and the 84th percentiles of the densities of GIRF(h, δ, ΛR) and GIRF(h, δ, ΛE).[25]
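The GIRF logic above can be illustrated on a simple linear VAR(1); the matrices, shock size and names below are placeholders. In this linear sketch the shocked-minus-baseline path is deterministic, so the bootstrap average is degenerate; in the STVAR it is not, which is why the algorithm averages over B shock draws and 500 histories per state.

```python
import numpy as np

def girf(A, C, eta_pool, history, k, delta, horizon, n_boot, rng):
    """Koop, Pesaran and Potter (1996) style GIRF for the VAR(1)
    X_t = A X_{t-1} + C eta_t, with C the Cholesky factor of the
    residual variance-covariance matrix.

    Perturbs the k-th orthogonalised shock at impact by delta and
    averages (shocked path - baseline path) over shock sequences
    drawn with replacement from eta_pool."""
    p = A.shape[0]
    out = np.zeros((horizon + 1, p))
    for _ in range(n_boot):
        draws = eta_pool[rng.integers(0, len(eta_pool), horizon + 1)]
        shocked = draws.copy()
        shocked[0, k] += delta            # perturb the k-th shock at impact
        x_base = history.copy()
        x_shock = history.copy()
        for h in range(horizon + 1):
            x_base = A @ x_base + C @ draws[h]
            x_shock = A @ x_shock + C @ shocked[h]
            out[h] += (x_shock - x_base) / n_boot
    return out
```

In the linear case the GIRF collapses to δ times the usual orthogonalised impulse response, A^h·C·ek·δ, independently of history and shock draws; state dependence only arises once the coefficients or covariances vary with the regime.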
Footnotes
This section draws heavily on Auerbach and Gorodnichenko's (2012) ‘Appendix: Estimation Procedure’. [23]
The choice p = 12 is due to the number of moving average terms (12) of our transition variable, zt. [24]
We consider the distribution of parameters rather than their mean values to allow for parameter uncertainty, as suggested by Koop et al (1996). [25]