Economics

The econometrics of Chapter 10 was built on a single quiet assumption: the observations are independent draws from a fixed population. Randomize a treatment, find an instrument, exploit a discontinuity, and the i.i.d. sampling that justifies the asymptotics takes care of the rest. Most of the data a macroeconomist or a finance researcher actually works with does not arrive that way. It arrives in order. Quarterly GDP, monthly inflation, the overnight policy rate, daily equity returns: each observation sits next to its neighbors in time, and what happened last quarter is part of what determines this one.

That ordering changes what follows. Serial dependence means consecutive observations carry overlapping information, so the effective sample is smaller than the count of data points suggests and, when the dependence is positive (the usual case in macroeconomic data), the standard-error formulas of cross-sectional OLS understate uncertainty. Worse, many economic series trend: they wander upward over decades with no fixed mean to return to. Run an ordinary regression on two such series and you can get a high $R^2$ and a decisive $t$-statistic linking quantities that have nothing to do with each other. The cross-sectional toolkit does not merely lose efficiency on ordered data; on trending data it manufactures relationships that are not there.

A different apparatus is needed, organized around one question (what kind of process generated this series?) and a second, multivariate question once several series are in play: which shocks moved the system, and can we even say? The answer to the first builds from stationarity through ARMA models to unit-root tests. The answer to the second builds from the vector autoregression to its structural interpretation, where the chapter meets its sharpest problem: the data describe how variables move together but never say which one moved first.

After completing this chapter, you will be able to:

Define covariance stationarity and explain why it is the precondition for standard time-series inference
Identify AR and MA processes from their autocorrelation and partial-autocorrelation signatures and forecast from an ARMA model
Distinguish difference-stationary from trend-stationary series, recognize a unit root, and explain the spurious-regression trap
Specify a reduced-form VAR, read a Granger-causality result, and interpret an impulse response and a variance decomposition
Explain why a reduced-form VAR is not structural and how an identifying restriction (Cholesky ordering, sign restrictions) changes the answer
State what cointegration is, write the error-correction model that corresponds to a cointegrating relationship, and interpret the speed of adjustment
Recognize volatility clustering and explain how ARCH/GARCH models make conditional variance forecastable

Prerequisites: the identification frame, OLS, and the serial-correlation note of Chapter 10 (Econometrics Foundations); linear algebra (the VAR is matrix-valued); basic probability. The stochastic-process intuition is built here, not assumed.

The methods below were not always part of economics. From the 1970s through the 1980s the discipline learned to take the time dimension seriously, largely through the work of Clive Granger and Robert Engle on long-run relationships and volatility and of Christopher Sims on multivariate dynamics; their program reshaped how macroeconomists handle data. That intellectual lineage (who arrived at these ideas, and against what) belongs to the history-of-economic-thought volume rather than here, and is traced in its chapter on the information-economics and game-theory era.

23.1 Stationarity and the Building Blocks

Before any model can be fit, the series has to be the kind of object a model can describe. The organizing idea is stationarity. A stationary process is one whose probabilistic character is stable over time: the same mechanism is generating the data at the start of the sample as at the end. Formally, covariance stationarity (Eq. 23.1) requires a constant mean, a constant finite variance, and autocovariances $\gamma_k$ that depend only on the lag $k$ and not on the date $t$. If that is true, the past is informative about the future in a way that does not itself shift; if it is not, the parameters being estimated are aimed at a moving target.

The lag operator $L$, defined by $L y_t = y_{t-1}$, turns dynamics into algebra. A polynomial $\phi(L)=1-\phi_1 L-\cdots-\phi_p L^p$ acting on $y_t$ encodes a whole autoregression in one symbol, and the process is stationary exactly when all roots of $\phi(z)=0$ lie outside the unit circle. When a root sits on the unit circle, stationarity fails; that is the unit root of §23.3.
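The root condition can be checked numerically. The following is a minimal sketch for the AR(2) case, where $\phi(z)=1-\phi_1 z-\phi_2 z^2$ is a quadratic; the function names and the coefficient values are illustrative assumptions, not taken from the chapter.

```python
import cmath

def ar2_roots(phi1, phi2):
    """Roots of phi(z) = 1 - phi1*z - phi2*z**2 = 0 (requires phi2 != 0)."""
    # Rewrite as phi2*z^2 + phi1*z - 1 = 0 and apply the quadratic formula.
    disc = cmath.sqrt(phi1 ** 2 + 4.0 * phi2)
    return ((-phi1 + disc) / (2.0 * phi2), (-phi1 - disc) / (2.0 * phi2))

def is_stationary(phi1, phi2):
    """True when both roots lie strictly outside the unit circle."""
    return all(abs(z) > 1.0 for z in ar2_roots(phi1, phi2))

# Hypothetical coefficient pairs chosen to show each case.
for phi1, phi2 in [(0.5, 0.3), (1.2, -0.5), (0.5, 0.5), (0.9, 0.2)]:
    moduli = [round(abs(z), 3) for z in ar2_roots(phi1, phi2)]
    print(phi1, phi2, "root moduli:", moduli, "stationary:", is_stationary(phi1, phi2))
```

The pair $(0.5, 0.5)$ has a root at exactly $z=1$, so it is the unit-root boundary case; the pair $(1.2, -0.5)$ has complex roots, which is why the check compares moduli rather than real values.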

Wold's decomposition theorem (Eq. 23.2) says that any covariance-stationary process can be written as $y_t = \sum_{j=0}^{\infty}\psi_j\varepsilon_{t-j} + \kappa_t$. Here $\{\varepsilon_t\}$ is white noise, the $\psi_j$ are square-summable weights with $\psi_0=1$, and $\kappa_t$ is the part of $y_t$ that is perfectly predictable from its own past (a deterministic trend or seasonal). The square-summability of the weights is what keeps the variance finite, and it is the formal content of "the influence of an old shock fades."

Intuition mode

Why this matters: A stationary process is one whose statistical character doesn't drift. Slide a window along the series and the picture inside it looks the same. The mean you would estimate from the first decade matches the mean from the last; the size of the typical wiggle is the same throughout; how strongly today relates to last month doesn't depend on which month. White noise is the purest case: pure surprise, no memory. And Wold's result says something reassuring about everything else, because every well-behaved series is just a stream of those surprises, with old ones fading as new ones arrive. The figure below lets you watch a series stay tethered to its mean, and lets you push it toward the edge where the tether snaps.

Once a series is recognized as stationary, the next question is how to model it. Two primitive ways a series can remember its past combine into a single family.

23.2 ARMA Processes

A stationary series remembers, and there are two clean ways to write down what it remembers. It can carry forward its own past values, so that high output last quarter raises expected output this quarter. Or it can carry forward the echoes of past surprises, a shock to the system that takes a few periods to work through. The first is autoregression; the second is moving average. Combine them and you have the workhorse model of univariate time series.

The ACF and PACF are the diagnostic instruments of the Box-Jenkins model-building approach, and the rule that makes them useful is a contrast in how they fall off. A pure AR process has a PACF that cuts off sharply after lag $p$ (once you control for the first $p$ lags, nothing further adds predictive content) while its ACF decays gradually. A pure MA process is the mirror image: its ACF cuts off after lag $q$, while its PACF decays. The cut-off-versus-decay pattern is how a practitioner reads the order of a model off the data.
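The cut-off-versus-decay contrast can be seen without any data by starting from theoretical autocorrelations. The sketch below assumes the standard Durbin-Levinson recursion to turn an ACF into a PACF; the parameter values ($\phi=0.7$ for the AR(1), $\theta=0.6$ for the MA(1)) are hypothetical.

```python
def pacf_from_acf(rho, max_lag):
    """Durbin-Levinson recursion: partial autocorrelations from autocorrelations.

    rho[k] is the autocorrelation at lag k, with rho[0] == 1.
    Returns [phi_11, phi_22, ..., phi_KK] for K = max_lag.
    """
    pacf = []
    prev = []  # prev[j] holds phi_{k-1, j+1} from the previous order
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        # phi_{k,j} = phi_{k-1,j} - phi_kk * phi_{k-1,k-j}
        cur = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
        cur.append(phi_kk)
        pacf.append(phi_kk)
        prev = cur
    return pacf

MAX_LAG = 5
phi, theta = 0.7, 0.6  # hypothetical AR(1) and MA(1) parameters

# AR(1): the ACF decays geometrically, rho_k = phi**k.
ar1_acf = [phi ** k for k in range(MAX_LAG + 1)]
# MA(1): the ACF is theta / (1 + theta^2) at lag 1 and zero afterwards.
ma1_acf = [1.0, theta / (1.0 + theta ** 2)] + [0.0] * (MAX_LAG - 1)

ar1_pacf = pacf_from_acf(ar1_acf, MAX_LAG)
ma1_pacf = pacf_from_acf(ma1_acf, MAX_LAG)

print("AR(1) ACF :", [round(v, 3) for v in ar1_acf[1:]])
print("AR(1) PACF:", [round(v, 3) for v in ar1_pacf])
print("MA(1) ACF :", [round(v, 3) for v in ma1_acf[1:]])
print("MA(1) PACF:", [round(v, 3) for v in ma1_pacf])
```

For the AR(1) the PACF is nonzero only at lag 1 while the ACF tails off; for the MA(1) the roles reverse, with the PACF shrinking in magnitude and alternating in sign. With real data the same pattern is read from sample estimates, which are noisy, so the cut-off is judged against confidence bands rather than exact zeros.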

Forecasting from an ARMA model follows from the lag-operator algebra. The one-step-ahead forecast sets future innovations to their expected value of zero and rolls the estimated coefficients forward; the multi-step forecast iterates that recursion, and because the process is stationary the forecast converges to the unconditional mean $\mu$ as the horizon grows, with the forecast-error variance rising to the unconditional variance.
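The forecast recursion is short enough to write out for the AR(1) case, $y_t = c + \phi y_{t-1} + \varepsilon_t$. This is a sketch under assumed parameter values (the intercept, coefficient, innovation variance, and last observation are all hypothetical), meant only to show the convergence of the forecast to $\mu$ and of the forecast-error variance to the unconditional variance.

```python
def ar1_forecast(y_last, c, phi, sigma2, horizon):
    """Iterate the AR(1) forecast recursion; return (forecasts, error variances)."""
    forecasts, variances = [], []
    y_hat, var = y_last, 0.0
    for _ in range(horizon):
        y_hat = c + phi * y_hat        # future innovation replaced by its mean, zero
        var = sigma2 + phi ** 2 * var  # each extra step adds one more unseen shock
        forecasts.append(y_hat)
        variances.append(var)
    return forecasts, variances

c, phi, sigma2 = 1.0, 0.8, 1.0        # hypothetical parameters
mu = c / (1.0 - phi)                  # unconditional mean
gamma0 = sigma2 / (1.0 - phi ** 2)    # unconditional variance

forecasts, variances = ar1_forecast(y_last=9.0, c=c, phi=phi, sigma2=sigma2, horizon=40)

for h in (1, 2, 5, 10, 40):
    print(h, round(forecasts[h - 1], 4), round(variances[h - 1], 4))
print("unconditional mean and variance:", round(mu, 4), round(gamma0, 4))
```

The gap between the forecast and $\mu$ shrinks by the factor $\phi$ each step, so a coefficient closer to one means slower convergence.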

Intuition mode

Why this matters: An AR series carries its own past forward: today's level is a faded copy of yesterday's. An MA series carries the echoes of past surprises, a shock that rings for a few periods and then is gone. They leave different fingerprints, and the two diagnostic plots read those fingerprints: for an AR process the partial-autocorrelation plot drops to zero abruptly while the plain autocorrelation tails off; for an MA process it is the other way around. You are not learning a formula so much as learning to recognize two shapes. Forecasting then needs no new idea. With no fresh surprise to expect, the best guess for the far future is just the long-run average, and the model tells you how fast you get there. Set the sliders below to a pure AR and watch one plot snap to zero; switch to a pure MA and watch the other one do it.

All of this assumed the series was stationary. The diagnostic plots, the forecasts, the very meaning of the coefficients: each rests on a mean to return to and a finite variance. What happens when the series has neither?

23.3 Unit Roots and Spurious Regression

Push the AR(1) coefficient all the way to one and the machinery of §23.2 breaks. The series no longer has a mean to return to. Figure 23.1, dragged across $\phi = 1$, showed it directly: below one the series is tethered and a shock decays; at exactly one the tether snaps and a shock becomes permanent. That boundary case is the random walk, the most important non-stationary process in economics because so many macro series behave like it.

The reason a unit root cannot be ignored is the trap it sets for ordinary regression. Take two random walks generated completely independently of each other, with no causal link, no common driver, nothing. Regress one on the other and, far more often than chance should allow, you will find a large $R^2$ and a $t$-statistic that comfortably clears any conventional threshold. The regression announces a strong relationship between series that share nothing. This is the spurious-regression result that Granger and Newbold warned of in 1974, and it overturns the instinct that high $R^2$ plus a significant $t$ means something real.
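A small Monte Carlo experiment makes the trap concrete. The sketch below is an illustration in the spirit of the Granger-Newbold experiment, not a reproduction of it: the sample length, the number of replications, and the seed are arbitrary assumptions. It regresses one independent random walk on another and counts how often the conventional $|t|>1.96$ rule declares significance, then repeats the exercise on first differences.

```python
import math
import random

def slope_t_stat(x, y):
    """OLS of y on a constant and x; return the conventional t-statistic on the slope."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    ssr = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(ssr / (n - 2) / sxx)
    return b / se

def random_walk(n, rng):
    """Cumulated standard-normal shocks starting from zero."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path

def diff(series):
    return [b - a for a, b in zip(series, series[1:])]

def rejection_rates(n_obs=200, n_reps=300, seed=0):
    """Share of replications with |t| > 1.96, in levels and in first differences."""
    rng = random.Random(seed)
    rej_levels = rej_diffs = 0
    for _ in range(n_reps):
        x, y = random_walk(n_obs, rng), random_walk(n_obs, rng)  # independent by construction
        rej_levels += abs(slope_t_stat(x, y)) > 1.96
        rej_diffs += abs(slope_t_stat(diff(x), diff(y))) > 1.96
    return rej_levels / n_reps, rej_diffs / n_reps

levels_rate, diffs_rate = rejection_rates()
print("rejection rate in levels:     ", levels_rate)
print("rejection rate in differences:", diffs_rate)
```

A correctly sized 5% test should reject about one time in twenty. In levels the rejection rate is typically many times that, while in differences it falls back near the nominal level, which is the practical case for differencing a series that has a unit root.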

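The augmented Dickey-Fuller (ADF) test makes the question formal by estimating the regression $\Delta y_t = \alpha + \gamma y_{t-1} + \sum_{i=1}^{k}\delta_i\Delta y_{t-i} + \varepsilon_t$ (Eq. 23.7). Under $H_0:\gamma=0$ the level $y_{t-1}$ drops out and $\Delta y_t$ is driven only by its own lagged differences and noise: a unit root. Rejecting $H_0$ in favor of $\gamma<0$ means a deviation from the level is partly pulled back, i.e. the series is stationary. Because $y_{t-1}$ is non-stationary under the null, the $t$-ratio on $\gamma$ does not follow the usual $t$ distribution and is compared with Dickey-Fuller critical values instead. The lagged differences are the "augmentation" that absorbs serial correlation in $\varepsilon_t$ so the test is valid; the number of lags $k$ is chosen by an information criterion.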
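The ADF regression $\Delta y_t = \alpha + \gamma y_{t-1} + \sum_{i=1}^{k}\delta_i\Delta y_{t-i} + \varepsilon_t$ is an ordinary least-squares regression with a non-standard reference distribution, so it can be sketched by hand. The code below is a teaching sketch, not a replacement for a library routine: the lag length is fixed at one rather than chosen by an information criterion, the simulated series and seed are assumptions, and the 5% critical value of about $-2.86$ is the large-sample Dickey-Fuller value for a regression with a constant and no trend.

```python
import math
import random

def ols_t_stats(X, y):
    """OLS via Gauss-Jordan on the normal equations; returns (coefficients, t-statistics)."""
    n, k = len(X), len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    # Augmented system [X'X | X'y | I] reduces to [I | beta | (X'X)^-1].
    aug = [xtx[i] + [xty[i]] + [1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    beta = [aug[i][k] for i in range(k)]
    inv_diag = [aug[i][k + 1 + i] for i in range(k)]
    resid = [yi - sum(b * xi for b, xi in zip(beta, r)) for r, yi in zip(X, y)]
    s2 = sum(e * e for e in resid) / (n - k)
    return beta, [b / math.sqrt(s2 * d) for b, d in zip(beta, inv_diag)]

def adf_tau(y, n_lags=1):
    """t-ratio on gamma in dy_t = alpha + gamma*y_{t-1} + sum_i delta_i*dy_{t-i} + e_t."""
    dy = [b - a for a, b in zip(y, y[1:])]  # dy[t] = y[t+1] - y[t]
    rows, target = [], []
    for t in range(n_lags, len(dy)):
        # Regressors: constant, lagged level y[t], then n_lags lagged differences.
        rows.append([1.0, y[t]] + [dy[t - i] for i in range(1, n_lags + 1)])
        target.append(dy[t])
    _, t_stats = ols_t_stats(rows, target)
    return t_stats[1]

def simulate_ar1(phi, n, rng):
    y, level = [], 0.0
    for _ in range(n):
        level = phi * level + rng.gauss(0.0, 1.0)
        y.append(level)
    return y

DF_CRITICAL_5PCT = -2.86  # approximate large-sample value, constant and no trend
rng = random.Random(1)
tau_walk = adf_tau(simulate_ar1(1.0, 500, rng))        # unit root: null is true
tau_stationary = adf_tau(simulate_ar1(0.5, 500, rng))  # stationary: null is false

print("random walk    tau:", round(tau_walk, 2))
print("stationary AR  tau:", round(tau_stationary, 2))
print("reject unit root for stationary series:", tau_stationary < DF_CRITICAL_5PCT)
```

The decision rule is one-sided: the unit-root null is rejected only when the statistic is more negative than the critical value. For a true random walk the statistic will usually sit above that threshold, though by construction it falls below it in roughly 5% of samples.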

Intuition mode

Why this matters: A random walk has no gravity. A stationary series is held on a tether, so pull it away from its center and it springs back; a random walk has nothing pulling it home, so a shock today never washes out and the series just wanders wherever the accumulated shocks take it. That is what a unit root means: permanence instead of decay. And here is the trap. Two wanderers, set loose independently, will both drift somewhere over a long sample, and any two things that drift will look correlated, because a line through the cloud always slopes one way or the other. Your regression will report a strong relationship between two series that have never met. The fix is to ask the right question first, whether this series is tethered or wandering, using a formal unit-root test, and to study wanderers in their changes (differences) rather than their levels. The figure below makes the trap visceral: regenerate the two independent walks and watch a "significant" relationship keep reappearing out of nothing.

A standing example. Is US real GDP a random walk? Fit an AR(1) to its logarithm and the estimated coefficient comes back very close to one; run an ADF test and it typically fails to reject the unit-root null. The practical upshot is that output is better modeled in growth rates (first differences) than in levels. The historical episode this draws on is the province of the economic-history volume, but the apparatus is the point here.

So far one series at a time. Macroeconomic questions are usually about several series at once, such as output and inflation, or the interest rate and the exchange rate, moving together. The single-equation tools generalize into a system.

23.4 Vector Autoregressions (VAR)

Most interesting macroeconomic questions involve more than one variable. Inflation and the policy rate respond to each other; output, prices, and money move as a system. The vector autoregression is the natural generalization of the AR model to this setting: stack the variables into a vector and let each variable depend on the recent past of all of them.

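The reduced-form VAR($p$) is $\mathbf{y}_t = \mathbf{c} + A_1\mathbf{y}_{t-1} + \cdots + A_p\mathbf{y}_{t-p} + \mathbf{u}_t$ (Eq. 23.8), where $\mathbf{u}_t$ is a vector of innovations with covariance matrix $\Sigma$. Because every equation has the same regressors (the lags of all variables), OLS applied equation by equation is efficient, and no system estimator is required for the reduced form. The estimated $A_i$ matrices and the residual covariance $\Sigma$ summarize the dynamics.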
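Equation-by-equation estimation can be sketched on simulated data. The example below assumes a two-variable VAR(1) with made-up coefficients and innovations that are correlated across equations by construction; it then runs one OLS regression per equation on the same regressor matrix and recovers both the coefficient matrix and the residual covariance.

```python
import random

def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y, by Gauss-Jordan elimination."""
    k = len(X[0])
    a = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

A_TRUE = [[0.5, 0.1], [0.2, 0.3]]  # hypothetical VAR(1) coefficient matrix

def simulate_var1(n, rng):
    y = [[0.0, 0.0]]
    for _ in range(n):
        e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        u = [e1, 0.5 * e1 + e2]  # innovations correlated across the two equations
        prev = y[-1]
        y.append([A_TRUE[i][0] * prev[0] + A_TRUE[i][1] * prev[1] + u[i] for i in range(2)])
    return y

rng = random.Random(42)
data = simulate_var1(5000, rng)

# The same regressors (constant plus one lag of both variables) enter every equation.
X = [[1.0, row[0], row[1]] for row in data[:-1]]
coefs = [ols(X, [row[i] for row in data[1:]]) for i in range(2)]  # one OLS per equation
A_hat = [c[1:] for c in coefs]

# Residual covariance matrix from the stacked residuals.
resid = [[data[t + 1][i] - sum(b * x for b, x in zip(coefs[i], X[t])) for i in range(2)]
         for t in range(len(X))]
dof = len(X) - 3
Sigma_hat = [[sum(u[i] * u[j] for u in resid) / dof for j in range(2)] for i in range(2)]

print("A_hat    :", [[round(v, 3) for v in row] for row in A_hat])
print("Sigma_hat:", [[round(v, 3) for v in row] for row in Sigma_hat])
```

The off-diagonal element of the estimated residual covariance is clearly nonzero here because the simulated innovations share a common component. That cross-equation correlation is what makes a "shock to one equation" ambiguous in the reduced form.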

A reduced-form VAR delivers two things almost for free. It forecasts the whole system jointly, often better than a structural model because it imposes few restrictions; and it tests Granger causality, telling you which series carry predictive content for which others. The impulse response and the variance decomposition seem to promise a third thing: a story about what happens after a shock. But there is a catch the prose has been signposting. The innovations $\mathbf{u}_t$ are correlated across equations. A movement in the inflation residual tends to come alongside a movement in the rate residual. So when the system "responds to a shock," whose shock was it? The reduced form cannot say.

Intuition mode

Why this matters: A VAR lets every variable depend on the recent past of all the others. That buys two things immediately: a joint forecast of the whole system, and a test of which series helps predict which, called "Granger causality." But notice the careful word: helps predict is not causes. Knowing that ice-cream sales help predict drownings does not mean ice cream causes drowning; summer drives both. A VAR can tell you the policy rate helps predict inflation, and that is genuinely useful, but it stops short of saying the rate moved inflation. The reason it stops short is that the surprises in the different equations arrive together, tangled with one another, so "the system's response to a shock" is ambiguous until you untangle which shock you mean. That untangling is the next section, and it is the hardest problem in the chapter.

The impulse responses you just generated assumed an ordering, a choice about which variable can move the other within the period. That choice was made quietly. Bringing it into the open is the structural step, and it is where the data stop being able to settle the argument.

23.5 Structural VAR and Identification

The reduced-form residuals are correlated, and that correlation is the whole problem. An economic shock, a surprise tightening of monetary policy, say, should be a single disturbance with a clear interpretation. But the reduced-form innovation in the rate equation is contaminated by whatever moved inflation at the same instant, and vice versa. The residuals are mixtures. To recover the underlying structural shocks, you have to specify how the mixtures were formed, and the data do not contain that information.

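Write the reduced-form residuals as a linear mixture of uncorrelated, unit-variance structural shocks: $\mathbf{u}_t = B\boldsymbol{\varepsilon}_t$ with $E[\boldsymbol{\varepsilon}_t\boldsymbol{\varepsilon}_t']=I$ (Eq. 23.9), so that $\Sigma=BB'$. For a two-variable system, $\Sigma=BB'$ supplies three equations (two variances and one covariance) for the four elements of $B$. One restriction is missing. Imposing it is identification, and the impulse responses at horizon $h$, given by $\Theta_h = A^h B$ (written here for a VAR(1) with coefficient matrix $A$), inherit whatever was imposed.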
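The effect of the missing restriction can be shown with a deterministic calculation. The sketch below assumes a two-variable VAR(1) in the order (inflation, rate) with a made-up coefficient matrix $A$ and residual covariance $\Sigma$; it builds two candidate $B$ matrices by Cholesky factorization under the two possible orderings and computes $\Theta_h = A^h B$ for each.

```python
import math

def cholesky_2x2(s):
    """Lower-triangular B with B B' = s, for a 2x2 positive-definite s."""
    b11 = math.sqrt(s[0][0])
    b21 = s[1][0] / b11
    b22 = math.sqrt(s[1][1] - b21 ** 2)
    return [[b11, 0.0], [b21, b22]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def transpose(m):
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]

def swap(m):
    """Permute rows and columns of a 2x2 matrix so that variable 2 comes first."""
    return [[m[1][1], m[1][0]], [m[0][1], m[0][0]]]

def impulse_responses(A, B, horizon):
    """Theta_h = A^h B for h = 0..horizon (VAR(1) case)."""
    out, power = [], [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(horizon + 1):
        out.append(matmul(power, B))
        power = matmul(A, power)
    return out

# Hypothetical reduced form; variable order is [inflation, rate].
A = [[0.7, -0.1], [0.2, 0.8]]
Sigma = [[1.0, 0.5], [0.5, 2.0]]

# "Inflation first": the rate shock cannot move inflation within the period.
B_infl_first = cholesky_2x2(Sigma)
# "Rate first": factor the reordered system, then map back to the original order.
B_rate_first = swap(cholesky_2x2(swap(Sigma)))

irf_infl_first = impulse_responses(A, B_infl_first, 8)
irf_rate_first = impulse_responses(A, B_rate_first, 8)

# Response of inflation (row 0) to a rate shock (column 1) under each ordering.
for h in range(5):
    print(h, round(irf_infl_first[h][0][1], 3), round(irf_rate_first[h][0][1], 3))
```

Both $B$ matrices reproduce the same $\Sigma$ exactly, so the data cannot choose between them. Yet the response of inflation to a rate shock is zero on impact under one ordering and positive under the other, and the two paths remain different at later horizons.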

Three families of restriction are in common use, and Figure 23.4 lets you feel the consequence of the most common one. Toggle the Cholesky ordering between "rate first" and "inflation first" and the impulse responses visibly change shape. Nothing in the data changed: the same residuals, the same estimated dynamics. What changed is the assumption about which variable can move the other within the period, and that assumption rewrote the economic story. Short-run zero restrictions like Cholesky are one option; long-run restrictions, which assume a shock has no permanent effect on some variable (Blanchard and Quah's decomposition of demand and supply shocks is the standard example), are another, named here but not derived; sign restrictions are the modern alternative that reports the identified set rather than a single line.

This is where the chapter's framing tension lives. Christopher Sims, who introduced the VAR to macroeconomics, argued that the elaborate identifying restrictions of older structural models were "incredible," assumptions imposed for tractability rather than belief, and that an honest empirical macro should let the data speak with as few restrictions as possible. The opposing view holds that without some economic structure the impulse responses are uninterpretable, so the right move is to impose restrictions you can defend and be explicit about them. Both positions are serious, and the toggle you just used is exactly what they disagree about: how much identification the data can honestly support, and what to do about the gap.

Which side has the better of the argument is not settled here. The apparatus (what identification is, why the choice matters, what each scheme buys and gives up) is the chapter's job, and it is now in hand. The verdict over whether atheoretical VARs or structural identification won the field is argued at length in the walkthrough on econometric-methodology credibility, where the methodological stakes get the space they need.

Intuition mode

Why this matters: The data hand you correlated shocks: the surprises in inflation and in the interest rate arrive tangled together. To tell an economic story you have to say which way the arrow points within the same period: did the rate move first and inflation follow, or the reverse? Economics has to supply that arrow, because the data cannot. They only show the two moving together, not the order. And here is the uncomfortable part: that single added assumption, the one the data can't check, decides the whole answer. In the figure, flipping the ordering kept every number in the dataset identical and still flipped the story. Sims's worry was that economists were smuggling in arrows they couldn't justify and calling the result a finding. The other camp says you can't avoid choosing an arrow, so choose one you can defend and say so out loud. The honest takeaway is the tension itself, which is why the verdict is argued in a walkthrough, not pronounced here.

Did the data ever speak for themselves?

Apparatus stop. The VAR/SVAR identification problem you just met is the technical core of the methodological-credibility debate; this is where the walkthrough gets its machinery.

What the model explains

The Cholesky-ordering toggle in Figure 23.4 is the whole methodological argument in one gesture: identical data, a different identifying assumption, a different impulse response. Sims's atheoretical-VAR program was a bid to minimize such untestable assumptions; the structural-identification reply is that some assumption is unavoidable and the discipline should make defensible ones explicitly. The reduced-form VAR (§23.4), the $\mathbf{u}_t = B\boldsymbol{\varepsilon}_t$ mapping (§23.5), and the menu of restrictions (Cholesky, long-run, sign) are the apparatus on which that argument is conducted.

Verdict (at this level)

This chapter teaches what identification is and shows that the choice changes the answer; it deliberately stops short of declaring who won. The walkthrough takes the apparatus and argues the verdict: whether the credibility revolution vindicated the design-based skeptics, whether structural macro earned its restrictions back, and where time-series identification sits in that story.


The systems so far have been built from stationary or differenced series. But differencing can throw away something real: when two series wander together, taking their changes erases the relationship that ties them. The next section recovers it.

23.6 Cointegration and Error Correction

Recall the spurious-regression warning: two independent wanderers look related, and the fix is to study them in differences. But sometimes two wanderers are genuinely tethered, not to a fixed mean each, but to each other. Short-term and long-term interest rates drift over the decades, yet the spread between them stays within a band. Consumption and income each trend upward, yet the ratio is stable. Differencing such a pair would difference away exactly the long-run relationship that matters. Cointegration is the apparatus for the case where the relationship is real.

When $y_t$ and $x_t$ are each I(1) but the combination $z_t = y_t - \beta x_t$ is I(0) (Eq. 23.11), the two series are cointegrated, and the error-correction model (Eq. 23.12) is $\Delta y_t = \lambda(y_{t-1}-\beta x_{t-1}) + \cdots + \varepsilon_t$, where the term in parentheses is last period's deviation from the long-run relationship and the dots stand for lagged differences. The error-correction coefficient $\lambda$ is the most interpretable number in this section. If $\lambda = -0.2$, then a fifth of any deviation from the long-run relationship is undone each period; if $\lambda = 0$ (and $x_t$ does not adjust either), there is no pull and the series are not cointegrated at all, so the "relationship" is the spurious kind from §23.3. The Granger representation theorem is what makes this respectable: it guarantees that whenever genuine cointegration is present, an error-correction model is the right way to write the dynamics, so estimating the ECM is not an ad hoc add-on but the implied form. The Johansen procedure extends the idea to systems with possibly several equilibrium relations, and its rank test answers how many; the mechanics of the eigenvalue computation are beyond the scope here, but the meaning of the rank, the count of long-run ties, is the part to carry forward.
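The speed-of-adjustment reading of $\lambda$ can be checked with a few lines of arithmetic. The sketch below makes two simplifying assumptions that are not in the text: $x_t$ is held fixed and no new shocks arrive, so that $\Delta y_t = \lambda(y_{t-1}-\beta x_{t-1})$ implies the deviation $z_t = y_t - \beta x_t$ shrinks by the factor $(1+\lambda)$ each period.

```python
import math

def deviation_path(z0, lam, periods):
    """Deviation z_t = y_t - beta*x_t with x fixed and no new shocks.

    From dy_t = lam * z_{t-1}, the deviation obeys z_t = (1 + lam) * z_{t-1}.
    """
    path = [z0]
    for _ in range(periods):
        path.append((1.0 + lam) * path[-1])
    return path

def half_life(lam):
    """Periods until half of a deviation is corrected; requires -1 < lam < 0."""
    return math.log(0.5) / math.log(1.0 + lam)

path = deviation_path(z0=1.0, lam=-0.2, periods=10)
no_pull = deviation_path(z0=1.0, lam=0.0, periods=10)

print("lambda = -0.2:", [round(z, 3) for z in path[:6]])
print("lambda =  0.0:", [round(z, 3) for z in no_pull[:6]])
print("half-life at lambda = -0.2:", round(half_life(-0.2), 2), "periods")
```

With $\lambda=-0.2$ a unit deviation falls to $0.8$ after one period and is half gone after about three periods. With $\lambda=0$ the deviation never closes, which is the no-cointegration case.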

Intuition mode

Why this matters: Two series can each wander forever and yet never drift apart from each other, like two dogs off the leash but tied together by a short rope. Each goes where it likes; the rope keeps the distance between them from growing. That shared distance is the long-run equilibrium, and the speed at which the rope yanks them back when they stray is the error-correction term. This is the exact opposite of the spurious-regression trap: there, two wanderers only looked related; here, they truly are, and the rope is real. Earlier the advice was to difference wanderers before studying them, but if you difference a tethered pair you cut the rope and lose the very thing you wanted to see. The figure below has the rope's tightness as a slider: at zero the two series float apart freely, and as you tighten it the gap between them settles into a stationary band even while each series keeps wandering.

Everything so far has modeled the conditional mean, where the series is expected to go. A final movement turns to the conditional variance, how uncertain that expectation is, and finds that uncertainty itself has a predictable rhythm.

23.7 ARCH / GARCH Volatility Modeling

Look at a long series of daily stock returns and one feature jumps out before any model is fit: the turbulence comes in clumps. Quiet stretches of small moves are interrupted by stormy stretches of large moves, and the storms persist for days or weeks before calm returns. The returns themselves are close to unpredictable, as efficient markets would lead you to expect, but their size is not. Large moves cluster with large moves. This is the stylized fact that the conditional-variance models were built to capture, and it sits orthogonal to everything in §23.1 through §23.6, which modeled the conditional mean.
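The split between unpredictable direction and predictable size shows up in a simple diagnostic: the autocorrelation of returns versus the autocorrelation of squared returns. The sketch below simulates returns from a GARCH(1,1) recursion (the one-lag case of Eq. 23.14) with hypothetical parameters and an arbitrary seed, since no dataset accompanies this section, and compares the two lag-1 autocorrelations.

```python
import math
import random

def simulate_garch11(n, omega, alpha, beta, rng):
    """Returns e_t = sigma_t * z_t with sigma_t^2 = omega + alpha*e_{t-1}^2 + beta*sigma_{t-1}^2."""
    var = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    returns = []
    for _ in range(n):
        e = math.sqrt(var) * rng.gauss(0.0, 1.0)
        returns.append(e)
        var = omega + alpha * e * e + beta * var  # next period's conditional variance
    return returns

def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    m = sum(x) / n
    den = sum((v - m) ** 2 for v in x)
    num = sum((x[t] - m) * (x[t - lag] - m) for t in range(lag, n))
    return num / den

# Hypothetical parameters with persistence alpha + beta = 0.95.
returns = simulate_garch11(20000, omega=0.05, alpha=0.10, beta=0.85, rng=random.Random(7))
squared = [r * r for r in returns]

ac_returns = autocorr(returns, 1)
ac_squared = autocorr(squared, 1)
print("lag-1 autocorrelation of returns:        ", round(ac_returns, 3))
print("lag-1 autocorrelation of squared returns:", round(ac_squared, 3))
```

The returns themselves should show an autocorrelation near zero, while the squared returns show a clearly positive one. On real data the same comparison is a common first check for ARCH effects before any volatility model is fit.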

The GARCH(1,1) special case of Eq. 23.14 is $\sigma_t^2 = \omega + \alpha_1\varepsilon_{t-1}^2 + \beta_1\sigma_{t-1}^2$. In the GARCH(1,1), the sum $\alpha_1+\beta_1$ measures persistence: it governs how slowly a volatility shock decays, and the unconditional variance $\omega/(1-\alpha_1-\beta_1)$ is finite only when $\alpha_1+\beta_1<1$. Estimated persistence on daily equity data is routinely above $0.95$, meaning a spike in volatility takes a long time to subside: the empirical content of "turbulence lingers."
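Persistence translates directly into a variance forecast. For a GARCH(1,1) with $\alpha_1+\beta_1<1$, the expected conditional variance reverts to the unconditional level geometrically, $E_t[\sigma^2_{t+h}] = \bar\sigma^2 + (\alpha_1+\beta_1)^{h-1}(\sigma^2_{t+1}-\bar\sigma^2)$ with $\bar\sigma^2=\omega/(1-\alpha_1-\beta_1)$. The sketch below evaluates that formula; the parameter values and the starting variance are illustrative assumptions.

```python
import math

def garch11_summary(omega, alpha, beta):
    """Return (persistence, unconditional variance, half-life of a variance shock)."""
    persistence = alpha + beta
    if persistence >= 1.0:
        return persistence, None, None  # unconditional variance is not finite
    uncond_var = omega / (1.0 - persistence)
    half_life = math.log(0.5) / math.log(persistence)
    return persistence, uncond_var, half_life

def variance_forecast(sigma2_next, omega, alpha, beta, horizon):
    """E[sigma^2_{t+h}] for h = 1..horizon, given next period's variance sigma2_next."""
    persistence = alpha + beta
    uncond = omega / (1.0 - persistence)
    return [uncond + persistence ** (h - 1) * (sigma2_next - uncond)
            for h in range(1, horizon + 1)]

omega, alpha, beta = 0.05, 0.10, 0.85  # hypothetical parameters
persistence, uncond_var, half_life = garch11_summary(omega, alpha, beta)
path = variance_forecast(sigma2_next=4.0, omega=omega, alpha=alpha, beta=beta, horizon=100)

print("persistence:", round(persistence, 4))
print("unconditional variance:", round(uncond_var, 4))
print("half-life of a variance shock (periods):", round(half_life, 2))
print("forecast at h = 1, 10, 50, 100:", [round(path[h - 1], 3) for h in (1, 10, 50, 100)])
```

Starting from a variance four times its long-run level, the forecast is still visibly elevated dozens of periods later. A persistence closer to one lengthens the half-life sharply, and at or above one the unconditional variance is no longer finite.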

Intuition mode

Why this matters: You cannot predict whether tomorrow's return is up or down; if you could, the trade would already be made and the prediction erased. But you can predict whether tomorrow will be a calm day or a wild one, because calm and wild cluster: a stormy market today tells you the next few days are likely to be stormy too, even though it tells you nothing about which direction the moves will go. That split, where the direction of a surprise is unforecastable but the size of a surprise is forecastable, is the entire idea behind these models. A big move today feeds into a bigger expected swing tomorrow; when that feedback is strong, storms last a long time once they start. The figure below has a persistence dial: turn it low and the series looks like featureless noise; turn it toward one and watch calm and turbulent stretches organize themselves into long swells, no equation required to see it happen.

The basic GARCH(1,1) has spawned a family of refinements, each fixing a limitation. EGARCH models the logarithm of the variance, so it allows the "leverage effect," where bad news raises volatility more than equally large good news, and needs no non-negativity constraints. GARCH-in-mean (GARCH-M) lets the conditional variance enter the return equation directly, formalizing the idea that investors demand higher expected returns when risk is high. These extensions are named here as the working vocabulary of empirical finance, not derived. Beyond them lies a frontier (Bayesian estimation of high-dimensional VARs, machine-learning approaches to volatility and prediction) that is outside the scope of a first course but worth knowing exists.

Conclusion

Stationarity is the precondition. Standard time-series inference assumes covariance stationarity: a stable mean, variance, and autocovariance structure. The lag operator compacts the algebra; the Wold decomposition guarantees every covariance-stationary series is a filtered stream of white-noise shocks plus a perfectly predictable component.
ARMA models and their fingerprints. AR processes carry their own past forward; MA processes carry the echoes of past shocks. The ACF and PACF diagnose them (PACF cuts off for AR, ACF cuts off for MA), which is the Box-Jenkins identification rule.
Unit roots break the toolkit. A random walk has a unit root: shocks are permanent, the series wanders, and two independent wanderers regress spuriously (high $R^2$, significant $t$). The ADF test detects a unit root; differencing is the fix.
The VAR models the system. Each variable on lags of all variables gives a joint forecast and a Granger-causality test, but Granger causality is predictive, not structural, and the reduced-form shocks are correlated.
Structural identification is the pivot. Recovering economic shocks from reduced-form residuals needs a restriction the data cannot supply. Cholesky ordering, long-run restrictions, and sign restrictions each impose one, and the choice changes the impulse response. This is the Sims atheoretical-VAR tension.
Cointegration recovers long-run relationships. I(1) series can share a stationary linear combination, a long-run equilibrium that differencing would destroy. The error-correction model measures the speed of return, and the Granger representation theorem makes cointegration and ECM equivalent.
ARCH/GARCH model time-varying risk. Volatility clusters: the size of returns is forecastable even when their direction is not. GARCH makes conditional variance depend on past squared shocks and past variance; persistence near 1 means turbulence lingers.

Key formulas

Label	Equation	Description
Eq. 23.1	$E[y_t]=\mu,\ \mathrm{Var}(y_t)=\sigma^2,\ \mathrm{Cov}(y_t,y_{t-k})=\gamma_k$	Covariance-stationarity conditions
Eq. 23.2	$y_t = \sum_{j=0}^{\infty}\psi_j\varepsilon_{t-j} + \kappa_t$	Wold decomposition
Eq. 23.3	$y_t = c + \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + \varepsilon_t$	AR(p)
Eq. 23.4	$y_t = \mu + \varepsilon_t + \theta_1\varepsilon_{t-1} + \cdots + \theta_q\varepsilon_{t-q}$	MA(q)
Eq. 23.5	$\phi(L)y_t = \theta(L)\varepsilon_t$	ARMA in lag-operator form
Eq. 23.6	$y_t = y_{t-1} + \varepsilon_t$ (with drift $+\,\delta$)	Random walk
Eq. 23.7	$\Delta y_t = \alpha + \gamma y_{t-1} + \sum_{i=1}^{k}\delta_i\Delta y_{t-i} + \varepsilon_t;\ H_0:\gamma=0$	Augmented Dickey-Fuller regression
Eq. 23.8	$\mathbf{y}_t = \mathbf{c} + A_1\mathbf{y}_{t-1} + \cdots + A_p\mathbf{y}_{t-p} + \mathbf{u}_t$	Reduced-form VAR(p)
Eq. 23.9	$\mathbf{u}_t = B\boldsymbol{\varepsilon}_t,\ E[\boldsymbol{\varepsilon}_t\boldsymbol{\varepsilon}_t']=I$	SVAR mapping (residuals to structural shocks)
Eq. 23.10	$\Theta_h = A^h B$	Structural impulse response at horizon $h$
Eq. 23.11	$z_t = y_t - \beta x_t \sim I(0)$ when $y_t, x_t \sim I(1)$	Cointegrating relation
Eq. 23.12	$\Delta y_t = \lambda(y_{t-1}-\beta x_{t-1}) + \cdots + \varepsilon_t$	Error-correction model
Eq. 23.13	$\sigma_t^2 = \omega + \sum_{i=1}^{q}\alpha_i\varepsilon_{t-i}^2$	ARCH(q) conditional variance
Eq. 23.14	$\sigma_t^2 = \omega + \sum_{i=1}^{q}\alpha_i\varepsilon_{t-i}^2 + \sum_{j=1}^{p}\beta_j\sigma_{t-j}^2$	GARCH(p,q) conditional variance

Basic exercises

You are shown an ACF that decays geometrically and a PACF that spikes at lag 1 and is essentially zero thereafter. Identify the process (AR, MA, or ARMA) and its order, and state the Box-Jenkins rule you used.
For each series, classify it as AR or MA from its diagnostic plots: (a) ACF cuts off after lag 2, PACF decays; (b) PACF cuts off after lag 1, ACF decays. State the order in each case.
A series must be differenced twice before its ACF and an ADF test indicate stationarity; its first difference still shows a near-unit-root ACF. State its order of integration and justify the classification.
A Granger-causality table reports that the policy rate Granger-causes inflation ($p<0.01$) but inflation does not Granger-cause the rate ($p=0.4$). Interpret the result, and state precisely what it does and does not establish about causation.

Applied exercises

Write down the ADF regression for testing whether log real GDP has a unit root. State the null hypothesis in terms of the coefficient on the lagged level, explain why ordinary $t$-critical values are inappropriate, and explain the spurious-regression risk that motivates running the test before regressing GDP on another trending series.
Specify a two-variable VAR(1) in output growth and inflation. Write the two equations explicitly, state what regressors each contains, and explain in words what an impulse response of inflation to an output-growth shock would show.
In the monetary VAR of Figure 23.4, explain why the Cholesky ordering matters. State the contemporaneous restriction each ordering imposes, and explain why reordering the variables changes the impulse responses even though the estimated reduced form is unchanged.
A GARCH(1,1) fit to daily returns gives $\alpha_1 = 0.08$, $\beta_1 = 0.90$. Compute the persistence, state whether the unconditional variance is finite, and interpret what the persistence value implies about how long a volatility spike lasts.

Challenge problems

Using the lag operator, derive the MA($\infty$) representation of a stationary AR(1), $y_t = \phi y_{t-1} + \varepsilon_t$ with $|\phi|<1$. Show that the moving-average weights are $\psi_j = \phi^j$, and explain what the condition $|\phi|<1$ guarantees about the weights and the variance.
Sketch the Granger-Newbold logic for why OLS on two independent random walks produces spurious significance. Explain why the regression residuals are non-stationary under the null of no relationship, and why that invalidates the usual $t$-distribution for the slope coefficient.
State the Granger representation theorem's equivalence (cointegration if and only if an ECM representation exists) in words. Then, given a cointegrating vector implying the long-run relationship $y = 1.5\,x$, write the error-correction model for $\Delta y_t$, label the error-correction term, and interpret the sign and magnitude of its coefficient.
Contrast Sims's skepticism about structural identification with a defensible structural-identification scheme (e.g. a Cholesky ordering justified by an assumption that monetary policy responds to inflation only with a lag). State what each approach buys and what each gives up, and explain why the question of which is right is one this material poses rather than answers.

Chapter 23: Time-Series Econometrics
