Note to Reinhart/Rogoff (et al.): The Cause Usually Precedes the Effect

Or: Thinking About Periods and Lags

No need to rehash this cock-up, except to point to the utterly definitive takedown by Arindrajit Dube over at Next New Deal (hat tip: Krugman), and to point out that the takedown might just take even if you’re looking at R&R’s original, skewed data.

But a larger point: I frequently see econometrics like R&R’s, comparing Year t to Year t and suggesting (usually only implicitly, or with ever so many caveats and disqualifiers) that it demonstrates some kind of causation. I.e., GDP growth in 1989 vs. debt in 1989, ’90 vs. ’90, etc.

Haven’t they heard of looking at lags, and at multiple lags and periods? It’s the most elementary and obvious method (though obviously not definitive or dispositive) for trying to tease out causation. Because cause really does almost always precede effect. Time doesn’t run backwards. (Unless you believe, like many economists, that people, populations: 1. form both confident and accurate expectations about future macro variables, 2. fully understand the present implications of those expectations, and 3. act “rationally” — as a Platonic economist would — based on that understanding.)

By this standard of propter hoc analysis, R&R’s paper shows less analytical rigor than many posts by amateur internet econocranks. (Oui, comme moi.) This is a paper by top Harvard economists, and they didn’t use the most elementary analytical techniques used by real growth econometricians, and even by rank amateurs who are doing their first tentative stabs at understanding the data out there.

Here’s one example looking at multiple periods and multiple lags, comparing European growth to U.S. growth.

This doesn’t show the correlations between growth and various imagined causes for the periods (tax levels, debt levels, etc.) — just the difference, EU vs. US, in real annualized growth. You have to do the correlations in your head, knowing, for instance, that the U.S. over this period taxed about 28% of GDP, while European countries taxed 30–50%, averaging about 40%.

But it does show the way to analyzing those correlations (and possible causalities), by looking at multiple periods and multiple lags. (I’d love to see multiple tables like this populated with correlation coefficients for different “causes.”)
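Here is a rough sketch of what one such table could look like in code: a grid of correlation coefficients, one cell per period length and lag. Everything in it is an assumption for illustration. The function names are made up, the windows overlap (a real analysis might prefer non-overlapping periods), and the data is synthetic, built so that growth responds to the "cause" two years later. It is not real tax, debt, or growth data.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def period_lag_correlation(cause, growth, period, lag):
    """Correlate the mean of `cause` over a window of `period` years with the
    mean of `growth` over a same-length window starting `lag` years later."""
    xs, ys = [], []
    last_start = len(cause) - period - lag
    for start in range(last_start + 1):
        xs.append(sum(cause[start:start + period]) / period)
        ys.append(sum(growth[start + lag:start + lag + period]) / period)
    return pearson(xs, ys)


def correlation_grid(cause, growth, periods, lags):
    """Return {(period, lag): r} for every requested period length and lag."""
    return {(p, k): period_lag_correlation(cause, growth, p, k)
            for p in periods for k in lags}


def make_toy_series(n_years=200, true_lag=2, seed=0):
    """Synthetic data (NOT real figures): growth responds negatively to the
    cause `true_lag` years earlier, plus a little noise."""
    rng = random.Random(seed)
    raw = [rng.random() for _ in range(n_years + true_lag)]
    cause = raw[true_lag:]  # cause[i] == raw[i + true_lag]
    growth = [-raw[t] + 0.1 * (rng.random() - 0.5) for t in range(n_years)]
    return cause, growth


cause, growth = make_toy_series()
periods, lags = [1, 5, 10], [0, 1, 2, 3]
grid = correlation_grid(cause, growth, periods, lags)

# Print the grid: one row per period length, one column per lag.
print("period " + "".join(f"lag {k:<6}" for k in lags))
for p in periods:
    print(f"{p:<7}" + "".join(f"{grid[(p, k)]:<+10.2f}" for k in lags))
```

On this toy data the one-year row should be strongly negative in the lag-2 column and close to zero at lag 0, which is the pattern a Year t vs. Year t comparison cannot see. Longer periods smear the lags together, so it helps to read the rows side by side.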

Dube tackles the lag issue for the R&R sample beautifully in his analysis. In particular, he looks at both positive and negative lags. So, where do we see more correlation:

A. between last year’s growth and this year’s debt, or

B. between last year’s debt and this year’s growth?

The answer is A. Current debt correlates more strongly with past growth than with future growth:

Figure 2:  Future and Past Growth Rates and Current Debt-to-GDP Ratio

(Also: if there’s any breakpoint for the growth effects of government debt, as suggested by R&R, it’s way below 90% of GDP. More like 30%.) See Dube’s addendum for a different version of these graphs, using another method to incorporate multiple lags.
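The A-versus-B question can be sketched as a sweep over negative and positive lags. This is a minimal illustration, not Dube's method (he works with the actual R&R sample and more careful regressions). The function names are hypothetical, and the toy data is deliberately built so that growth leads debt by one year, purely to show that the sweep can tell the two directions apart. It says nothing about the real numbers.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def lead_lag_correlation(debt, growth, k):
    """Correlation between debt in year t and growth in year t + k.

    k < 0 pairs current debt with PAST growth (question A);
    k > 0 pairs current debt with FUTURE growth (question B).
    """
    pairs = [(debt[t], growth[t + k])
             for t in range(len(debt)) if 0 <= t + k < len(growth)]
    xs, ys = zip(*pairs)
    return pearson(xs, ys)


def lead_lag_sweep(debt, growth, max_lag):
    """Return {k: r} for every k from -max_lag to +max_lag."""
    return {k: lead_lag_correlation(debt, growth, k)
            for k in range(-max_lag, max_lag + 1)}


def make_toy_debt_and_growth(n_years=200, seed=1):
    """Synthetic data (NOT the R&R sample): weak growth in year t - 1 shows
    up as higher debt in year t; debt has no effect on later growth."""
    rng = random.Random(seed)
    raw = [rng.random() for _ in range(n_years + 1)]
    growth = raw[1:]  # growth[i] == raw[i + 1]
    debt = [-raw[t] + 0.1 * (rng.random() - 0.5) for t in range(n_years)]
    return debt, growth


debt, growth = make_toy_debt_and_growth()
for k, r in lead_lag_sweep(debt, growth, max_lag=3).items():
    label = "past growth" if k < 0 else "future growth" if k > 0 else "same year"
    print(f"k = {k:+d} ({label}): r = {r:+.2f}")
```

Because the toy data runs from growth to debt, the k = -1 line should stand out while the positive-k lines hover near zero. On real data, the interesting part is which side of zero carries the stronger correlations.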

Here’s what I’d really like to see: analysis like Dube’s using as its inputs many tables like the one above, each populated with correlations for a different presumed cause (“instrumental variable”). Combine that with Xavier Sala-i-Martin’s technique in his paper, “I Just Ran Two Million Regressions.”

That paper looks at fifty-nine different possible causes of growth/instrumental variables (not including government debt/GDP ratio) in every possible combination, to figure out which ones might deliver robust correlations. I’m suggesting combining that with multiple periods and lags for each instrumental variable. IOW, “I just ran 4.2 billion regressions.” Not sure if we’ve got the horsepower yet, but…
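A back-of-the-envelope count shows how fast this blows up. The sketch below assumes, as I recall the paper's setup, that each of the 59 candidate variables is tested alongside every trio drawn from the other 58. The period and lag counts are hypothetical placeholders, not a proposal, and the function names are made up for illustration.

```python
import math


def base_regressions(n_candidates, combo_size):
    """Each candidate is tested once alongside every combination of
    `combo_size` variables drawn from the remaining candidates."""
    return n_candidates * math.comb(n_candidates - 1, combo_size)


def total_regressions(n_candidates, combo_size, n_periods, n_lags):
    """Repeat the whole exercise for every period length and every lag."""
    return base_regressions(n_candidates, combo_size) * n_periods * n_lags


# 59 candidates, trios from the other 58: 59 * C(58, 3)
print(base_regressions(59, 3))  # 1820504, i.e. roughly two million

# Hypothetical: 5 period lengths and 6 lags multiply the count by 30.
print(total_regressions(59, 3, n_periods=5, n_lags=6))
```

The count scales linearly in the number of periods and lags, so getting into the billions takes on the order of a couple thousand period/lag combinations per variable set.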

Cross-posted at Asymptosis.