How Many Equations Should There Be in Macroeconomic Models?

Recently a very old debate among macroeconomists has been reopened (this happens from time to time). Paul Romer decided to discuss a key conference held in 1978 (yes, really). Some (including me) think that’s about when the profession took a wrong turn, largely by following Robert Lucas. But in the discussion until about yesterday, it was agreed that macroeconomics was in a bad way in 1978 and needed to change. Romer particularly criticized a paper presented by Ray Fair at the conference.

This has provoked Ray Fair* to start blogging. I think it is quite important to read his post (so does Mark Thoma). Fair is very unusual, because he works at a university (some small place called Yale) yet he stuck with the approach started by Jan Tinbergen and developed especially by Jacob Marschak and colleagues at the Cowles Commission (then at U Chicago), which was criticized by Lucas. I will follow Fair by calling it the CC (for Cowles Commission) approach. Notably, the approach was never abandoned by working macroeconomists, including those at the Fed and those who sell forecasts to clients who care about forecast accuracy, not microfoundations.

Insert: This post is long. The punchline is that I think a promising approach would be to combine CC models with a pseudo-prior that a good model is not too far from a standard DSGE model. This is the sort of thing done with high-dimensional VARs using the so-called Minnesota prior.
end insert.

There are (at least) two key differences between the CC approach and models developed later. First, the old CC models did not assume rational expectations. This has been the focus of the discussion, especially as viewed by outsiders. But another difference is that the old models included many more variables and, therefore, many more equations than the newer ones. The model presented in 1978 had 97 equations. This post is about the second difference — I don’t believe it makes sense to assume rational expectations, but I won’t discuss that issue at all.

With his usual extreme courtesy, Simon Wren-Lewis noted advantages of the old approach and, as always, argued that both old and newer approaches are valuable and should be explored in parallel.

I have to admit that I don’t intend to ever work with a model with 97 separate equations (meaning 97 dependent variables). But I think that one fatal defect of current academic macroeconomics is that it has been decided to keep the number of equations down to roughly 7 (New Keynesian) or fewer (RBC).

I will start by discussing the costs of such parsimony.

1) One feature of pre-2008 DSGE models which, it is agreed, just won’t do is that they assumed there was only one interest rate. In fact there are thousands. The difference between the return on Treasury bills and junk corporate bonds was one of the details which was ignored. The profession’s response to 2008 has been to focus on risk premia and how they change (without necessarily insisting on an explanation which has anything to do with firm-level micro data). Here I think it is agreed that the pre-2008 approach was a very bad mistake.

2) As far as I know (and I don’t know as much as I should) a second omission has received much less attention. Standard DSGE models still contain no housing sector. So the profession is attempting to understand the Great Recession while ignoring housing completely. Here, in particular, the old view that monetary policy affects output principally through residential investment isn’t so much rejected as ignored (and quite possibly forgotten).

3) Similarly, there are no inventories in models which aim to match patterns in quarterly data. I teach using “Advanced Macroeconomics” by Romer (David, not Paul or Christina). He notes that a major component of the variance in detrended (or HP-filtered) output is variance in detrended inventory investment, then writes no more on the topic. He is about as far from Lucas as an academic macroeconomist (other than Fair) can be. Assuming no inventories when trying to model the business cycle is crazy.

4) In standard models, there is one sector. There is no discussion of the distinction between goods and services (except, now, financial services) or between capital goods and consumption goods. In particular, it is assumed that there are no systematic wage differentials such that a given worker would be pleased to move from the fast food sector to the automobile manufacturing sector. Again, the micro-econometric research is completely ignored.

5) A lot of standard academic DSGE models assume a closed economy.

6) No one thinks that the wage- and price-setting mechanisms assumed in either RBC or NK models are realistic. They are defended as convenient shortcuts.

7) It is assumed that there are no hiring or firing costs (or unions which object to layoffs). Similarly the assumptions about costs of adjusting capital are not ones that anyone considered until it was necessary to make them to reconcile the data with the assumption that managers act only to maximize shareholder value.

8) Oh yes, it is assumed that there are no principal-agent problems in firms.

9) It is assumed that markets are complete even though they obviously aren’t and general equilibrium theorists know the assumption is absolutely key to standard results.

10) It is assumed that there is a representative agent even though there obviously isn’t and general equilibrium theorists know the assumption makes a huge gigantic difference.

This means that most of the topics which were the focus of old business cycle research are ignored, as are most post-1973 developments in microeconomics.

Before going on, I have to note that when each of these assumptions is criticized, special-purpose models which relax the extreme assumptions are mentioned (sometimes they are developed after the criticism). But policy is discussed using the standard models. The assumptions are immune to evidence, because no one claims they are true, yet their implications are taken very seriously.

What benefit could possibly be worth such choices? That is, what is wrong with a macroeconomic model with too many equations? One problem is that complicated models are hard to understand and don’t clarify thought. This was once a strong argument, but it is no longer possible to intuitively grasp current DSGE models either, so parsimony has stopped buying clarity.

One reason to fear many equations is the experience of working with atheoretic vector autoregression (VAR) models, which were developed in parallel with DSGE. In VARs the number of parameters to be estimated is proportional to the square of the number of equations, while the number of new observations of dependent variables per period is only equal to the number of equations. More equations can therefore imply more parameters than data points. Even short of that, large VAR models are over-parameterized: they fit excellently and forecast terribly. Seven equations are clearly too many; a 97-equation VAR just couldn’t be estimated. The CC approach relied on imposing many restrictions on the data based on common sense. A 97-equation DSGE model is, in principle, possible, but ideas about which simplifying assumptions should be made are, I think, based in large part on the assumptions which must be made to estimate a VAR.
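To make the arithmetic concrete, here is a back-of-the-envelope count. The lag length of four and the fifty years of quarterly data are my assumptions for illustration, not anything taken from Fair’s model:

```python
# Rough coefficient counts for an unrestricted VAR(p) with k equations.
# Each equation has an intercept plus k*p lag coefficients, so the
# system has k*(k*p + 1) coefficients (ignoring the error covariance).

def var_param_count(k, p=4):
    """Number of mean-equation coefficients in a k-variable VAR(p)."""
    return k * (k * p + 1)

T = 200  # assumed sample: roughly fifty years of quarterly data
for k in (7, 97):
    n_params = var_param_count(k)
    n_obs = k * T  # one observation of each dependent variable per quarter
    print(f"k={k:3d}: {n_params:6d} coefficients vs {n_obs:6d} data points")

# k=  7:    203 coefficients vs   1400 data points
# k= 97:  37733 coefficients vs  19400 data points
```

With 97 equations the unrestricted coefficient count is roughly double the number of data points, which is the sense in which such a VAR just couldn’t be estimated.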

If there are many dependent variables, but each is explained by an ordinary number of independent variables, each of which is instrumented by a credible instrument, then there shouldn’t be a problem with over-fitting. The fact that other equations are estimated somewhere else in the model does not cause a spuriously good fit for an equation which doesn’t include too many parameters itself.

However, there is another cost of estimating a lot of parameters. Parameter estimation error makes forecasts worse at the same time as it makes the in-sample fit better. In the simplest cases, these two problems cause identical gaps between the in-sample fit and the out-of-sample forecast. The second problem is absolutely not eliminated by making sure each equation is well identified.
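One textbook version of what I have in mind (my gloss, assuming a correctly specified linear regression with $k$ coefficients, $n$ observations, and i.i.d. errors with variance $\sigma^2$):

$$
E\left[\text{in-sample MSE}\right] \approx \sigma^2\left(1 - \frac{k}{n}\right),
\qquad
E\left[\text{out-of-sample MSE}\right] \approx \sigma^2\left(1 + \frac{k}{n}\right).
$$

Estimation error subtracts roughly $k\sigma^2/n$ from the apparent error and adds the same amount to the forecast error, so the total gap of $2k\sigma^2/n$ grows with the number of estimated parameters, no matter how well identified each equation is.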

But there is a standard approach to dealing with it. Instead of imposing a restriction that some parameter is zero, one can use a weighted average of the estimated parameter and zero. This is a Stein-type pseudo-Bayesian estimator.

I will give two examples. In the now standard approach, it is assumed that residential investment is always exactly proportional to non-residential investment. In the old approach, residential and non-residential investment were considered separately. In the pseudo-Bayesian approach, one can estimate an equation for the growth of log total investment, estimate an equation for the growth of log residential investment minus the growth of log total investment, and then multiply the coefficients of the second equation by a constant less than one (a toy sketch follows the next example).

In the other example, one can assume that inventory investment is zero (as in standard DSGE models) or estimate net inventory investment as a function of other variables. Adding half of the fitted net inventory investment to the standard DSGE model might give better forecasts than either the now fashionable or the old-fashioned model.
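Here is a minimal sketch of the mechanics for the inventory example (the residential investment case works the same way). Everything in it is invented for illustration: the simulated data, the stand-in regressors, and the shrinkage weight of one half are my assumptions, not anything Fair or anyone else actually estimates.

```python
import numpy as np

# Toy illustration of Stein-type shrinkage toward the DSGE restriction
# that net inventory investment is zero. All data are simulated;
# nothing comes from actual national accounts.

rng = np.random.default_rng(0)
T = 120                                    # invented sample size (quarters)
X = np.column_stack([np.ones(T),           # constant
                     rng.normal(size=T),   # stand-in for lagged output growth
                     rng.normal(size=T)])  # stand-in for lagged sales growth
beta_true = np.array([0.1, 0.5, -0.3])
y = X @ beta_true + rng.normal(scale=0.5, size=T)   # net inventory investment

# Unrestricted OLS estimate of the auxiliary inventory equation.
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# Pseudo-Bayesian compromise: shrink halfway toward the DSGE value (zero).
shrink = 0.5                               # assumed weight, not optimized
beta_shrunk = shrink * beta_ols            # weighted average of OLS and zero

inventory_forecast = X @ beta_shrunk       # added back to the standard model's forecast
```

The point is only that the restricted (DSGE) value and the unrestricted (CC-style) estimate are treated as two ends of a dial rather than as an all-or-nothing choice.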

This is the standard approach used with high-dimensional VARs. I see no reason why it couldn’t be applied to CC models.
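For reference, here is a minimal sketch of the Minnesota-prior idea as it is used for large VARs: a ridge-style posterior mean that shrinks each equation toward a simple prior (own first lag near one, everything else near zero). The function name, lag length, and tightness parameter are mine, and the code is a simplified stand-in for the full Litterman specification, which also scales the shrinkage by lag and by relative residual variances.

```python
import numpy as np

def minnesota_style_estimate(Y, p=2, lam=0.2):
    """Ridge-style sketch of a Minnesota prior for a VAR(p).

    Y is a (T, k) array. Each equation is shrunk toward a prior mean of 1
    on its own first lag and 0 on everything else, with overall tightness
    lam (smaller lam = tighter prior).
    """
    T, k = Y.shape
    # Regressor matrix: [1, Y_{t-1}, ..., Y_{t-p}] for t = p, ..., T-1.
    X = np.column_stack([np.ones(T - p)] +
                        [Y[p - j - 1:T - j - 1] for j in range(p)])
    Yt = Y[p:]
    m = X.shape[1]
    B = np.zeros((m, k))
    for i in range(k):
        prior_mean = np.zeros(m)
        prior_mean[1 + i] = 1.0            # random-walk prior on own first lag
        # Posterior mean under a normal prior is a ridge-type estimator.
        A = X.T @ X + (1.0 / lam**2) * np.eye(m)
        b = X.T @ Yt[:, i] + (1.0 / lam**2) * prior_mean
        B[:, i] = np.linalg.solve(A, b)
    return B
```

What I am suggesting is the same trick with a different center: instead of shrinking a big VAR toward a random walk, shrink a big CC-style model toward the restrictions of a standard DSGE model.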

I see Wren-Lewis has a new post which I must read before typing more (I have read it, and I type the same old same old, so you probably don’t want to click “read more”).

In the new post Wren-Lewis says he largely agrees with Fair. He favors rational expectations and thinks that DSGE models with bells and whistles should be explored along with CC models. Those are the topics I am trying to avoid in this post. I note only that Wren-Lewis states this view but presents no evidence or argument to support it (and, as far as I know, never has presented anything beyond “very thin gruel”).

Wren-Lewis also quoted a very clear expression of the opposing view:

What about the claim that only internally consistent DSGE models can give reliable policy advice? For another project, I have been rereading an AEJ Macro paper written in 2008 by Chari et al, where they argue that New Keynesian models are not yet useful for policy analysis because they are not properly microfounded. They write “One tradition, which we prefer, is to keep the model very simple, keep the number of parameters small and well-motivated by micro facts, and put up with the reality that such a model neither can nor should fit most aspects of the data. Such a model can still be very useful in clarifying how to think about policy.” That is where you end up if you take a purist view about internal consistency, the Lucas critique and all that. It in essence amounts to the following approach: if I cannot understand something, it is best to assume it does not exist.

Here I note that “the Lucas critique” (which, as Lucas noted, was presented by Marschak long before Lucas) does not imply that it is OK to use models which don’t fit most aspects of the data. Here the actually relevant word is “Lucas”. As usual, Noah Smith put it much better than I did, so you probably want to read his post, not mine.

I also assert that existing DSGE models (both RBC and New Keynesian) are not motivated by “micro facts” and are, instead, inconsistent with micro data. I think the correct way of putting it is “consistent with the microeconomic assumptions which were standard in the 1970s.”

Update: To be more specific, I don’t think the parameters used in standard macro models are based on micro data at all. I don’t think that standard assumptions about the intertemporal elasticity of substitution of consumption are based on micro data — I think the assumed elasticity is much larger than any estimated with any data, because this is needed for the model to fit the macro time series. I am quite sure that the elasticity of substitution between consumption and leisure is not based on micro data. Data-based estimates for men and for women are very different. The standard macro models have genderless agents who live alone (or in households in which there is no conflict or even negotiation). This model of the unit which supplies labor and consumes is not firmly based on micro data. Standard models imply short-run elasticities of labor supply about 6 times as large as those estimated with micro data (at least 3, not at most 0.5).

I note that different parameters for the same utility function are standard in growth theory and in business cycle DSGE models. It can’t be that both are based on micro data.

I must admit again that I don’t keep up with the literature as much as I should, but I recall many macroeconomic papers which start with a semi-motivated prior and then estimate parameters with macro data (say, the first and last time I heard Prescott speak, he said something along the lines of “parameters chosen to fit US business cycle data”). I thought the pretense that parameters were based on micro data had been abandoned long ago.

end update.

I have never seen any reason to hope that the approach favored by Chari et al. is “useful” for thinking about policy. I think that the results of efforts to think about policy using this approach strongly support my initial belief that it was misguided.