Recently a very old debate among macroeconomists has been reopened (this happens from time to time). Paul Romer decided to discuss a key conference held in 1978 (yes really). Some (including me) think that’s about when the profession took a wrong turn largely following Robert Lucas. But in the discussion until about yesterday, it was agreed that macroeconomics was in a bad way in 1978 and needed to change. Romer particularly criticized a paper presented by Ray Fair at the conference.
This has provoked Ray Fair* to start blogging. I think it is quite important to read his post (so does Mark Thoma). Fair is very unusual, because he works at a university (some small place called Yale) yet he stuck with the approach started by Jan Tinbergen and especially by Jacob Marschak and colleagues at the Cowles Commission (then at U Chicago) which was criticized by Lucas. I will follow Fair by calling it the CC (for Cowles Commission) approach. Notably, the approach was never abandoned by working macroeconomists, including those at the Fed and those who sell forecasts to clients who care about forecast accuracy, not microfoundations.
Insert: This post is long. The punchline is that I think that a promising approach would be to combine CC models with a pseudo prior that a good model is not too far from a standard DSGE model. This is the sort of thing done with high dimensional VARs using the so called Minnesota prior.
There are (at least) two key differences between the CC approach and models developed later. First, the old CC models did not assume rational expectations. This has been the focus of the discussion, especially as viewed by outsiders. But another difference is that the old models included many more variables and, therefore, many more equations than the newer ones. The model presented in 1978 had 97 equations. This post is about the second difference. I don’t believe it makes sense to assume rational expectations, but I won’t discuss that issue at all.
I have to admit that I don’t intend to ever work with a model with 97 separate equations (meaning 97 dependent variables). But I think that one fatal defect of current academic macroeconomics is that it has been decided to keep the number of equations down to roughly 7 (New Keynesian) or fewer (RBC).
I will start by discussing the costs of such parsimony.
1) One feature of pre 2008 DSGE models which, it is agreed, just won’t do is that they assumed there was only one interest rate. In fact there are thousands. The difference between the return on Treasury bills and junk corporate bonds was one of the details which was ignored. The profession’s response to 2008 has been to focus on risk premia and how they change (without necessarily insisting on an explanation which has anything to do with firm level micro data). Here I think it is agreed that the pre 2008 approach was a very bad mistake.
2) As far as I know (and I don’t know as much as I should) a second omission has received much less attention. Standard DSGE models still contain no housing sector. So the profession is attempting to understand the great recession while ignoring housing completely. Here, in particular, the old view that monetary policy affects output principally through residential investment isn’t so much rejected as ignored (and quite possibly forgotten).
3) Similarly there are no inventories in models which aim to match patterns in quarterly data. I teach using “Advanced Macroeconomics” by Romer (David not Paul or Christine). He notes that a major component of the variance in detrended (or HP filtered) output is variance in detrended inventory investment, then writes no more on the topic. He is about as far from Lucas as an academic macro-economist (other than Fair) can be. Assuming no inventories when trying to model the business cycle is crazy.
4) In standard models, there is one sector. There is no discussion of the distinction between goods and services (except now financial services) or between capital goods and consumption goods. In particular it is assumed that there are no systematic wage differentials such that a given worker would be pleased to move from the fast food sector to the automobile manufacturing sector. Again the micro-econometric research is completely ignored.
5) A lot of standard academic DSGE models assume a closed economy.
6) No one thinks that the wage and price setting mechanisms assumed in either RBC or NK models are realistic. They are defended as convenient short cuts.
7) It is assumed that there are no hiring or firing costs (or unions which object to layoffs). Similarly the assumptions about costs of adjusting capital are not ones that anyone considered until it was necessary to make them to reconcile the data with the assumption that managers act only to maximize shareholder value.
8) Oh yes it is assumed that there are no principal agent problems in firms.
9) It is assumed that markets are complete even though they obviously aren’t and general equilibrium theorists know the assumption is absolutely key to standard results.
10) It is assumed that there is a representative agent even though there obviously isn’t and general equilibrium theorists know the assumption makes a huge gigantic difference.
This means that most of the topics which were the focus of old business cycle research are ignored, as are most post 1973 developments in microeconomics.
Before going on, I have to note that when each of these assumptions is criticized, special purpose models which relax the extreme assumptions are mentioned (sometimes they are developed after the criticism). But policy is discussed using the standard models. The assumptions are immune to evidence, because no one claims they are true yet their implications are taken very seriously.
What benefit could possibly be worth such choices? That is, what is wrong with a macroeconomic model with too many equations? One problem is that complicated models are hard to understand and don’t clarify thought. This was once a strong argument, but it is not possible to intuitively grasp current DSGE models either.
One reason to fear many equations is the experience of working with atheoretic vector autoregression (VAR) models, which were developed in parallel with DSGE. In VARs the number of parameters to be estimated is proportional to the square of the number of equations. The number of observations of dependent variables is only proportional to the number of equations. More equations can imply more parameters than data points. Even short of that, large VAR models are over-parametrized: they fit excellently and forecast terribly. For a VAR, 7 equations are clearly too many. A 97 equation VAR just couldn’t be estimated. The CC approach relied on imposing many restrictions on the data based on common sense. A 97 equation DSGE model is, in principle, possible, but ideas about simplifying assumptions which should be made are, I think, based in large part on the assumptions which must be made to estimate a VAR.
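The counting argument for VARs can be made concrete with a few lines of Python. This is only a sketch under my own assumptions: an unrestricted VAR with an intercept in every equation, 4 lags, and 120 quarterly observations (30 years). The function names and those numbers are illustrative, not taken from any particular model.

```python
def var_parameter_count(n_equations, n_lags, intercept=True):
    """Coefficients in an unrestricted VAR: every equation has
    n_equations * n_lags slope coefficients, plus an optional intercept."""
    per_equation = n_equations * n_lags + (1 if intercept else 0)
    return n_equations * per_equation


def var_observation_count(n_equations, n_periods):
    """Observations of dependent variables: one per equation per period."""
    return n_equations * n_periods


# Assumed setup (hypothetical): 4 lags, 120 quarters of data.
for n in (7, 97):
    params = var_parameter_count(n, n_lags=4)
    obs = var_observation_count(n, n_periods=120)
    print(n, "equations:", params, "parameters,", obs, "observations")
```

Under those assumptions the 7 equation VAR has 203 parameters against 840 observations, while the 97 equation VAR has 37733 parameters against 11640 observations, so it has more parameters than data points.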
If there are many dependent variables but each is explained by an ordinary number of independent variables, each of which is instrumented by a credible instrument, then there shouldn’t be a problem with over-fitting. The fact that somewhere else in the model other equations are estimated does not cause a spuriously good fit for an equation which doesn’t include too many parameters itself.
However, there is another cost of estimating a lot of parameters. The parameter estimation error makes forecasts worse at the same time it makes the in sample fit better. In the simplest cases, these two problems cause identical gaps between the in sample fit and the out of sample forecast. The second problem is absolutely not eliminated by making sure each equation is well identified.
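The claim about identical gaps can be checked in the textbook case. What follows is a sketch under assumptions that are mine: a single equation y = Xβ + ε estimated by OLS with k regressors and n observations, fixed regressors, errors with mean zero and variance σ² that are uncorrelated across observations, and a forecast of y* = Xβ + ε* with fresh errors ε* at the same X.

```latex
\begin{align*}
\hat y &= Hy, \qquad H = X(X'X)^{-1}X', \qquad \operatorname{tr}(H) = k \\
\text{in sample:}\quad
\frac{1}{n}\,E\,\lVert y - \hat y \rVert^2
  &= \frac{1}{n}\,E\,\varepsilon'(I - H)\varepsilon
   = \sigma^2\Bigl(1 - \frac{k}{n}\Bigr) \\
\text{out of sample:}\quad
\frac{1}{n}\,E\,\lVert y^* - \hat y \rVert^2
  &= \frac{1}{n}\,E\,\lVert \varepsilon^* - H\varepsilon \rVert^2
   = \sigma^2\Bigl(1 + \frac{k}{n}\Bigr)
\end{align*}
```

In this simple case the in sample fit looks better than the truth by σ²k/n and the forecast is worse than the truth by the same σ²k/n. Both gaps depend only on the number of estimated parameters relative to the sample size.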
But there is a standard approach to dealing with it. Instead of imposing a restriction that some parameter is zero, one can use a weighted average of the estimated parameter and zero. This is a Stein type pseudo Bayesian estimator.
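A minimal sketch of the weighted average, in plain Python. The first function is just the arithmetic described above. The second is one well known way to pick the weight from the data, the positive part James-Stein rule; using it here is my assumption, and it relies on strong simplifications (at least three estimates, independent errors, and a common known error variance). All names are hypothetical.

```python
def shrink_toward_zero(estimates, weight):
    """Weighted average of each estimate and zero:
    weight * b + (1 - weight) * 0 for every estimate b."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [weight * b for b in estimates]


def positive_part_james_stein_weight(estimates, error_variance):
    """Data-driven weight 1 - (k - 2) * s2 / sum(b^2), floored at zero.

    Assumes k >= 3 estimates with independent errors and the same
    known error variance s2 (a strong simplification).
    """
    k = len(estimates)
    if k < 3:
        raise ValueError("James-Stein shrinkage needs at least 3 estimates")
    sum_of_squares = sum(b * b for b in estimates)
    if sum_of_squares == 0.0:
        return 0.0  # nothing to shrink
    return max(0.0, 1.0 - (k - 2) * error_variance / sum_of_squares)
```

A weight of one reproduces the unrestricted estimate and a weight of zero reproduces the restriction that the parameter is zero. Noisy estimates (a large error variance relative to their size) get a weight closer to zero.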
I will give two examples. In the now standard approach, it is assumed that residential investment is always exactly proportional to non residential investment. In the old approach residential and non residential investment were considered separately. In the pseudo Bayesian approach, one can estimate an equation for the growth of log total investment, estimate an equation for the growth of log residential minus the growth of log total investment, then multiply the coefficients of the second equation by a constant less than one.
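The residential investment example can be sketched as follows. I assume, for illustration only, that both equations are linear in the same list of regressors and have already been estimated; the function and argument names are hypothetical.

```python
def linear_prediction(coefficients, regressors):
    """Fitted value of a linear equation: sum of coefficient * regressor."""
    return sum(c * x for c, x in zip(coefficients, regressors))


def residential_growth_forecast(total_coefs, gap_coefs, regressors, shrink):
    """Forecast growth of log residential investment.

    total_coefs: estimated equation for growth of log total investment.
    gap_coefs:   estimated equation for growth of log residential minus
                 growth of log total investment.
    shrink:      constant in [0, 1] multiplying the gap coefficients.
    """
    if not 0.0 <= shrink <= 1.0:
        raise ValueError("shrink must lie in [0, 1]")
    shrunk_gap_coefs = [shrink * c for c in gap_coefs]
    total_growth = linear_prediction(total_coefs, regressors)
    gap_growth = linear_prediction(shrunk_gap_coefs, regressors)
    return total_growth + gap_growth
```

Setting shrink to zero gives the now standard assumption (residential investment grows exactly as fast as total investment), setting it to one gives the old approach, and anything in between is the pseudo Bayesian compromise.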
In another example one can assume that inventory investment is zero (as in standard DSGE models) or estimate net inventory investment as a function of other variables. Adding half the fitted net inventory investment to the standard DSGE model might give better forecasts than either the now fashionable or the old fashioned model.
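Whether one half is a good weight is an empirical question. A rough sketch of how one might compare weights on held out data is below. It assumes the DSGE output forecasts and the fitted inventory investment are already available as equal length lists; the names and the candidate weights are hypothetical, and nothing here says which weight would win on real data.

```python
def blended_forecast(dsge_forecast, fitted_inventory_investment, weight=0.5):
    """DSGE forecast plus a fraction of fitted net inventory investment."""
    return [d + weight * f
            for d, f in zip(dsge_forecast, fitted_inventory_investment)]


def mean_squared_error(forecasts, outcomes):
    """Average squared forecast error over the evaluation period."""
    errors = [f - y for f, y in zip(forecasts, outcomes)]
    return sum(e * e for e in errors) / len(errors)


def best_weight(dsge_forecast, fitted_inventory_investment, outcomes,
                candidates=(0.0, 0.5, 1.0)):
    """Candidate weight with the lowest out of sample squared error."""
    return min(
        candidates,
        key=lambda w: mean_squared_error(
            blended_forecast(dsge_forecast, fitted_inventory_investment, w),
            outcomes,
        ),
    )
```

A weight of zero is the now fashionable model, a weight of one is the old fashioned one, and the comparison should use data which were not used to estimate the inventory equation.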
This is the standard approach used with high dimensional VARs. I see no reason why it couldn’t be applied to CC models.
I see Wren Lewis has a new post which I must read before typing more (I have read it and type the same old same old so you probably don’t want to click “read more”).