The Bounds on what we are Likely to Learn from Models with Boundedly Rational Learning

Mark Thoma and Simon Wren-Lewis both responded to a complaint about the rational expectations assumption by noting that they had colleagues who study boundedly rational learning. I see a third alternative: giving up. That is, we might conclude that we don’t understand expectations formation and that we won’t for a long while, so we should just decide that economic agents believe the darndest things. This implies giving up on calculating a conditional probability distribution of future outcomes. One way to implement the just-throw-in-the-towel strategy is to say that there is a 90% probability that a model including the rational expectations hypothesis will predict the future as well as it fit the past and a 10% chance that something weird and unexpected will happen, because economic agents get some strange notion in their heads.

But what is my problem with boundedly rational learning? Well, first of all, actual learning is clearly monstrously complicated (especially when the learners get together and decide to teach each other). Any model must be a gross simplification, which might be a useful approximation and might not. So we are back to maybe it will work OK and maybe not (and we don’t know the probabilities).

The worse problem is that the exact model of expectations formation typically matters a lot. Equally plausible models, even ones which fit the same data equally well, can have extremely different implications. I provide an example.

One model of boundedly rational learning is that agents choose a specification and estimate coefficients by OLS. So far we have a really dumb model (like adaptive expectations) such that policy makers can fool all the people all the time. But wait, the agents test their identifying restrictions — they consider alternative specifications one of which, with the right coefficients, happens to be the optimal forecasting rule. They switch specifications if the currently favored specification is rejected against an alternative specification at the alpha significance level.
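A minimal sketch of that switching rule, under assumptions the post does not make explicit: the incumbent specification is "mean-zero white noise", the only alternative is "white noise around a nonzero constant", and the test is a two-sided z-test on the sample mean (a normal approximation, chosen only to keep the example short). The function names and the inflation history are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def rejects_zero_mean(inflation, alpha):
    """True if a two-sided z-test rejects 'mean inflation is zero' at level alpha."""
    n = len(inflation)
    z_stat = mean(inflation) / (stdev(inflation) / sqrt(n))
    z_crit = -NormalDist().inv_cdf(alpha / 2)  # critical value for a two-sided test
    return abs(z_stat) > z_crit

def forecast(inflation, alpha):
    """Forecast 0 under the incumbent specification, the sample mean after a switch."""
    return mean(inflation) if rejects_zero_mean(inflation, alpha) else 0.0

# Made-up history in which inflation averages about 0.9
history = [1.2, 0.4, 1.9, -0.3, 1.1, 0.8, 1.5, 0.2, 0.9, 1.3]
print(forecast(history, 0.05))   # agents with alpha = 5% have switched
print(forecast(history, 1e-10))  # agents with a tiny alpha still forecast zero
```

On the same data, agents with alpha of 5% abandon the zero-inflation forecast while agents with a very small alpha keep it, so alpha is a free parameter that changes what the model implies.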

This model has the desirable feature that you can’t fool all of the people all of the time; that is, it shares the desirable trait of the rational expectations assumption.

OK, so an example of the example. The monetary authority can set the inflation rate up to an iid noise term. Agents set prices one period in advance based on their forecast of next period’s price level. The monetary authority has an ambitious output target (it wants to trick people into producing more than they would in the flexible price equilibrium). Inflation is slightly costly.

If the agents have rational expectations, the best policy is to set inflation to zero. I assume that the monetary authority can pre-commit, so it would do that.

Now assume agents pick a specification and keep it until it is rejected at level alpha against another specification (in a set including the specification which allows rational expectations). Let’s say they start with the guess that inflation is mean-zero white noise (so the best forecast is always 0). So long as they stick to that specification, the old expectations-unaugmented Phillips curve describes the economy.

The monetary authority can hide in the noise and set inflation above zero for a while. It will do this. So what?

Two things, one of which matters and the other of which is a mathematical joke.

Well first, I haven’t told you what alpha is, have I? I am sure that if such a model of boundedly rational learning is simulated, alpha is set to 5%, ’cause that’s just the way we hang. But what if it were ten to the minus tenth? In that case, the optimal policy might be to set expected output to the ambitious target for the next hundred years. The policy implications then differ totally from those of rational expectations until we are all dead. In particular, the implications for any data set we have are totally different.
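To get a rough feel for how much alpha matters, here is a back-of-the-envelope sketch. It assumes (my assumptions, not part of the model as stated) that agents run a two-sided z-test of zero mean inflation with a known noise standard deviation `sigma`, and that the authority holds mean inflation at a constant `mu`. The expected test statistic is then sqrt(T) * mu / sigma, which crosses the critical value z after about (z * sigma / mu)^2 periods. The values of `mu` and `sigma` are purely illustrative.

```python
from statistics import NormalDist

def periods_until_expected_rejection(mu, sigma, alpha):
    """Periods T at which the expected z-statistic sqrt(T)*mu/sigma hits the critical value."""
    z_crit = -NormalDist().inv_cdf(alpha / 2)  # two-sided critical value
    return (z_crit * sigma / mu) ** 2

# Illustrative numbers only: mean inflation is one tenth of the noise s.d.
for alpha in (0.05, 1e-10):
    horizon = periods_until_expected_rejection(mu=0.1, sigma=1.0, alpha=alpha)
    print(f"alpha={alpha:g}: about {horizon:.0f} periods")
```

Under these assumptions the hiding horizon depends on alpha only through the squared critical value, so moving from 5% to ten to the minus tenth lengthens it by roughly a factor of eleven for any `mu` and `sigma`. Equivalently, the authority could run higher inflation over the same horizon without being caught.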

The critique of models of expectations other than rational expectations is always based on arguments about what happens as t goes to infinity (when we will all be dead). They don’t rule out different assumptions which have implications for all time up to 2012 or 2112 which are as different as implications can be.

Second, the optimal monetary policy contains a stochastic term. The agents estimate the variance of the white noise disturbance to inflation. If the monetary authority randomly changes its inflation target, it imposes second order costs but has a first order effect on the estimated variance of the disturbance term. This allows it to hide longer under the noise. So the optimal amount of noise to add to inflation is positive.
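The mechanism can be illustrated with a small simulation. This is a sketch under assumed details: inflation equals a constant target `mu`, plus independent Gaussian target noise with standard deviation `target_sd`, plus the uncontrolled Gaussian disturbance with standard deviation `sigma`. Agents estimate the disturbance variance with the ordinary sample variance. All names and numbers are hypothetical.

```python
import random
from statistics import variance

def estimated_variance(mu, sigma, target_sd, n, seed=0):
    """Sample variance of inflation as seen by agents who cannot tell
    deliberate target noise from the uncontrolled disturbance."""
    rng = random.Random(seed)
    inflation = [mu + rng.gauss(0, target_sd) + rng.gauss(0, sigma)
                 for _ in range(n)]
    return variance(inflation)

# Illustrative values: more target noise raises the variance agents estimate
for target_sd in (0.0, 0.5, 1.0):
    v = estimated_variance(mu=0.1, sigma=1.0, target_sd=target_sd, n=20000)
    print(f"target_sd={target_sd}: estimated variance about {v:.2f}")
```

The estimated variance comes out near `sigma**2 + target_sd**2`. If agents' test compares the sample mean with its estimated standard error, a larger estimated variance means a nonzero mean takes proportionally longer to detect. Whether that gain outweighs the cost of the extra noise depends on the authority's loss function, which is not specified here.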

Brad DeLong made similar points in unpublished manuscripts.