The top marginal income tax rate should be about 65%…

by Mike Kimel

Cross posted at the Presimetrics blog.

To maximize real economic growth in the United States, the top marginal income tax rate should be about 65%, give or take about ten percent. Preposterous, right? Well, it turns out that’s what the data tells us, or would, if we had the ears to listen.

This post will be a bit more complicated than my usual “let’s graph some data” approach, but not by much, and I think the added complexity will be worth it. So here’s what I’m going to do – I’m going to use a statistical tool called “regression analysis” to find the relationship between the growth in real GDP and the top marginal tax rate. If you’re familiar with regressions you can skip ahead a few paragraphs.

Regression analysis (or “running regressions”) is a fairly straightforward and simple technique that is used on a daily basis by economists who work with data, not to mention people in many other professions from financiers to biologists. Because it is so simple and straightforward, a popular form of regression analysis, “ordinary least squares” (or “OLS”) regression, is even built into popular spreadsheets like Excel.

I think the easiest way to explain OLS is with an example. Say that I have yearly data going back to 1952 for a very small town in Nebraska. That data includes number of votes received by each candidate in elections for the city council, number of people with jobs, and number of city employees convicted of graft. If I believed that the votes incumbents received rose with the number of people with jobs and fell with political scandals, I could have OLS return an equation that looks like:

Number of incumbent votes = B0 + B1*employed people + B2*employees convicted of graft

B0, B1, and B2 are numbers, and OLS selects them in such a way as to minimize the sum of squared errors you get when you plug the data you have into the equation. Think of it this way – say the equation returned was this:

Number of incumbent votes in any given year

= 28 + 0.7*employed people – 20*employees convicted of graft

That equation tells us that incumbents would receive 28 votes if nobody were employed and nobody were convicted of graft. (Bear in mind – that first term, the constant term as it is called, sometimes gives nonsensical results by itself and really is best thought of as “making the equation add up.”) The second term (0.7*employed people) tells us that every additional employed person generally adds 0.7 votes. The more people with jobs, the happier voters are, and thus the more likely to vote for the incumbent. Of course, not everyone with a job will be pleased enough to vote for the incumbent. Finally, the last term (- 20*employees convicted of graft) indicates that every time someone in the city government is convicted of graft, incumbents lose 20 votes in upcoming elections due to an increased perception that the city government is lawless.

Now, these numbers: 28, 0.7, and -20 are made up in this example, but they wouldn’t have been arrived at randomly. Instead, remember that together they form an equation. The equation has a very special characteristic, but before I describe that characteristic, remember – this is statistics, and statistics is an attempt to find relationships based on data available. The data available for number of people employed and number convicted of graft – say for the year 1974 – can be plugged into the equation to produce an estimate of the number of votes. That estimate can then be compared to the actual number of votes, and the difference between the two is the model’s error. In fact, there’s an error associated with every single observation (in our example, there’s one observation per year) used to estimate the model. Errors can be positive or negative (the estimate can be higher than the actual or lower), or even zero in some cases.

OLS regression picks values (the 28, 0.7, and -20 in our example) that minimize the sum of all the squared errors. That is, take the error produced each year, square it, and add it to the squared errors for all the other years. The errors are squared so that positive errors and negative errors don’t simply cancel each other out. (Remember, the LS in OLS are for “least squares” – the least squared errors.) You can think of OLS as adjusting each value up or down until it spots the combination that produces the lowest total sum of squared errors. That adjustment up and down is not what is happening, but it is a convenient intuition to have unless and until you are someone who works with statistical tools on a daily basis.
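If you’d rather see that in code than in words, here is a minimal sketch in Python with NumPy. The town data is made up (I built it from the example equation plus a small error each year), and the variable names are mine, so treat it as an illustration of the idea rather than anything definitive.

```python
import numpy as np

# Made-up data for illustration only: votes follow
# 28 + 0.7*employed - 20*convicted, plus a small error each year.
employed = np.array([100.0, 120.0, 90.0, 150.0, 130.0, 110.0])
convicted = np.array([0.0, 1.0, 2.0, 0.0, 1.0, 3.0])
errors = np.array([1.0, -2.0, 0.5, 1.5, -1.0, 0.0])
votes = 28 + 0.7 * employed - 20 * convicted + errors

# Design matrix: a column of ones for the constant term B0,
# then one column per explanatory variable.
X = np.column_stack([np.ones_like(employed), employed, convicted])

# Least squares picks B0, B1, B2 to minimize the sum of squared errors.
coefs, *_ = np.linalg.lstsq(X, votes, rcond=None)


def sum_squared_errors(b):
    """Sum of squared differences between actual and estimated votes."""
    residuals = votes - X @ b
    return float(np.sum(residuals ** 2))


best_sse = sum_squared_errors(coefs)
# Nudging any coefficient away from the OLS value makes the total worse.
nudged_sse = sum_squared_errors(coefs + np.array([0.0, 0.05, 0.0]))

print("B0, B1, B2:", coefs)
print("sum of squared errors at the OLS values:", best_sse)
print("sum of squared errors after a nudge:", nudged_sse)
```

The nudge at the end is the “adjust a value up or down” intuition from above: any move away from the values OLS returns increases the total.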

Note that there are forms of regression that are different from OLS, but for the most part, they tend to produce very similar results. Additionally, there are all sorts of other statistical tools, and for the most part, for the sort of problem I described above, they also tend to produce similar outcomes.

I gotta say, after I wrote the paragraphs above, I went looking for a nice, easy representation of the above. The best one I found is this download of a PowerPoint presentation from a textbook by Studenmund. It’s a bit technical for someone whose only exposure to regressions is this post, but slides eight and thirteen might help clarify some of what I wrote above if it isn’t clear. (And having taught statistics for a few years, I can safely say if you’ve never seen this before, it isn’t clear.)

OK. That was a lot of introduction, and I hope some of you are still with me, because now it is going to get really, really cool, plus it is guaranteed to piss off a lot of people. I’m going to use a regression to explain the growth in real GDP from one year to the next using the top marginal tax rate and the top marginal tax rate squared. (In other words, explaining the growth in real GDP from 1994 to 1995 using the top marginal rate in 1994 and the top marginal rate in 1994 squared, explaining the growth in real GDP from 1995 to 1996 using the top marginal rate in 1995 and the top marginal rate in 1995 squared, etc.) If you aren’t all that familiar with regressions, you might be asking yourself: what’s with the “top marginal rate squared” term? The squared term allows us to capture acceleration or deceleration in the effect that marginal rates have on growth as marginal rates change. Without it, we are implicitly forcing an assumption that the effect of marginal rates on growth is constant, whether marginal rates are five percent or ninety-five percent, and nobody believes that.

Using notation that is just a wee bit different than economists generally use but which guarantees no ambiguity and is easy to put up on a blog, we can write that as:

% change in real GDP, t to t+1 = B0 + B1*tax rate, t + B2*tax rate squared, t
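Here is one way the data could be lined up for that equation, sketched in Python with NumPy. The numbers are made-up placeholders (not the IRS or BEA figures) and the variable names are my own; the only point is to show how growth from t to t+1 gets paired with the tax rate at t.

```python
import numpy as np

# Hypothetical placeholder values, NOT the actual IRS or BEA series.
years = np.array([1, 2, 3, 4, 5])                          # stand-in year labels
top_rate = np.array([0.50, 0.50, 0.45, 0.45, 0.40])        # top rate in year t
real_gdp = np.array([100.0, 101.0, 103.0, 106.0, 110.0])   # real GDP level in year t

# Growth from t to t+1 as a fraction; there is one fewer growth
# observation than there are years.
growth = real_gdp[1:] / real_gdp[:-1] - 1.0

# Pair each growth figure with the tax rate at the START of the period,
# which means dropping the final year's rate.
rate_t = top_rate[:-1]

# Columns: constant term (B0), tax rate (B1), tax rate squared (B2).
X = np.column_stack([np.ones_like(rate_t), rate_t, rate_t ** 2])

for year, row, g in zip(years[:-1], X, growth):
    print(f"t={year}: rate={row[1]:.2f}, rate^2={row[2]:.4f}, growth t to t+1={g:.4f}")
```

With the real series in place of the placeholders, X and growth are what you would hand to the regression.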

Top marginal tax rates come from the IRS’ Statistics of Income Historical Table 23, and are available going back to 1913. Real GDP can be obtained from the BEA’s National Income and Product Accounts Table 1.1.6, and dates back to 1929. Thus, we have enough data to start our analysis in 1929.

Plugging that into Excel and running a regression gives us the following output:

Figure 1

For the purposes of this post, I’m going to focus only on those pieces of output which I’ve color coded. The blue cells tell us that the equation returned by OLS is this:

% Change in Real GDP, t to t+1 = -0.15 + 0.63*tax rate, t – 0.48* tax rate squared, t

From an intuition point of view, the model tells us that at low tax rates, economic growth increases as tax rates increase. Presumably, this is in part because taxes allow the government to pay for services that enhance economic growth, and in part because raising tax rates, at least at some levels, actually generates more effort from the private sector. However, the benefits of increasing tax rates slow as tax rates rise, and eventually peak and decrease; tax rates that are too high might be accompanied by government waste and decreased private sector incentives.

The green highlights tell us that each of the pieces of the equation is significant. That is to say, if any of these variables truly had no effect on the growth in real GDP, the probability of seeing an estimated effect this large purely by chance would be very (very, very) close to zero.

And to the inevitable comment that marginal tax rates aren’t the only thing affecting growth: that is correct. The adjusted R Square, highlighted in orange, provides us with an estimate of the amount of variation in the dependent variable (i.e., the growth rate in Real GDP) that can be explained by the model, here 17.6%. That is – the tax rate and tax rate squared, together (and leaving out everything else) explain about 17.6% of the variation in growth. Additional variables can explain a lot more, but we’ll discuss that later.
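For the curious, the “adjusted” part of adjusted R Square is the standard correction for the number of explanatory variables in the model. A minimal sketch in Python follows; the inputs are hypothetical numbers I picked for illustration, not values read off the output above.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Standard adjustment: penalize R-squared for each explanatory variable.

    n_predictors counts the explanatory variables only, not the constant term.
    """
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical inputs: an R-squared of 0.20 from 80 yearly observations
# and two explanatory variables (tax rate and tax rate squared).
example = adjusted_r_squared(0.20, 80, 2)
print(round(example, 4))
```

The adjusted figure is never higher than the plain R Square, and the gap grows as variables are added without adding observations.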

Meanwhile, if we graph the relationship OLS gives us, it looks like this:

Figure 2

So… what this, er, (if I may be so immodest) “Kimel curve” shows is a peak – an optimal tax rate at which economic growth is maximized. And that optimal tax rate is about 67%.
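The peak of a curve like this can be found directly: for B0 + B1*rate + B2*rate squared with B2 negative, the top sits at rate = -B1 / (2*B2). Here is a minimal sketch in Python using the rounded coefficients from the equation above. Because those coefficients are rounded to two decimals, the result lands near, not exactly on, the 67% quoted here; I am assuming the gap is just rounding.

```python
# Rounded coefficients from the equation in the post; the spreadsheet's
# unrounded values would give slightly different answers.
b0, b1, b2 = -0.15, 0.63, -0.48

# Vertex of the parabola b0 + b1*r + b2*r**2 (a maximum because b2 < 0).
peak_rate = -b1 / (2 * b2)
peak_growth = b0 + b1 * peak_rate + b2 * peak_rate ** 2

print(f"growth-maximizing top rate: {peak_rate:.1%}")
print(f"predicted growth at that rate: {peak_growth:.2%}")
```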

Does it pass the smell test? Well, clearly not if you watch Fox News, read the National Review, or otherwise stick to a story line come what may. But say you pay attention to data?

Well, let’s start with the peak of the Kimel curve, which (in this version of the model) occurs at a tax rate of 67% and a growth rate of 5.85%. Is that reasonable? After all, a 5.85% increase in real GDP is fast. The last time economic growth was at least 5.85% was in the eighties (it happened twice, when the top rate was at 50%). Before that, you have to go back to the late ‘60s, when tax rates were at 70%. It isn’t unreasonable, then, to suggest that growth rates can be substantially faster than they are now at tax rates somewhere between 50% and 70%. (That isn’t to say there weren’t periods – the mid-to-late 70s, for instance – when tax rates were about 70% and growth was mediocre. But statistics is the art of extracting information from many data points, not one-offs.)

What about low tax rates? The graph actually shows growth as being negative. Well… the lowest tax rates observed since growth data has been available have been 24% and 25% from 1929 to 1932… when growth rates were negative.

What about the here and now? The top marginal tax rate now, and for the foreseeable future, will be 35%; the model indicates that on average, at a 35% marginal tax rate, real GDP growth will be a mediocre 1.1% a year. Is that at all reasonable? Well, it turns out so far that we’ve observed a top marginal rate of 35% in the real world six times, and the average growth rate of real GDP during those years was about 1.4%. Better than the 1.1% the model would have anticipated, but pretty crummy nonetheless.
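If you want to run these spot checks yourself, here is a minimal sketch in Python that plugs a few tax rates into the fitted equation. It uses the coefficients as rounded in the post, so the results come out close to, but not exactly matching, the figures from the unrounded spreadsheet model (about 1.2% rather than 1.1% at a 35% rate, for instance). The function name is my own.

```python
def predicted_growth(rate, b0=-0.15, b1=0.63, b2=-0.48):
    """Predicted real GDP growth (as a fraction) for a top marginal rate.

    Defaults are the rounded coefficients quoted in the post.
    """
    return b0 + b1 * rate + b2 * rate ** 2


# Rates discussed in the post: the 1929-1932 lows and the current rate.
for rate in (0.24, 0.25, 0.35):
    print(f"top rate {rate:.0%}: predicted growth {predicted_growth(rate):.2%}")
```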

So, the model tends to do OK on a ballpark basis, but it’s far from perfect – as noted earlier, it only explains about 17.6% of the change in the growth rate. But what if we improve the model to account for some factors other than tax rates? Does that change the results? Does it, dare I say it, Fox Newsify them? This post is starting to get very long, so I’m going to stick to improvements that lie easily at hand. Here’s a model that fits the data a bit better:

Figure 3

From this output, we can see that this version of the Kimel curve (I do like the sound of that!!) explains 36% of the variation in growth rate we observe, making it twice as explanatory as the previous one. The optimal top marginal tax rate, according to this version, is about 64%.

As to other features of the model – it indicates that the economy will generally grow faster following increases in government spending, and will grow more slowly in the year following a tax increase. Note what this last bit implies – optimal tax rates are probably somewhat north of 60%, but in any given year you can boost growth in the short term with a tax cut. However, keep the tax rates at the new “lower, tax cut level” and if that level is too far from the optimum it will really cost the economy a lot. Consider an analogy – steroids apparently help a lot of athletes perform better in the short run, but the cost in terms of the athlete’s health is tremendous. Finally, this particular version of the model indicates that on average, growth rates have been faster under Democratic administrations than under Republican administrations. (To pre-empt the usual complaint that comes up every time I point that out, insisting that Nixon was just like Clinton in your mind is not the point here. The point is that in every presidential election at least since 1920, the candidate most in favor of lower taxes, less regulation and generally more pro-business and less pro-social policy has been the Republican candidate.)

Anyway, this post is starting to get way too lengthy, so I’ll write more on this topic in the next few posts. For instance, I’d like to focus on the post-WW2 period, and I’m going to see if I can search out some international data as well. But to recap – based on the simple models provided above, it seems that the optimal top marginal tax rate is somewhere around 30 percentage points greater than the current top marginal rate. The recent agreement to keep the top marginal rate where it is will cost us all through slower economic growth.

As always, if you want my spreadsheet, drop me a line. I’m at my name, with a period between the mike and my last name, all at gmail.com.

It occurs to me that I should probably explain why I used taxes at time t to explain growth from t to t+1, rather than using taxes at time t+1. (E.g., taxes in 1974 are used to explain growth from 1974 to 1975, and not to explain growth from 1973 to 1974.) Some might argue, after all, that taxes affect growth that year, and not in the following year. There are several reasons I made the choice I did:

1. When changes to the tax code affecting a given year are made, they are typically made well after the start of the year they affect.

2. Most people don’t settle up on taxes owed in one year until the next year. (Taxes are due in April.)

3. Causation – I wanted to make sure I did not set up a model explaining tax rates using growth rather than the other way around.

4. It works better. For giggles, before I wrote this line, I checked. The fit is actually better, and the significance of the explanatory variables is a bit higher the way I did it.