Relevant and even prescient commentary on news, politics and the economy.

Public and Private Health Care Spending and Infant Mortality in 71 countries

by Tilman Tacke and Robert Waldmann

We don’t know if someone else has noticed this amazing fact: in a cross-country regression, the ratio of public health care spending to GDP is negatively correlated with the infant mortality rate, as one would expect, but the share of private health care spending in GDP is positively correlated. In a simple regression including only log per capita GDP (corrected for PPP) as an additional explanatory variable, both coefficients have large t-statistics.

The positive coefficient on private health care spending becomes insignificant when other variables are included, but it does not become negative. The result is not due to the USA, which is an extreme outlier in private health care spending over GDP.

The result is not simply due to a correlation between high public spending and low income inequality, as it holds when log per capita GDP is replaced by the log per capita income of each of the lower 4 quintiles (this is our original regression, hence the low number of countries in the sample).

Update: I hope this version of the table is legible


Update: an illegible version of the table above was deleted.

Update II: poorly labeled graphs added.

OK, two plots.

[Figure: infant mortality and public health care spending]


The one above plots the residuals of ln infant mortality (regressed on ln per capita GDP, the year, and private health care spending as a percent of GDP) against the residuals of public health care spending as a percent of GDP regressed on those same variables.

The other (below) plots the residuals of ln infant mortality (regressed on ln per capita GDP, the year, and public health care spending as a percent of GDP) against the residuals of private health care spending as a percent of GDP regressed on those same variables.

[Figure: infant mortality and private health care spending (lnim on private health care spending AV plot)]
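The plots described above are added-variable (AV, or partial regression) plots. Here is a minimal sketch of how one is built, using numpy and entirely invented data (none of the numbers below come from our data set); the final comparison is the Frisch-Waugh-Lovell property that the slope through the residual cloud equals the coefficient from the full multiple regression:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 71  # same sample size as the post, but the data here are invented

ln_gdp = rng.normal(9.0, 1.0, n)            # ln per capita GDP (hypothetical)
year = rng.integers(1990, 2001, n).astype(float)
hexpublic = rng.normal(4.0, 1.5, n)         # public health spending, % of GDP
hexprivate = rng.normal(3.0, 1.5, n)        # private health spending, % of GDP
ln_im = (8.0 - 0.5 * ln_gdp - 0.10 * hexpublic
         + 0.13 * hexprivate + rng.normal(0.0, 0.4, n))

def resid(y, X):
    """Residuals of y after regressing it on X plus a constant."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return y - X1 @ beta

# AV plot for hexpublic: residualize both ln_im and hexpublic
# on all the *other* regressors, then plot one against the other.
others = np.column_stack([ln_gdp, year, hexprivate])
e_y = resid(ln_im, others)       # vertical axis of the AV plot
e_x = resid(hexpublic, others)   # horizontal axis of the AV plot

# The slope through the residual cloud equals the coefficient on
# hexpublic in the full multiple regression (Frisch-Waugh-Lovell).
slope = (e_x @ e_y) / (e_x @ e_x)
full = np.column_stack([np.ones(n), ln_gdp, year, hexprivate, hexpublic])
full_beta, *_ = np.linalg.lstsq(full, ln_im, rcond=None)
print(slope, full_beta[-1])  # the two numbers agree
```

This is also why "high leverage" in an AV plot is informative: a point far out on the horizontal axis is one whose private spending is far from what the other regressors predict.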

Note two countries with high leverage due to very high private health care spending compared to what one would expect given the other variables. The most extreme is the USA which also has higher infant mortality than one would guess given per capita GDP, the year and public health care spending. The second most extreme is Uruguay which has the infant mortality one would expect given the other variables.

If I drop them, the coefficient on private health care spending as a fraction of GDP goes up (you can see in the scatter that it would) from 0.134 to 0.150. The t-stat *increases* from 3.12 to 3.16, which is not what I expected.

Dropping just the USA (which I have done ten times by now) gives a coefficient of 0.117 and a t-stat of 2.85.

The country with the highest infant mortality residual is South Africa. Adding HIV prevalence to the regression is one of the things that reduced the size of the strange coefficient on private health care spending. Dropping the USA and South Africa gives a coefficient of 0.0916 with a t-stat of 2.16; that is, the USA and South Africa together provide a substantial part of the apparent evidence that private health care spending is bad for health.

After the jump I will report regressions done partly in response to comments. Right now I will not even try to make them easily readable (sorry) but will just cut and paste STATA output with minimal explanation of variable names. Sorry. Click at your own risk.

Partly in response to comments.

First comment: “why only 71 countries?” The answer is that we were looking at income distribution and infant mortality, so we only used countries where we had estimates of quintile shares of income (q1 is the share of the first quintile, q2 the second, etc.). Without consulting Tilman, I decided to replace the log per capita real income (corrected for PPP) of the different quintiles with log per capita real GDP (corrected for PPP) to make a simpler table and to focus on health care expenditures. This causes the strange positive t-stat on private spending to become alarmingly large; that is, part of the positive coefficient is due to the correlation of private health care spending and inequality. In my original post, I wrote that the coefficient was very non-robust. This is partly because I only really think about regressions which control for inequality. Ignoring inequality (which is crazy), it is robust to many other variables, with the exception of continent dummies and a combination of infant-mortality-related variables (each one alone doesn’t do it; see the comment thread for details).

Our original regression

reg lnim lnq1 lnq2 lnq3 lnq5 hexprivate hexpublic year

Source | SS df MS Number of obs = 71
————-+—————————— F( 7, 63) = 53.05
Model | 55.8407871 7 7.9772553 Prob > F = 0.0000
Residual | 9.47313907 63 .150367287 R-squared = 0.8550
————-+—————————— Adj R-squared = 0.8388
Total | 65.3139262 70 .933056088 Root MSE = .38777

lnim | Coef. Std. Err. t P>|t| [95% Conf. Interval]
lnq1 | .1210792 .4716871 0.26 0.798 -.8215122 1.063671
lnq2 | -.0061793 .8658731 -0.01 0.994 -1.736488 1.72413
lnq3 | -.9650523 .5634675 -1.71 0.092 -2.091052 .1609477
lnq5 | .1299795 .2405159 0.54 0.591 -.3506532 .6106123
hexprivate | .0776184 .0389142 1.99 0.050 -.0001453 .1553822
hexpublic | -.1030077 .0416674 -2.47 0.016 -.1862733 -.019742
year | -.01175 .0529402 -0.22 0.825 -.1175424 .0940425
_cons | 31.50754 105.8999 0.30 0.767 -180.1165 243.1316

lnq1 is the log of the real pc income of households in the lowest quintile. Hex stands for health care expenditures as a percent of GDP.

Now to me, the interesting question is “Is private health care spending less effective than public health care spending?” and certainly not “is it actually harmful, as suggested by this silly regression?”

To answer the interesting question we (OK, Tilman) ran regressions including total health care spending and just private health care spending. The coefficient on private spending should then measure the difference in effectiveness, private minus public.

reg lnim lnq1 lnq2 lnq3 lnq5 hextot hexprivate year

Source | SS df MS Number of obs = 71
————-+—————————— F( 7, 63) = 53.05
Model | 55.8407871 7 7.9772553 Prob > F = 0.0000
Residual | 9.47313909 63 .150367287 R-squared = 0.8550
————-+—————————— Adj R-squared = 0.8388
Total | 65.3139262 70 .933056088 Root MSE = .38777

lnim | Coef. Std. Err. t P>|t| [95% Conf. Interval]
lnq1 | .1210792 .4716871 0.26 0.798 -.8215122 1.063671
lnq2 | -.0061793 .8658731 -0.01 0.994 -1.736489 1.72413
lnq3 | -.9650522 .5634675 -1.71 0.092 -2.091052 .1609477
lnq5 | .1299795 .2405159 0.54 0.591 -.3506532 .6106122
hextot | -.1030077 .0416674 -2.47 0.016 -.1862733 -.019742
hexprivate | .1806261 .0545836 3.31 0.002 .0715495 .2897027
year | -.01175 .0529402 -0.22 0.825 -.1175424 .0940425
_cons | 31.50754 105.8999 0.30 0.767 -180.1165 243.1316


The t-stat on private health care expenditures is huge: 3.31 (I know it is hard to see in the table).
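Since hextot = hexpublic + hexprivate, the second regression is just a reparameterization of the first, so the coefficient on hexprivate must equal the difference of the two earlier coefficients. A quick check (the two named coefficients come from the tables above; the random data are purely illustrative):

```python
import numpy as np

# Coefficients from the first Stata table above
b_private = 0.0776184    # on hexprivate
b_public = -0.1030077    # on hexpublic

# Rewrite b_pub*pub + b_priv*priv as b_pub*(pub+priv) + (b_priv-b_pub)*priv:
# the coefficient on hexprivate in the reparameterized regression is the
# difference in effectiveness, private minus public.
assert abs((b_private - b_public) - 0.1806261) < 1e-7  # matches second table

# The identity holds exactly for OLS on any data; a random example:
rng = np.random.default_rng(1)
pub, priv = rng.normal(size=(2, 50))
y = 1.0 - 0.10 * pub + 0.08 * priv + rng.normal(scale=0.1, size=50)
X1 = np.column_stack([np.ones(50), pub, priv])         # (const, pub, priv)
X2 = np.column_stack([np.ones(50), pub + priv, priv])  # (const, tot, priv)
beta1, *_ = np.linalg.lstsq(X1, y, rcond=None)
beta2, *_ = np.linalg.lstsq(X2, y, rcond=None)
print(beta2[2], beta1[2] - beta1[1])  # equal up to rounding
```

The fitted values, residuals, R-squared, etc. are identical across the two regressions, which is why every other number in the two tables matches.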

This stands up to the inclusion of a huge number of variables, including dummies for regions.

tab region2

region2 | Freq. Percent Cum.
East Asia & Pacific | 5 6.85 6.85
Europe & Central Asia | 31 42.47 49.32
Latin America & Caribbean | 19 26.03 75.34
Middle East & North Africa | 3 4.11 79.45
North America | 1 1.37 80.82
South Asia | 4 5.48 86.30
Sub-Saharan Africa | 10 13.70 100.00
Total | 73 100.00

reg lnim lnq1 lnq2 lnq3 lnq4 hextot hexprivate year femlit reg2* hiv doctors saniurban
> sanirural waterurban waterrural fobesity femalesmoking

Source | SS df MS Number of obs = 65
————-+—————————— F( 22, 42) = 17.65
Model | 50.353018 22 2.28877354 Prob > F = 0.0000
Residual | 5.44750771 42 .129702565 R-squared = 0.9024
————-+—————————— Adj R-squared = 0.8512
Total | 55.8005257 64 .871883214 Root MSE = .36014

lnim | Coef. Std. Err. t P>|t| [95% Conf. Interval]
lnq1 | .0883713 .6329048 0.14 0.890 -1.188882 1.365625
lnq2 | -.0743277 1.134967 -0.07 0.948 -2.364785 2.216129
lnq3 | -.8936729 2.256105 -0.40 0.694 -5.446678 3.659332
lnq4 | .413619 1.592167 0.26 0.796 -2.799505 3.626743
hextot | -.1096067 .0445435 -2.46 0.018 -.1994991 -.0197143
hexprivate | .1511602 .0677885 2.23 0.031 .0143575 .2879629
year | -.0110253 .0642753 -0.17 0.865 -.1407381 .1186874
femlit | -.0010556 .0059176 -0.18 0.859 -.0129977 .0108866
reg21 | -.1253023 .4024909 -0.31 0.757 -.9375617 .6869572
reg22 | .289227 .4351484 0.66 0.510 -.5889381 1.167392
reg23 | .1065467 .3954925 0.27 0.789 -.6915895 .9046829
reg24 | .1840681 .4318435 0.43 0.672 -.6874274 1.055564
reg25 | .4121691 .6886778 0.60 0.553 -.977639 1.801977
reg26 | .0590227 .3634298 0.16 0.872 -.6744084 .7924538
reg27 | (dropped)
hiv | .0433275 .0307031 1.41 0.166 -.0186339 .1052888
doctors | -.0190481 .0762125 -0.25 0.804 -.1728513 .134755
saniurban | -.0039519 .0064778 -0.61 0.545 -.0170245 .0091208
sanirural | -.0092739 .0041118 -2.26 0.029 -.0175719 -.0009759
waterurban | -.0176387 .0130958 -1.35 0.185 -.0440672 .0087897
waterrural | .0052071 .0046397 1.12 0.268 -.0041561 .0145703
fobesity | .0013884 .0072406 0.19 0.849 -.0132237 .0160005
femalesmok~g | -.0019532 .0079665 -0.25 0.808 -.0180303 .0141239
_cons | 30.54749 128.6271 0.24 0.813 -229.0326 290.1275

The result is not not not due to the USA: the variable reg25 is a dummy variable for the USA, since the USA is the only North American country in the sample. Dropping the USA from the regression has no effect on the results, except that no coefficient on reg25 can be estimated (really zero effect; I checked because I wasn’t thinking).

I would say that the evidence that private health care spending has a lesser effect on infant mortality than public health care spending is quite robust. There is no way to reliably determine the direction of causation or rule out omitted variables bias, but the robustness to inclusion of other variables convinces me that neither is the full explanation. This is relevant to reverse causation too if the newly included variable is the original cause of bad health which then causes high private health care spending. There is evidence in the data set that high HIV incidence causes high private health care spending. Including HIV incidence (the variable called hiv) does not zap the t-stat, so that’s not the whole story.

Comments (0) | |

Foreign films, Western Cultural Influence, and Divorce in Japan.

by Tilman Tacke and Robert Waldmann

Globalization has caused a decrease in cultural distinctiveness. We find indications of a link between the divorce rate in Japan after 1955 and the market share of foreign films. Foreign films in Japanese cinema, including Hollywood productions, may act as importers of Western values. The market share of foreign films has predictive power for three periods of decreasing divorce rates as well as for the general convergence of the low Japanese divorce rate to higher Western levels. Using Japanese box office data since 1955, we show that a higher market share of international films is not only associated with an increase in the number of divorces but also with a decrease in the number of marriages. Both effects are especially strong before the 1980s. We explain periods of increasing and decreasing market share of foreign films and divorce rate with historical changes in cultural relations between Japan and the Western world.

pdf here



The Silver Standard

Poblano unido jamás será vencido

Extremely regular readers of AngryBear will have noticed that I became a guest contributor a while ago. Then I vanished. The reason is that I have become obsessed with presidential polling and don’t have much original to say about it. A dialog from last night between me and my 11-year-old daughter:

11) dad what are you thinking about
rjw) mmm
11) Obama right ?
rjw) yeah
11) why don’t you ever think about anything else
rjw) mmmm.

Now about polling. My favorite site (by far) is fivethirtyeight.com. Very, very impressive. I call it the silver standard because Nate Silver (aka Poblano), being a Democrat, would never support crucifying mankind on a cross of gold.

Fivethirtyeight simulates elections and calculates a probability distribution for electoral votes won. He (they?) consider(s) both state-level and nationwide disturbances. They note that polls tend to narrow over time. They estimate pollster reliability with data from actual voting (mostly primaries, I think).

As of recently he also estimates pollster fixed effects, or house effects: “the tendency of certain polling firms’ numbers to tend to lean in the direction of one or another candidate”. This happens to be very important, largely because the most prolific pollster, Rasmussen, has, relative to other pollsters, a tendency to lean pro-McCain. They also had excellent performance in the primaries and so count extra.

My one concern about fivethirtyeight is that the calculations are very complicated and not at all transparent. I would like to see some reporting on a larger set of simulations done with different assumptions. For example, the house effects correction seems to me to be conservative, and I would like to see simulations with a more aggressive correction.

3) The house effect adjustment is enacted only in cases where we are at least 90% certain that there is a house effect. Even in these cases, we hedge our bets a little bit, by subtracting 166% of the standard error from the house effect coefficient.

I would rather see estimates with and without house effects, and the with-house-effects estimates based on just the point estimates: no setting to zero if not significant, and no subtracting 166% of a standard error.

Silver links to and praises an article on house effects in national polls written by Charles Franklin at Pollster.com (my second favorite site). This is completely separate evidence of house effects, since fivethirtyeight’s raw data are state-level polls. Rasmussen polls are significantly better for McCain than average polls.

much more after the jump.

OK here I just let go.

1. My problem with Pollster.com is that their trend calculations are waaaay too complicated. They use a loess trend estimate (the trend value at t is estimated with weights depending on how long before or after t each poll was taken, then the fitted value for time t is reported). This means that new data shift past values of the trend line, which freaks me out. It also means that they say Obama is ahead in Ohio because it is about tied now and he used to be behind, so they are convinced there is a significant trend. Also, the initial estimates downweight outliers (it is not explained exactly how). I do not agree with doing this (see below). The calculations might be optimal, but they are much too complicated to understand. I’d rather have a point estimate based on averaging (a weighted regression on a constant, with no trend) and an estimate of the recent slope with a standard error reported as a number. Still, since I can get the recent simple average elsewhere, I have no serious complaint (just one more URL to type; I do *not* hotlist pollsters).

2. What happened to the Gallup anomaly? For years I have been reporting on the Gallup likely voter anomaly: Gallup polls are better for Republicans than other polls. In the Pollster analysis, Gallup is much less pro-McCain than Rasmussen and is only 4th best for McCain. The reason is simple. The Gallup anomaly is an aspect of the Gallup likely voter filter, and the vast majority of Gallup polls reported so far are from the Gallup tracking poll of registered voters.

As Gallup explains every 4 years, their likely voter filter is not reliable long before the election. Gallup has been forced by the competition to report likely voter results earlier than they used to (I remember way back when). There was a very large Gallup likely voter anomaly in the poll conducted July 25-27 (click and search for USA Today), in which Obama led among registered voters and McCain led 49 to 45 among likely voters. The likely voter pool was strongly biased against the young compared to actual votes in past presidential elections.

Gallup has an excellent record predicting elections. What this means is that the last Gallup polls before the vote are very accurate. That is, the likely voter filter works in late October. This does not mean that it works in August.

What is going on? It is simple. Admirably, Gallup has stuck with the same method they used long, long ago. This is transparent and honest (they aren’t using their success in the past with one method to justify claims based on a new method). It is almost comprehensible how they decide who is a likely voter. The filter is based on answers to simple questions. From the Gallup FAQ:

“These questions include asking whether or not the individual knows the location of his or her voting place, whether or not the individual voted in the past election, how closely the person is following the election, and so forth.”

Now obviously, knowledge of the location of the voting place in August and in October will differ: some people learn where it is between August and October. It is not surprising that someone who claims he or she will certainly vote but doesn’t know where to go to vote in late October is not likely to vote. In August, the answer has, I would suspect, much more to do with how long the voter has been registered to vote at his or her current address than with the probability that he or she will vote. The use of that question creates a selection against younger voters (and people who moved recently) stronger than the true correlation of age, not moving, and voting warrants.

Even the assumption that eligible adults who are not currently registered will definitely not vote is dubious in August. There is still plenty of time to register.

In any case the evidence that the Gallup filter works in October tells us little about whether it works in August (as Gallup insists whenever asked and often when not asked).

There will be Gallup likely voter polls. There will be complaints about the Gallup anomaly. Democrats will be alarmed at Gallup’s excellent record (based on late October polls). It is all very simple and right there in the FAQ.

So what’s with Rasmussen? Here a key feature is that they assume that partisan inclination (Dem, Rep, Independent) changes slowly, so that differences from poll to poll in partisan inclination are mostly noise. They reweight so that the partisan inclination matches the average over the past 3 months.

Like all polling firms, Rasmussen Reports weights its data to reflect the population at large. Among other targets, Rasmussen Reports weights data by political party affiliation using a dynamic weighting process. Our baseline targets are established based upon survey interviews with a sample of adults nationwide completed during the preceding three months (a total of 45,000 interviews). For the month of August, the targets are 40.6% Democrat, 31.6% Republican, and 27.8% unaffiliated. For July, the targets were 41.4% Democrat, 31.5% Republican, and 27.1% unaffiliated (see party trends and analysis).

Now there is no need to smooth that much given the sample size (especially because Rasmussen could use data from other pollsters on party affiliation). This also shows the difference between a report which is optimal for the pollster and one which is optimal as raw material for meta-analysis (here just fancy talk for averaging across pollsters). Weighting to make party affiliation fit a target reduces noise (increases precision) at the cost of introducing possible bias (if true support for the parties has shifted over time). For one poll the optimal balance may be to weight. However, if one averages many polls (many Rasmussen polls, or many polls across pollsters who do the same thing), the noise averages out and the bias doesn’t.

Now Rasmussen polls could be corrected by taking a more recent average (also across pollsters) of party affiliation and then using the internals (really simple, like 90% of Dems for Obama and 90% of Republicans for McCain, and easily available) to calculate a desmoothed Rasmussen-based number.
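A desmoothing correction of this sort is easy to sketch. Everything numeric below except Rasmussen’s August targets (quoted above) is invented: the within-party support shares are the kind of “internals” just mentioned, not real figures.

```python
# Hypothetical within-party support for Obama ("internals"; invented numbers)
support = {"dem": 0.90, "rep": 0.10, "ind": 0.45}

# Rasmussen's smoothed three-month August targets, from the quote above
old_shares = {"dem": 0.406, "rep": 0.316, "ind": 0.278}

# A hypothetical, more recent cross-pollster estimate of party shares
new_shares = {"dem": 0.43, "rep": 0.30, "ind": 0.27}

def topline(shares, support):
    """Overall support = party-share-weighted average of the internals."""
    return sum(shares[g] * support[g] for g in shares)

old = topline(old_shares, support)
new = topline(new_shares, support)
print(f"smoothed topline: {old:.3f}, desmoothed topline: {new:.3f}")
```

The point of the sketch is just that the topline is linear in the party shares, so swapping in fresher shares is a one-line reweighting, not a re-poll.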

The fact that they are making a very strong, very dubious assumption and have results which are strongly significantly different from the average pollster should give Rasmussen pause.

Finally, note how quiet times are good for McCain. In quiet times the averages across pollsters are dominated by the Rasmussen and Gallup tracking polls, both of which are more favorable to McCain than the average poll. Some of the alarm (among Democrats; hope among Republicans) about Obama’s vacation, McCain’s celebrity campaign, etc. is based on this (I don’t know how much).


Petroleum Speculation Thread N+1

I wrote something about crude oil, inventories and contango below, and there were 57 comments, including one from Aaron which referred to this post in the future indicative.

A lot of the discussion is Krugman pro and con (mostly con). I think I will review my take on his arguments here as an introduction.

I had intended to write another post to clarify a particularly hand waving part of my old post (that is to report on a relative clarification in my own thoughts). I don’t think this is the focus of interest, so I will put that after the jump.

Krugman argued that the sharp increase in the price of crude oil was caused by increased demand and the fact that suppliers were already pumping just about all that they could pump (hence an almost vertical supply curve). I assume he thinks that the sharp decline is due to decreased demand. He is convinced that speculation in oil futures has not had a large impact on the spot price.

My recollection of his argument is that it was based on 4 claims (modified in part to respond to things I just learned skimming the older thread).

1) Oil is consumed and storage costs are significant. This makes analogies to housing, Nasdaq and tulips inappropriate.

2) Certainly people speculate in oil futures. The question is whether this is currently moving the spot price far from where it would be without speculators.

3) Pricing rules can determine prices, but don’t shift supply and demand curves. If spot prices move up automatically following futures prices, one would expect supply to exceed consumption, that is, growth of inventories.

4) When an almost vertical supply curve meets an almost vertical demand curve, supply and demand can cause prices to move quickly by huge amounts back and forth.

OK, back to my “model”. Just to recall, the model assumes that the oil companies have formed a cartel and that it has become more difficult for them to keep each other in line. The driving force is that low expected excess capacity (to ship and refine oil themselves, or for their suppliers to pump it out of the ground) makes it hard to punish a company which sells petroleum products at a price lower than the secretly agreed markup on the price of oil.

In a model, this would make them impose low inventories of crude oil and gasoline on each other and make them lower the markup, increasing the price of crude and reducing the price of refined products including gasoline, compared to what they would be if they could precommit to their cartel.

The really shaky part is that I then claim that low inventories make them bid against each other more fiercely in the spot market, so that all of the benefit from their reduced markups goes to oil exporters (and maybe then some). This is shaky because I have forgotten the little I knew about the mechanisms of the oil spot market, and it probably isn’t the mechanism which I would need for the argument to make sense and … lots of stuff.

So I have a new way of putting it. Each oil company can’t hold large inventories, as that would give it an incentive to break the cartel (dumping the gasoline before the other companies can retaliate and benefiting from the increase in the price of crude oil). I will just assume that large inventories of crude oil are needed to keep refineries working at full capacity. If the refiners can’t hold as much crude in inventory, their suppliers will hold more. Now the price of crude includes the cost of holding that inventory; essentially, the oil exporters are supplying oil *and* storage. The price is higher than the price of just oil.

Now I do *not* believe that this model has anything to do with the real world. I do not think the oil majors are colluding, and I don’t think they would act like agents in game theory if they were. I don’t think their markups or storage costs are anything like large enough to fit the huge shifts in price. In fact, I agree with Krugman.

My old post was an exercise in economic theory. Fortunately commenters used it as an invitation to talk about the real world.


Petroleum speculation without contango or growing inventories?

As I’m sure AngryBear readers know, Paul Krugman does not believe that the spot price of petroleum shot up due to speculation. His argument is that the only way future expected prices can affect demand for crude or supply of refined products to final consumers is via inventory accumulation, and inventories haven’t increased. Also, he argues that speculation can only affect the spot price if there is contango: that is, a futures price above the spot price.

I was convinced. Now I am not so sure. The recent decline in the price of petroleum makes it a little bit harder for me to believe in a simple supply-and-demand-without-speculation explanation (just a little bit harder, so I won’t argue the semi hemi demi point).

There are many models in which prices do funny things. One set includes customer market models (very implausible if the product is gasoline). Another set is based on implicit collusion. In this case, let’s assume that the oil companies are, in fact, a cartel and enforce cooperation with threats of future retaliation. The subgame-perfect semi folk theorem suggests that a continuum of equilibria is possible in this case, which sure helps if one is trying to fit the data.

I am (as always) thinking as I type. The semi model I have in mind is one in which

1) the oil companies buy crude on a thick double-auction-type spot market with one worldwide price and close to perfect competition.

2) they refine crude and sell refined products (for simplicity assume that the only product is gasoline) subject to a capacity limit and, of course, demand.

3) they agree on a markup on marginal cost. Firms which sell gasoline at a lower markup are punished in the future. They choose the highest sustainable markup.

4) they agree on the highest sustainable markup and have rules restricting inventory accumulation and forward purchases of oil (that is key).

The cartel will drive the price of gasoline up and the price of crude down. The extent to which it can do this depends on the costs and benefits of deviation from the agreement, that is, the gains to a firm of suddenly selling gasoline cheaper than agreed given the spot price of crude, and the costs to that firm of the worst subgame-perfect punishment strategy available to the cartel.

It is very important to my story that the firms agree on a markup (let’s say a price of gasoline as a function of the price of crude) and NOT on prices for gasoline or crude or quantities of crude bought or gasoline sold.

Update: This is my second try. My first try was mathematically wrong.

The key issues are gains and costs from deviation from the agreed markup.

I will assume that following deviation, firms switch to the non-collusive oligopoly solution (make it a Cournot oligopoly) with a lower price of gasoline and a higher price of crude.

If it took a long time for gasoline stations to change their posted price, an oil company with a chain of gas stations could … well this is silly.

There are gains to deviation from the agreement if the other oil companies in the cartel have limited inventories of gasoline and either limited inventories of crude or limited refining capacity, limiting their ability to increase their supply of gasoline. I assume that spare refining capacity is key to enforcement. The idea is that shipping, refining and distributing take a while, so firms don’t do it on the sly. Spare refining capacity is key to the punishment phase but not to the deviation phase. Limited spare refining capacity implies a low markup. In particular, anticipated future refining capacity is key (the punishment phase lasts a long time), so expectations of future demand are key.

The cost of deviation is that all firms in the future use all of their refineries at capacity (and maybe build more).

The benefit is dumping undesired inventories of gasoline on the market and an increase in the price of crude oil which is valuable if the company owns crude oil in the ground or in tanks or has bought crude oil futures beyond their needs for crude oil to refine.

A tight incentive compatibility constraint due to limited refining capacity implies a low markup and tight restrictions on inventories (of both gasoline and crude oil) and on futures positions.

A low markup implies a high price of crude oil. Also, low inventories (required to maintain collusion) and limited refining capacity imply a high price of crude. It is important that the collusive agreement allows firms complete freedom on the spot crude oil market. The idea is that the price jumps around so much that any collusive agreement would be way too complicated to work tacitly.

So low spare refining capacity implies low allowed inventories and futures purchases which implies fierce competition for crude oil (if a firm bids low it will have trouble keeping its refineries working and can’t make up later as it has limited spare capacity).

All is driven by forecasts of future demand which can bounce around as much as GNP forecasts.


Wang and Silver on electoral projections

Sam Wang explains why he reports a 99% probability of an Obama win while fivethirtyeight has only a 62.4% probability.

I learned a lot from his post due to my incredible ignorance. I go to fivethirtyeight.com often enough that Firefox proposes it first when I type www, but I had never bothered to read the description of the method used to calculate the probabilities.

In case others are as lazy as me (unlikely) or have lives (likely), I will describe my ignorance below, after discussing issues of interest to the non-pathetically-ignorant.

Today I’d like to outline the basic contrasts between this calculation and a popular resource, fivethirtyeight.com. That site, run by Nate Silver, a sabermetrician, is a good compendium of information and commentary. However, both our goals and methods differ on several key points. The biggest difference is that this site provides a current snapshot of where polls are today, while he attempts a prediction. His approach also has a conceptual problem…

I think the conceptual problem is that Silver calculated probabilities from 10,000 simulations and Wang uses an analytic formula.

Silver’s approach is to carry out thousands of simulations, then tally the simulations. That method reflects the fantasy baseball tradition, in which individual outcomes are often of great interest. However, such an approach is intrinsically imprecise because it draws a finite number of times from the distribution of possible outcomes. The Meta-Analysis on this site calculates the probability distribution of all 2.3 quadrillion possible outcomes. This can be done rapidly by calculating the polynomial probability distribution, known to students as Pascal’s Triangle.

Wang claims that Poblano (AKA Nate Silver) should have obtained a normal distribution for electoral college votes. I don’t agree. This is only true if there is no correlation between shifts in support for Obama and McCain in different states. As usual, I argue using an extreme example. Assume no sampling error (each poll is of the whole population) and perfect correlation of changes in support in different states. If this were true, then the ranking of states by Obama minus McCain would not change and there would be only 50 different possible outcomes in the electoral college. That’s not a normal distribution. I think that the argument is valid unless changes in support in different states are independent, which is a very implausible assumption. (Note: young Ezra, who is neither a statistician nor a political scientist, made this argument before I did.)
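For what it’s worth, the exact calculation Wang describes is short: treating states as independent, the electoral-vote distribution is the product of one small polynomial per state, computed by convolution. A hypothetical sketch with made-up states and win probabilities, which bakes in exactly the independence assumption questioned above:

```python
import numpy as np

# States: (electoral votes, P(candidate wins the state)) -- all invented.
states = [(55, 0.95), (34, 0.10), (27, 0.45), (21, 0.60),
          (20, 0.80), (15, 0.30), (13, 0.55), (11, 0.50)]

dist = np.array([1.0])              # P(0 EV before any state is counted) = 1
for ev, p in states:
    poly = np.zeros(ev + 1)
    poly[0] = 1.0 - p               # lose the state: gain 0 EV
    poly[ev] = p                    # win the state: gain `ev` EV
    dist = np.convolve(dist, poly)  # polynomial multiplication

total_ev = sum(ev for ev, _ in states)
# dist[k] is now the exact probability of winning exactly k electoral votes
p_win = dist[total_ev // 2 + 1:].sum()  # more than half the EVs
print(f"exact P(win) = {p_win:.4f}")
```

No simulation error at all; but under perfect correlation across states the same distribution collapses to a handful of possible outcomes, which is the objection above.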

Now Wang also argues that 10,000 simulations aren’t enough. I agree. I recently calculated something using 1,000,000 simulations for each of several different parameters (actually just 2 sample sizes). This was a distribution which I think I derived analytically. The millions of simulations were to check my reasoning, my algebra and, especially, my typing when writing the program which calculates the analytically exact distribution (the fact that I fail to reject the null that it is accurate with 1,000,000 simulations convinces me that I typed write for wunce).

The convention that simulations are repeated 10,000 times is a historical artifact of the slow-PC age. I would like to ask Silver how long his computer takes to simulate. I would guess that his simulations are quicker than some of mine, for which waiting for 1,000,000 runs was barely a nuisance.
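The cost of stopping at 10,000 draws is easy to quantify. The Monte Carlo standard error of an estimated probability p from N independent simulations is sqrt(p(1-p)/N); the 0.67 below stands in for Silver's reported win probability at the time.

```python
# How precisely N simulations pin down an estimated probability.
import math

def mc_standard_error(p, n):
    """Standard error of a simulated probability estimate."""
    return math.sqrt(p * (1 - p) / n)

p = 0.67
for n in (10_000, 1_000_000):
    print(f"{n:>9} draws: +/- {mc_standard_error(p, n):.4f}")
```

So 10,000 draws leave roughly half a percentage point of pure simulation noise on the headline probability, while a million draws shrink that tenfold — and Wang's analytic convolution eliminates it entirely.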

Just to go back to my other obsession: I blame Microsoft. I don’t think people fully realise just how much faster cheap PCs have become, because microsoftware is designed to run intolerably slowly on any but the latest generation of computers. So computers take as long as ever to boot up, open a Word file, open Excel, and all that stuff, even though they now do 1,000,000 simulations in the time they used to take to do 10,000.

I’d guess that 1,000,000 simulations won’t change Poblano’s calculated probability much and I would bet that he does them and reports the result.

OK what I should have known already.

I knew that Poblano (AKA Nate Silver) used old polls as well as the latest polls. His success during the primaries shows that true shifts in opinion were of limited importance. I did not know that he used a weighted average with weights decreasing exponentially, falling by half after 30 days (weight = 0.5^(age/30 days) × (other stuff)), and that in past elections this calculation predicted better than others he tried. I see that just as I finally read the old FAQ (the link above), Silver wrote a new one (still doing 10,000 simulations).
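The exponential down-weighting is simple to implement: a poll's weight halves with every 30 days of age, before the "other stuff" (sample size, pollster rating) multiplies in. The poll numbers below are invented for illustration.

```python
# Exponentially decaying poll weights: half-life of 30 days.

def weighted_average(polls):
    """polls: list of (candidate_share, age_in_days) pairs."""
    pairs = [(share, 0.5 ** (age / 30)) for share, age in polls]
    total_weight = sum(w for _, w in pairs)
    return sum(share * w for share, w in pairs) / total_weight

# A fresh poll, a month-old poll, and a two-month-old poll.
polls = [(52.0, 0), (48.0, 30), (50.0, 60)]
print(weighted_average(polls))  # pulled toward the fresh 52, not a flat 50
```

With weights 1, 0.5 and 0.25, the average here is 88.5/1.75 ≈ 50.57: old polls still count, but a recent move dominates.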

Worse I didn’t even know that he considers the correlation of future changes in support for different candidates in different states.

It can reasonably be argued that I’m essentially double-counting the amount of variance by accounting for both state-specific and national movement. That is, some of the error in state-by-state polls is because of national movement, rather than anything specific within that state. However, I have chosen to account fully for both sources of error, because (i) this is the more conservative assumption, and (ii) I suspect that 2004, where voters divided into Bush and Kerry camps early, was inherently a more stable sort of election than 2008 is likely to be.

I had assumed that he did something like what Wang did, so the 67% was a snapshot not a forecast. I am pleased and reassured.


What’s dumb about a windfall profits tax?

The always smart Kevin Drum writes

The windfall profits tax is a dumb idea, and I wish Obama didn’t support it, but I guess politics is politics. It’s not the biggest deal in the world.

I asked in comments what’s so dumb about a windfall profits tax. I haven’t checked how many commenters responded to me, but some which I’ve found are after the jump.

My thoughts on a windfall profits tax on Oil companies (I consider an additional rebate a separate issue).

I start with the simplest assumptions: the tax won’t affect incentives, because it applies to past profits, and it is assumed to be a one-off move (starting simple). Also assume the very old theory of the firm, which acts in shareholders’ interest and is not liquidity constrained. In that case, the tax is a tax on oil company shareholders, who are (even including people with 401(k)s) relatively rich. So I like it.

It seems, see below, that oil companies are passing their profits to shareholders through share buybacks. I think this supports the old-theory-of-the-firm assumption. However, if they insisted on reinvesting profits, I would consider that an additional reason for the windfall profits tax. The last time they had a windfall (the second oil shock, due to the Iranian revolution and the Iran-Iraq war), they decided to diversify and made some of the least productive investments in US history. Have you ever heard of Exxon Office Systems? A big investment in smart typewriters which were like PCs but dumber and ten times as expensive. Oil companies handle oil. They are not suited to act as investment bankers and, still less, as venture capitalists. Generally, high profits are a sign of skillful management which maybe can improve firms they take over. In this case, it was dumb luck. I’d say stock buybacks are the lesser waste, but reducing the deficit would be much nicer.

Now what if oil companies assume that this is the first in a series? Often confiscation now and never again would be good policy if the promise were credible. But the belief that there will be more windfall profits taxes in the future seems to me to be desirable. It means that oil companies don’t gain as much when the price of oil goes up. Since they are imperfectly competitive, it seems to me that this would drive their actions towards the social optimum. For example, if they aren’t helped when oil prices go up, they won’t oppose a carbon tax so fiercely. Also, actual investment in alternative fuels now has the cost that production of alternative fuels causes the price of oil to decline. The threat (or promise) of further windfall profits taxes would increase their incentives to invest in alternative fuels. If designed rationally (see below), it would also increase their incentive to look for oil (which isn’t so key if there really isn’t much left to be found).

I’d propose making a forecast of oil holdings (inventories including proven reserves underground, crude in tankers and tanks, and unsold petroleum products) and taxing each oil company alpha times the price times this quantity (with alpha positive but less than one). If the forecasts were exact, this would make the oil companies act as if they were perfectly competitive. They won’t be exact, but so long as they are not so optimistic as to drive a company bankrupt, the cost of forecast errors will be a transfer plus something second order in the forecast errors. Having no such policy is like making a forecast of zero, which is worse even than my forecasts.
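The key design feature of this proposal is that the tax base is the *forecast* of holdings, not actual production or profits, so the marginal payoff from pumping and selling one more barrel is untouched. A toy calculation makes that concrete; all numbers (price, costs, alpha) are invented.

```python
# Toy version of the proposed tax: alpha * price * forecast_holdings.
# Because the taxed quantity is a pre-set forecast, the tax acts like a
# lump sum with respect to the company's output decision.

def after_tax_profit(price, barrels_sold, cost, alpha, forecast_holdings):
    revenue = price * barrels_sold
    tax = alpha * price * forecast_holdings  # does not depend on barrels_sold
    return revenue - cost - tax

base = after_tax_profit(price=100, barrels_sold=1_000, cost=50_000,
                        alpha=0.3, forecast_holdings=1_000)
extra = after_tax_profit(price=100, barrels_sold=1_001, cost=50_000,
                         alpha=0.3, forecast_holdings=1_000)
print(extra - base)  # exactly 100, the full price of the marginal barrel
```

At the same time, a $1 rise in the price raises the tax bill by alpha times forecast holdings, so the company keeps only a fraction (1 − alpha) of the windfall from a price increase — which is the mechanism pushing it toward price-taking behavior.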

OK at least one flame war from Drum’s site

The problem with a windfall profits tax is that it is the kind of ad hoc solution that Obama criticized when it took the form of a gas tax holiday. It may be a more effective gimmick, but it’s still a gimmick: pure pander.
Posted by: lampwick on August 4, 2008 at 12:21 PM | PERMALINK

Huh? More like the opposite. Given that oil refining and pumping are close to capacity, a gas tax holiday is similar to a windfall profits subsidy, not a tax. Both are ad hoc, but they have opposite effects. I would consider an argument that there is something wrong with a windfall profits tax to be an argument that something bad will happen if one is imposed.

Big Oil’s profits by percentage are far less than Google’s, for example. Google’s profit is around 25%. Now there’s a windfall. Why aren’t we taxing them extra heavy? No way they should be making that much money when there are starving people out there.

Posted by: SJRSM on August 4, 2008 at 12:23 PM | PERMALINK

This is interesting. First, many of my arguments apply to an increase in price and not to an increase in production (searches, in Google’s case, I guess). Second, the windfall was a windfall (the behavior of inventories suggests that oil company executives didn’t even see it coming as, say, Kevin Drum did). The Google guys have demonstrated an ability to create wealth. I really like the idea of Google venture capital. They got money by being very smart. I suspect that they will do smart things with it and won’t capture all of the wealth they create.

The only thing I could support was a windfall profits tax where the companies get to deduct from their payment every dollar they spend on renewables.

But as a general rule, resentment and vengeance are not good foundations for tax policy.

Posted by: lampwick on August 4, 2008 at 12:41 PM | PERMALINK

I’m not sure I want oil companies working on renewables. Why do we think they are better suited to manage that than other firms (I love what Shell is doing by the way) ? I am more pro market than lampwick so I think incentives to invest in renewables should be those implied by a carbon tax and not directed at any particular firms.

Remember the last time we based tax policy on resentment? (Hint: did a plurality of Americans tell a Clinton pollster that they supported increased taxes on the rich to fund waste, fraud and abuse?) I’d say that the record of US tax policy based on resentment is, as a general rule, excellent.

There already is a tax on that “windfall.” It’s called the corporate income tax, and Big Oil is currently paying Uncle Sam to the tune of billions as their profits skyrocket. The proposal to effectively increase the corporate tax rate on a specific industry is unwise because consumers will pay for the tax increase in the form of higher fuel bills (as energy companies reduce exploration and development budgets). They’ll also “pay” to the extent that energy stocks are found in their portfolios.

Posted by: Jasper on August 4, 2008 at 12:47 PM | PERMALINK

A windfall profits tax isn’t just an increased tax on profits. It depends on the price times inventories, not production minus costs. My proposal would, if anything, increase the incentives for oil companies to explore and develop oil fields. Someone in a thread below pointed out that average != marginal. I think Jasper really doesn’t understand that, but it could be that he assumes that the windfall part will be a fraud and will be perceived to be a fraud.

Yes, the windfall tax is a bad idea, and the energy rebate is even worse. We’re at a sad state when we count on stipends from the government to help spur the economy. What we need are policy changes that will impact how people buy and use energy in the long-term, not a panacea fix.

Posted by: MeLoseBrain? on August 4, 2008 at 2:16 PM | PERMALINK

Yes we must make the best the enemy of the good. Plus Keynesianism unbuilds character.

Someone notes that Becker and/or Posner claim that the windfall profits tax was not good policy. Support for their assertion is missing. But the great part is this:

.. “I was careful with my reference not to use a right wing source for the argument. Unless Becker-Posner is a right wing source.”

The scary thing is that I don’t think the commenter was joking.

I’m not done with the thread but I’m sure I’ve exhausted your patience.


The Company You Keep

Ah something which I actually know something about.

Dana Goldstein has found an amazing fact which, if middle-class parents believed it was representative, would make the world (and in particular the USA) a much better place.

So were the litigious Fairfax parents correct to freak out about South Lakes? Let’s look at the numbers.

At South Lakes High, 46 percent of students are white, 20 percent are black, 16 percent are Hispanic, and 11 percent are Asian. One-third of the school’s population qualifies for free or reduced-price lunch. In other words, this is both a racially and socioeconomically diverse school. How does this affect the most academically talented/privileged proportion of the student body? Well, more than half of white kids and almost half of Asian kids participate in the IB program, as do about 20 percent of blacks and Hispanics. An overwhelming majority of all the students enrolled in IB score a 4 or better, indicating excellent instruction and achievement. As for the SAT, the average combined score for white kids at South Lakes is 1730 out of 2400.

Now let’s look at Oakton High School, which affluent parents sued to get their kids into. Oakton is 67 percent white and only 11 percent black and Hispanic. Less than 9 percent of students there qualify for free or reduced-price lunch. Oakton has an AP program in which white students are just as successful as their similar white peers at South Lakes are on the IB exams; of the black students participating in AP though, less than half scored three or higher. Tellingly, on the SAT, Oakton’s white kids score 1734, essentially the exact same score as white students at South Lakes.

My point: The educational outcomes of privileged kids are remarkably similar across schools with similar curricula, while it is the least advantaged students who show more differentials. When parents are considering where to send their kids to school, they should look at the relevant numbers.

via Ezra Klein.

Sad to say, the whole world isn’t Fairfax County (well, that would be a very boring world, ethnically diverse but culturally a bit focused on US politics). In most developed countries, the average educational attainment of students’ mothers at a school is positively partially correlated (a positive, significant regression coefficient) with individual student performance (on the PISA test), even if the educational attainment of the student’s own parents is included in the regression.

see this (mainly for the references if you can get to them).


Politically impossible health care cost sharing

This is odd. Matthew Yglesias just excerpted a bit of a post at my personal blog, which I didn’t think was up to AngryBear standards. Far be it from me to question the judgment of a man who recently reached half my age. Here is the bit he liked:

One politically unfeasible approach to this would be to assign people randomly to HMOs and pay the HMOs based on their enrollees’ health, but have the HMOs pay for their health care. Then the HMO decides incentives. You have to decide how much a life is worth (and eyesight and all that), but it doesn’t depend on individual income, and the decisions are made by an organisation with tons of data.
