Spurious Correlation of the Day

Correlation is not causation, more research and testing is required, etc.

I was working from the concept that home Internet service is a luxury item or, at the very least, non-essential.* In short, that you would tend to give up home Internet access if the choice were between that and staying current on your mortgage.

Looking at the State-level data, though, produced the following regression:

HomeINetAccess = 0.78736*(FICO>660) – 0.1934*(Pct with Current Payment) – 0.1662*(Lying Broker Loans) + 73.36

R-squared = 0.4311 Adj. R-squared = 0.3956**

Fortunately, only FICO>660 (t = 4.10) and the constant (t = 7.77) were clearly significant at the 95% confidence level (Pct with Current Payment t = –1.38, Lying Broker Loans t = –0.95). And it seems intuitively obvious that people with better credit scores are more likely to be able to afford (and demand) home-based Internet access.
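For anyone who wants to replay this sort of finger exercise, here is a minimal sketch of how coefficients, t-statistics, and R-squared fall out of an ordinary least squares fit. Everything in it is an assumption for illustration: the function name `ols_summary`, the variable names, and above all the data, which is randomly generated and is not the State-level data used in this post. A rough rule of thumb is that |t| above about 2 clears the 95% bar at sample sizes like this.

```python
import numpy as np

def ols_summary(X, y):
    """OLS of y on the columns of X plus a constant (hypothetical helper)."""
    n, k = X.shape
    A = np.column_stack([X, np.ones(n)])           # constant goes last
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares coefficients
    resid = y - A @ beta
    dof = n - k - 1                                # residual degrees of freedom
    sigma2 = resid @ resid / dof                   # estimated error variance
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))
    r2 = 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    adj_r2 = 1 - (1 - r2) * (n - 1) / dof
    return {"coef": beta, "t": beta / se, "r2": r2, "adj_r2": adj_r2}

# Synthetic stand-in data (NOT the post's data): one regressor that matters,
# one that does not.
rng = np.random.default_rng(0)
n = 52                                             # assumed sample size
fico = rng.uniform(50, 80, n)                      # pct of loans with FICO > 660
current = rng.uniform(85, 98, n)                   # pct with current payment
access = 0.6 * fico + 30 + rng.normal(0, 4, n)     # "current" has no true effect

res = ols_summary(np.column_stack([fico, current]), access)
for name, b, t in zip(["FICO>660", "PctCurrent", "const"], res["coef"], res["t"]):
    print(f"{name:>10}: coef = {b:8.4f}, t = {t:6.2f}")
print(f"R-squared = {res['r2']:.4f}, Adj. R-squared = {res['adj_r2']:.4f}")
```

The t-statistic is just the coefficient divided by its standard error, so a variable can carry a sizable coefficient and still be indistinguishable from zero if that coefficient is estimated loosely.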

Removing “Lying Broker Loans,” strangely, didn’t change the sign on Pct with Current Payment, though it did shrink both remaining coefficients and lower the constant.

HomeINetAccess = 0.65055*(FICO>660) – 0.1159*(Pct with Current Payment) + 67.42

R-squared = 0.4204  Adj. R-squared = 0.3967
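Note that dropping a variable lowered R-squared (0.4311 to 0.4204) while nudging the adjusted figure up (0.3956 to 0.3967). That is the degrees-of-freedom penalty at work. A small sketch of the standard adjustment formula follows; the observation count `N_OBS = 52` is purely an assumption, since the number of observations is not stated anywhere above.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n observations and k regressors (constant excluded)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

N_OBS = 52  # assumption for illustration; not given in the post

three_var = adjusted_r2(0.4311, N_OBS, 3)  # FICO, current payment, lying broker
two_var = adjusted_r2(0.4204, N_OBS, 2)    # FICO, current payment

print(f"three regressors: {three_var:.4f}")
print(f"two regressors:   {two_var:.4f}")
```

Under that assumed sample size, both values land within about 0.0001 of the reported adjusted figures, and the two-variable model comes out slightly ahead once the penalty for the extra regressor is removed.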

Fortunately, Pct with Current Payment remains an insignificant variable (t = –1.02); indeed, it is even further from significance than before.

Curiously, there is one random regression that does appear significant.

HomeINetAccess = 0.43854*(FICO>660) – 0.3099*(Mortgage Originated in 2005 or before) + 80.89

R-squared = 0.429; Adj. R-squared = 0.4057

Here, both variables and the constant appear significant (t = 3.5, –3.81, and 14.16, respectively). So we need a story to explain the negative sign, especially since running the same regression against the “Originated in 2006” or “Originated in 2007” values produces a larger R-squared and results with the intuitively correct sign:

HomeINetAccess = 0.40822*(FICO>660) + 0.5340*(Mortgage Originated in 2006) + 47.62

R-squared = 0.5217; Adj. R-squared = 0.5022; t(FICO>660) = 2.98  t(2006) = 3.41

HomeINetAccess = 0.53598*(FICO>660) + 0.5476*(Mortgage Originated in 2007) + 55.02397

R-squared = 0.5304; Adj. R-squared = 0.5112; t(FICO>660) = 4.63 t(2007) = 3.57

So people who bought at the peak of the bubble, or even when the bubble was beginning to break, are more likely to have Home Internet access than those who have been living in their house for a longer period of time.  Indeed, having lived in your house for a longer period of time correlates negatively, on a State level, with having Home Internet access.

Were we to speculate, we might guess that people who have been living in their homes longer did not have Internet access easily available and affordable when they bought their home, and have not decided to add it now.  (This would imply either that there are major transaction costs associated with gaining Internet access or that the people who bought in the pre-2006 environment are resource-constrained in other ways.)

As a reasonable speculation, people who bought in 2006 and 2007, arguably the top of the market, had (or believed they had) less price sensitivity than those who bought while the bubble was inflating. This might suggest that the people who were buying in 2006 were more likely to be “trading up” than buying for the first time. There is anecdotal evidence to that effect. Looking at the graphic of U.S. home ownership percentage:


it appears that by 2006, the market consisted more of homeowners and speculators than it did new buyers, but the data I’m using does not have the granularity either to accept or reject that hypothesis.***

In any event, further research appears to be needed—or, maybe, this is just the Spurious Correlation of the Day.

*Jim Henley—and any other parent whose daughter is a Club Penguin devotee (for instance, me)—might disagree.

**Those not in the social sciences will look at these R-squared values and wonder if there is anything being presented.  40% is, I am told, a very good result.  Indeed, since the entirety of Real Business Cycle theory is hung on an R-squared close to 0.50, certainly a finger exercise with a result that is only 80% of that would be, if not earth-shattering, then at least publishable.

***Suggestions for sources that might indicate whether buyers were speculators, e.g., state-level data indicating whether a property was purchased as a primary residence or as a second (“vacation”) home, are welcome in comments or via e-mail.