I had an informal discussion with a manager in an MBS IT area last month. Just a general conversation about the field and the data people check. He mentioned FICO scores and I noted that I’m not fond of using them to evaluate a mortgage, especially for first-time homebuyers.

Part of this is simple: it’s relatively easier—even in the densely-populated metropolitan areas (e.g., NYC, SF), and certainly in sub- and exurban areas—to maintain a good credit rating if you don’t own a residence. No property taxes, no major repairs, no appliance replacement, no general maintenance, no landscaping, no snow shoveling. And it’s very easy, especially the first time, to underestimate just how much those expenses will be. Looking at just the cost of commuting, renting, storage, parking, etc. makes homeownership appear to be a better economic decision than it is.*

Well, the Federal Reserve Bank of New York recently released some data on mortgage payments by type. It’s not directly comparable—the subprime and Alt-A loans have a more granular level of data, most especially with respect to late and current payments—but there are some interesting relationships.

I looked at the data for states where the subprime loans are current for either (1) more than 55% of the borrowers or (2) fewer than 45% of the borrowers. That covers 24 states and the District of Columbia. The overall breakdown was 16 states in the first group and eight states and the District of Columbia in the second.
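
To make the cutoffs concrete, here is a minimal sketch of the bucketing. The function name and the two "Hypothetical" entries are mine, invented for illustration; only the New York and Ohio figures come from this post (see the second footnote), and none of this is the actual NY Fed file.

```python
def classify_state(pct_current: float) -> str:
    """Bucket a state by the share (in %) of its subprime loans that are current."""
    if pct_current > 55.0:
        return "so far, so good"   # first group: more than 55% current
    if pct_current < 45.0:
        return "more delinquent"   # second group: fewer than 45% current
    return "middle"                # excluded from the first regression


# New York and Ohio use the percentages quoted in this post.
# The other two entries are made-up placeholders, not real data.
sample = {
    "New York": 46.8,
    "Ohio": 52.0,
    "Hypothetical A": 58.3,
    "Hypothetical B": 41.2,
}

groups = {name: classify_state(pct) for name, pct in sample.items()}
for name, group in groups.items():
    print(f"{name}: {group}")
```

Only the states landing in the two outer buckets go into the first regression; the middle ones come back in when the full data set is used.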

Of the six states that have more than 100,000 subprime loans outstanding, three—Illinois, Florida, and California—are in the More Delinquent category, while only one (Texas) is in the “so far, so good” realm.**

So I ran a regression on those states and the District, using as factors the percent of the subprime loans that were not Owner-Occupied, the Average FICO score for the state, the percent of subprime loans issued to borrowers with a FICO below 600, and the percent of subprime loans issued to borrowers with a FICO score above 660. The result was

PctwithCurrPymt = –1.18*(FICO>660) + .292*(FICO<600) + .266*(Average FICO Score) –0.9*(Pct Not Owner-Occupied) –93.66

R-squared = 0.4213 (Adjusted R-squared= .3056) F = 3.64 (Prob > F = 0.0220)
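
For anyone who wants to check the summary statistics, the adjusted R-squared and the F statistic both follow from R-squared, the number of observations, and the number of regressors. The sketch below assumes 25 observations (24 states plus the District) and an ordinary least squares fit with an intercept and four regressors; those assumptions are mine, inferred from the description above.

```python
def adjusted_r2(r2: float, n_obs: int, k_regressors: int) -> float:
    """Adjusted R-squared for an OLS fit with an intercept."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - k_regressors - 1)


def f_statistic(r2: float, n_obs: int, k_regressors: int) -> float:
    """Overall F statistic: explained variance per regressor over residual variance."""
    return (r2 / k_regressors) / ((1.0 - r2) / (n_obs - k_regressors - 1))


N_OBS = 25        # assumption: 24 states + the District of Columbia
K_REGRESSORS = 4  # FICO>660, FICO<600, Average FICO, Pct Not Owner-Occupied
R2 = 0.4213       # as reported above

print(round(adjusted_r2(R2, N_OBS, K_REGRESSORS), 4))  # 0.3056
print(round(f_statistic(R2, N_OBS, K_REGRESSORS), 2))  # 3.64
```

Both numbers agree with the reported output, which is at least consistent with the 25-observation reading.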

However, none of the coefficients passes the t-test.

If we assume that there is a solid distinction between a FICO score below 600 and one above 660, then we must note that the signs of this regression are *precisely the opposite* of what we should expect. The more loans with an initial FICO score above 660, the fewer the number of households that are expected to be current in their payment. Conversely, the more households with a FICO score below 600, the better the Current Payment Performance should be expected to be.
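
A quick way to see the sign problem is to plug numbers into the fitted equation and nudge one input at a time. The baseline state below is entirely hypothetical, and I am assuming the share variables are measured in percentage points; only the coefficients and the intercept come from the regression above.

```python
# Coefficients and intercept as reported for the first regression.
COEFS = {
    "fico_gt_660": -1.18,        # % of subprime loans with FICO above 660
    "fico_lt_600": 0.292,        # % of subprime loans with FICO below 600
    "avg_fico": 0.266,           # average FICO score
    "pct_not_owner_occ": -0.9,   # % of subprime loans not owner-occupied
}
INTERCEPT = -93.66


def predict_pct_current(inputs: dict) -> float:
    """Predicted percent of subprime loans that are current."""
    return INTERCEPT + sum(COEFS[name] * value for name, value in inputs.items())


# Hypothetical baseline state, invented for illustration only.
baseline = {
    "fico_gt_660": 20.0,
    "fico_lt_600": 40.0,
    "avg_fico": 620.0,
    "pct_not_owner_occ": 10.0,
}

more_high_fico = dict(baseline, fico_gt_660=baseline["fico_gt_660"] + 1.0)
more_low_fico = dict(baseline, fico_lt_600=baseline["fico_lt_600"] + 1.0)

base_pred = predict_pct_current(baseline)
delta_high = predict_pct_current(more_high_fico) - base_pred
delta_low = predict_pct_current(more_low_fico) - base_pred

print(f"baseline prediction: {base_pred:.2f}")
print(f"one more point of FICO>660 loans: {delta_high:+.3f}")
print(f"one more point of FICO<600 loans: {delta_low:+.3f}")
```

One more point of high-FICO loans lowers the predicted share of current borrowers, and one more point of low-FICO loans raises it, which is the reverse of what a FICO-based model would assume.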

This would seem to be a Very Bad Regression, both methodologically (it takes two separate sets of data and treats them as if they are part of the same set) and intuitively (it produces results that are not compatible with rational assumptions), but that may not be so.

California, for instance, has the third-highest percentage of Owner-Occupied Properties, the highest Average FICO Score, the lowest percentage of subprime loans to borrowers with FICO scores below 600 and the highest percentage of subprime loans to borrowers with a FICO score above 660. But it falls into the group where fewer than 45.0% of the borrowers are current.***

Which means that, were you to use FICO scores as an input to your model for buying Whole Loans to securitize, you would likely have bought more currently-dicey CA paper than not.

But, as noted, we may believe this to be a Very Bad Regression. The greatest likelihood is that there is/are (an) excluded variable(s) in the equation. If we consider the entire set of data, this becomes clearer. The regression equation for all of the states and the District of Columbia is:

PctwithCurrPymt = **–1.019*(FICO>660)** **+ .6118*(FICO<600)** + .7685*(Average FICO Score) –0.38*(Pct Not Owner-Occupied) –422.80

R-squared = 0.1471 (Adjusted R-squared= .0730) F = 1.98 (Prob > F = 0.1128)

The signs remain consistent—and counterintuitive—but there is a much lower explanatory power and it is much more likely that the regression fails the F-test. And again, none of the coefficients passes a t-test.

Adding variables whose signs are more likely to produce indeterminate results (the Average Age and the Average Interest Rate of the Loans) corrects the two original sign problems, but produces a third (and possibly a fourth):

PctwithCurrPymt = 1.375546*(FICO>660) –1.639*(FICO<600) **– 1.3423*(Average FICO Score)** –0.223*(Pct Not Owner-Occupied) **+ 16.5340 AvgInterestRate** + 0.2632 AvgLoanAge + 775.9700

R-squared = 0.4661 (Adjusted R-squared= .3991) F = 6.40 (Prob > F = 0.0001)

The additional variables have significantly raised the explanatory power of the model, and we now see that the FICO scores point in the intuitive directions. But the Average FICO score has ceased to be a positive contributor to the model, and the Average Interest Rate—the only variable that passes a t-test for significance—indicates that the higher the rate, the higher the likelihood of payment.

So we are left suspecting that the initial FICO score does not significantly affect the ability of the borrower to keep their loan payment(s) current. This also seems intuitive, since a FICO score is a stock variable, while mortgage payments are flow variables.

But, as with credit ratings, good FICO scores can only go downward. And it is very rare, especially in an environment in which there is downward pressure on wages, for a good FICO score to go upward. Indeed, dropping the positive FICO score (the FICO>660 share) and the Average FICO score as variables makes for a better regression:

PctwithCurrPymt = –0.663*(FICO<600) –0.238*(Pct Not Owner-Occupied) + 15.4976 AvgInterestRate + 0.1469 AvgLoanAge – 47.325

R-squared = 0.4466 (Adjusted R-squared= .3985) F = 9.28 (Prob > F = 0.0000)

While the Average Interest Rate still has a counterintuitive sign, we should note that the Averages range from 6.69 to 8.66%—even the high end is neither an overwhelming burden for subprime borrowers nor a level from which it is likely to have been worth refinancing. Additionally, while AvgInterestRate remains the only coefficient that completely passes a t-test, both FICO<600 (-3.17) and the constant (-2.54) are negative for all values within a 95% confidence interval.
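
To get a feel for how much weight that coefficient carries, here is a small back-of-the-envelope sketch. It takes the AvgInterestRate coefficient from the four-variable regression above and the reported range of state averages, and it assumes everything else is held fixed, which no real pair of states would satisfy.

```python
RATE_COEF = 15.4976                # AvgInterestRate coefficient, four-variable regression
LOW_RATE, HIGH_RATE = 6.69, 8.66   # reported range of state average rates, in percent

rate_spread = HIGH_RATE - LOW_RATE
implied_gap = RATE_COEF * rate_spread  # percentage points of loans current

print(f"rate spread: {rate_spread:.2f} points")
print(f"implied gap in percent current: {implied_gap:.2f} points")  # 30.53
```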

Dropping Non-Owner-Occupied from the equation sharpens matters even more:

PctwithCurrPymt = –0.6747*(FICO<600) + 15.7738 AvgInterestRate + 0.1400 AvgLoanAge – 50.7117

R-squared = 0.4407 (Adjusted R-squared= .4050) F = 12.34 (Prob > F = 0.0000)

The t-values for FICO<600 (-3.25) and the constant (-2.83) are now both significant at better than the 99% level and, again, the values are negative for the entirety of a 95% confidence interval.
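
The confidence-interval claim can be reconstructed from the reported coefficients and t-values, since the standard error is just the coefficient divided by its t-value. The sketch below assumes 51 observations (so 47 degrees of freedom) and uses 2.01 as an approximate two-sided 95% critical value; both are my assumptions, not output from the regression package.

```python
T_CRIT_95 = 2.01  # assumption: approximate two-sided 95% critical t for about 47 d.f.


def conf_interval(coef: float, t_value: float, t_crit: float = T_CRIT_95):
    """Rebuild a confidence interval from a coefficient and its reported t-value."""
    std_err = abs(coef / t_value)  # t = coef / std_err
    return coef - t_crit * std_err, coef + t_crit * std_err


# Coefficients from the three-variable regression, t-values as reported.
fico_lo, fico_hi = conf_interval(-0.6747, -3.25)
const_lo, const_hi = conf_interval(-50.7117, -2.83)

print(f"FICO<600: ({fico_lo:.3f}, {fico_hi:.3f})")
print(f"constant: ({const_lo:.2f}, {const_hi:.2f})")
print("both intervals entirely negative:", fico_hi < 0 and const_hi < 0)
```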

In summary, the use of FICO scores as a predictor of mortgage repayments appears to be questionable at best, for the same reason that “junk” bonds tended to outperform high-grade securities on a risk-adjusted basis: it is much easier for a rating to decline than it is for it to improve. The value of a FICO score as a predictor of loan performance appears to be much more for lower scores than it is for higher ones. Whether there is greater value on a risk-adjusted basis, as there legendarily has been for corporate bonds, is left for further, more detailed research.

*None of which is to suggest that the non-economic reasons aren’t valid. But credit scores deal with how you manage credit, and how you manage credit has to do with the options you have as much as the choices you make. Homeowners have fewer options on the allocation of funds to lodging than renters do.

**New York State and Ohio are in the middle range, with 46.8% and 52.0% current, respectively.

***Only Hawaii had tighter FICO standards than California, and it has the second-highest (worst) level of non-Owner-Occupied subprime loans (and the worst of any area with more than 10,000 subprime loans outstanding), while California is fifth-best (lowest) in that metric.