Candidate-Specific Response Bias in Polls

Low response rates are a problem for pollsters. The worst form of the problem is candidate-specific response bias, in which supporters of one candidate are more likely to respond than supporters of another. This can make polls worthless. It is related to the other very hard problem of predicting who will actually vote.

I am thinking of something a friend told me about 2012. Obama’s support in the polls dropped dramatically after the first debate (and this is clearer in the campaign’s gigantic-sample polls). My friend had talked to one of Obama’s pollsters, and she said that they concluded this was due to changing response rates to the poll: people who supported Obama didn’t want to talk about politics.

I have been thinking about how to determine this using the rich data from voter registries which the campaign had. I had a thought. I think it might be possible to get useful information about response rates and bias by polling as well as possible and also polling badly. It is hard to make the problem smaller, but easy to make it larger.

The idea is that, with comparatively weak assumptions, one can figure out response bias by looking at how much larger it is with the worse interview approach. So do both human direct-dial calls and robopolls (machine calls), and compare response rates as a function of observed characteristics *and* support for candidates among respondents. Or use human callers throughout and have them either say thank you and hang up at the first resistance, or read a brief script asking people who say they are busy to please participate in a special super-brief one-minute poll.
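To make the comparison concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration: the function names, the 50% true support, and the response probabilities are all made up, and it ignores observed characteristics entirely. It shows only the diagnostic part of the idea: if response does not depend on candidate support, the careful mode and the crude mode should give about the same support share among respondents, while a gap between them suggests candidate-specific response bias. Turning that gap into an estimate of true support would need an additional assumption about how the bias scales between modes, which this sketch does not attempt.

```python
import random


def simulate_mode(n_called, true_support, rate_supporter, rate_opponent, rng):
    """Simulate calling n_called people with one interview mode.

    Returns (response_rate, support_share_among_respondents).
    Assumes at least one person responds.
    """
    respondents = 0
    supporters = 0
    for _ in range(n_called):
        is_supporter = rng.random() < true_support
        # Hypothetical response probability, allowed to differ by candidate support.
        rate = rate_supporter if is_supporter else rate_opponent
        if rng.random() < rate:
            respondents += 1
            supporters += int(is_supporter)
    return respondents / n_called, supporters / respondents


def mode_gap(n_called, true_support, careful_rates, crude_rates, seed=0):
    """Support share in the careful mode minus support share in the crude mode.

    Each *_rates argument is a pair (rate_supporter, rate_opponent).
    """
    rng = random.Random(seed)
    _, careful_share = simulate_mode(
        n_called, true_support, careful_rates[0], careful_rates[1], rng
    )
    _, crude_share = simulate_mode(
        n_called, true_support, crude_rates[0], crude_rates[1], rng
    )
    return careful_share - crude_share


if __name__ == "__main__":
    # Made-up scenario: supporters respond less often in both modes,
    # and the crude mode makes the imbalance proportionally worse.
    biased = mode_gap(200_000, 0.5, (0.30, 0.36), (0.10, 0.16))
    # Made-up scenario: response is unrelated to support in both modes.
    unbiased = mode_gap(200_000, 0.5, (0.33, 0.33), (0.13, 0.13))
    print("gap with candidate-specific bias:", round(biased, 3))
    print("gap without it:", round(unbiased, 3))
```

Note that in the biased scenario both modes understate true support. The gap between them is a warning sign, not a correction.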

Without this, I think they had to make strong assumptions, such as that the disturbances in the participation probit and in the support-for-Obama regression are jointly normal. I might be ignorant, but I think this is true.

I think such different interview scripts (and human vs. machine) might also be useful for predicting who will actually vote. The first election in which it is tried, the results can only be compared with people’s claims that they will certainly vote, but with actual turnout data (votes are secret, but who voted is not) things could be improved.

The point, if any, is that campaigns have huge resources compared to independent pollsters who publish results, and just polling a huge sample is not necessarily the best way to use those resources.