Polls and Reporting IV: CNN and the Perils of Panel Attrition

It is generally agreed that Mitt Romney was generally perceived to have won the first 2012 Presidential Debate.  One datum (two data?) stood out, however: the CNN/ORC instant poll (pdf warning), in which respondents judged Romney the winner 67% to 25%.  This contrasts with a CBS instant poll (video warning), which offered the option of declaring the debate a tie; there 46% declared Romney the winner vs. 22% for Obama.  A 42-point gap is very different from a 24-point gap.  What happened?

Update: see updates here: http://talkingpointsmemo.com/archives/2012/10/were_trying_to_figure_this_out.php

Well, ORC, polling for CNN, did something very interesting and useful, which led to a misleading flash report from CNN.  They conducted a panel study in which the same people answered the same questions twice: once before the debate and once after it.

The comparison of the different responses by the same people is very interesting.  While respondents were disappointed by Obama, his favorable/unfavorable rating changed about as little as a rating rounded to whole percentage points can: from 49/49 to 49/50.  Given the sample size of 430 US adults, this would happen if roughly three to five people with no opinion switched to an unfavorable opinion (assuming no shifts toward favorable).  Romney's favorable rating also barely changed, from 54% to 56% (this would happen if roughly five to ten respondents switched from no opinion to a favorable opinion).
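These back-of-the-envelope figures can be checked by brute force.  A minimal sketch, assuming the published percentages are rounded to the nearest whole point and that both waves cover the same 430 respondents (under those strict rounding assumptions the feasible ranges come out a bit wider than the rough figures above):

```python
# Enumerate raw head counts (out of 430) consistent with a reported rounded
# percentage, then bound how many respondents must have switched.
N = 430

def counts_for(pct):
    """All raw counts that round to the reported whole-point percentage."""
    return [c for c in range(N + 1) if round(100 * c / N) == pct]

def switch_bounds(before_pct, after_pct):
    """Min and max head-count change consistent with the rounded figures."""
    before = counts_for(before_pct)
    after = counts_for(after_pct)
    return min(after) - max(before), max(after) - min(before)

# Obama unfavorable: 49% -> 50%
print(switch_bounds(49, 50))   # (1, 8): as few as 1 switcher, as many as 8
# Romney favorable: 54% -> 56%
print(switch_bounds(54, 56))   # (5, 11): between 5 and 11 switchers
```

The point survives either way: with 430 respondents, a one- or two-point move in a rounded rating can reflect a single-digit number of actual humans changing their answers.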

But wait: *before* the debate Romney had a favorable rating of 54%?  That must be in the very upper tail of all nationwide polls ever.  Why did the poll have such an unrepresentative sample?

Ah, panel data.  Can’t live with it, can’t live without it.  The sample design follows, after the jump:

Survey respondents were first interviewed as part of a random national sample on September 28-October 2, 2012.  In those interviews, respondents indicated they planned to watch tonight’s debate and were willing to be re-interviewed after the debate.

Of course, some people who said they were willing to be re-interviewed didn’t answer the phone, and maybe one or two changed their minds and refused.  But the sample selection is extraordinary.  This matters very little if one sticks to looking at changes from the pre-debate interview to the post-debate interview (as with the favorable/unfavorable ratings, which barely changed at all).  It matters a lot if one mistakes the post-debate sample for a representative sample of US adults (as CNN did when it headlined the who-won-the-debate results).

As noted somewhere by, IIRC, Josh Marshall, the sample is absurd.  This is clear from the results for subsamples: one of the most common results is N/A (not available), because there were so few respondents in the subsample.  CNN/ORC decided not to report results for tiny but not completely empty subsamples, so it isn’t clear just how few there were.
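Suppressing tiny subsamples is standard practice, and for good reason: sampling error blows up as a subsample shrinks.  A rough sketch of the worst-case 95% margin of error, assuming simple random sampling (ORC's actual design and weighting would change the exact numbers):

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Worst-case 95% margin of error for a proportion from n respondents."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (25, 50, 100, 430):
    print(f"n = {n:3d}: +/- {100 * margin_of_error(n):.1f} points")
```

At 50 respondents the margin is already about plus or minus 14 points, which is why a pollster would rather print N/A than a subsample estimate.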

So, for example, among Southern respondents Romney was rated the winner 71% to 22%.  Estimates of opinion in the West, Midwest, and Northeast were all not available, because there were so few respondents.  The most extreme possibility is that there were 12 respondents outside of the South, all of whom thought Obama won.  A plausible explanation of the results is that there were 17 respondents outside of the South who said Obama won and 17 who said Romney won (if my arithmetic is correct, which, I mean, come on, it’s better than picking a random number).
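The extreme scenario can itself be brute-forced.  A sketch, under the assumptions that all the published percentages (67/25 overall, 71/22 in the South) are rounded to the nearest whole point out of the same 430 respondents, and that every non-Southern respondent said Obama won; under those particular assumptions the enumeration pins the non-Southern subsample to roughly 19 to 21 people, the same order of magnitude as the back-of-the-envelope figures above:

```python
# Brute-force check of the extreme scenario: every non-Southern respondent
# says Obama won.  Assumes the published percentages are rounded to the
# nearest whole point and N = 430 respondents in total.
N = 430

def pct(count, n):
    return round(100 * count / n)

feasible = []                           # feasible non-Southern sample sizes
for outside in range(1, N):
    south = N - outside
    # In this scenario the Southern Romney count IS the national Romney count.
    for r in range(south + 1):
        if pct(r, south) != 71 or pct(r, N) != 67:
            continue
        for o in range(south + 1 - r):  # Southern Obama count
            if pct(o, south) == 22 and pct(o + outside, N) == 25:
                feasible.append(outside)
print(sorted(set(feasible)))
```

The exact bounds shift with the rounding assumptions, but the qualitative conclusion is robust: the reported splits are consistent with a couple of dozen respondents, at most, outside the South.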

Even more impressively, results for respondents under 50 years old are not available, and the results for respondents over 50 are 67% Romney won, 24% Obama won.  This is consistent with there being exactly one respondent under 50 who said Obama won, although it is much more likely that there was a small plural number of under-50 respondents who split more or less as old* respondents did.

Basically, the re-interview sample consists almost entirely of old non-liberals from the South who have attended college.  So the results headlined by CNN are completely uninteresting.  I am certain that this just shows that most people don’t understand how to use panel data.  I think it is important for citizens in general to understand how to analyze panel data, so I think this post might be worth posting.

*I’m 51 so I can type that.