Freaking Out About Outliers: When the Polls Are Way Off

The idea of such a small number of people being used to predict how millions will vote sometimes irks observers, but it's actually a very reliable process—most of the time.
(Photo: Nforest/Shutterstock)

This past week saw some pretty dramatic polls in a series of close statewide races across the United States. In Colorado, Democratic incumbents Senator Mark Udall and Governor John Hickenlooper, who have each maintained narrow leads over their Republican challengers in recent weeks, suddenly appeared to be trailing significantly in new Quinnipiac polls. In the Iowa Senate race, almost the same dynamic played out, with narrowly leading Democrat Bruce Braley suddenly trailing Republican Joni Ernst by six points in a poll by, again, Quinnipiac. What's going on here? Has there been a sudden shift in favor of Republican candidates?

Not likely. If anything, the forecast models for Senate races this fall have moved slightly in the Democratic direction recently. No, what we saw in the last week was a handful of outlier polls.

How do outliers happen? Not to get too bogged down by the specifics of polling, but the quick version is that polls are estimates of what the actual population is thinking, based on a sample of a few hundred or just over a thousand people. The idea of such a small number of people being used to predict how millions will vote sometimes irks observers, but it's actually a very reliable process, provided that the sample being polled is representative of the total population of voters. That can be achieved through randomization and/or some methods pollsters use to weight their data to look like the demographic groups that typically show up in an election.

But the results of a poll are always estimates, and they come with a "margin of error" (usually ± three or four points) that gives us an idea of how reliable they are. So if we get a poll that says that 43 percent of Coloradans plan to vote for Mark Udall, ± four points, that means there's a 95 percent chance that the true percentage of Coloradans who plan to vote for Udall lies between 39 and 47 percent. By extension, that also means that roughly five percent of the time (one out of 20), we'll get results that miss by more than the margin of error, sometimes wildly, just by luck of the draw. That is, a polling firm can be doing everything right (and Quinnipiac, in particular, has a solid reputation, if a slightly rightward lean), but if it conducts enough polls, it will occasionally get bad results.
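For readers who want to see where those numbers come from, here is a rough sketch of the arithmetic. It assumes a simple random sample and uses the textbook formula for the margin of error of a proportion; real pollsters also weight their data, which this ignores. The sample size of 600 and the function names are illustrative assumptions, not details of any actual Quinnipiac poll.

```python
import math
import random


def margin_of_error(share, sample_size, z=1.96):
    """Approximate 95 percent margin of error for a polled share.

    Assumes a simple random sample; z=1.96 is the usual multiplier
    for a 95 percent confidence level.
    """
    return z * math.sqrt(share * (1 - share) / sample_size)


def share_of_polls_outside_margin(true_share, sample_size, trials, seed=0):
    """Simulate many honest polls and count how often one misses.

    Each simulated poll asks `sample_size` random voters, each of whom
    supports the candidate with probability `true_share`. A poll "misses"
    when its result lands farther from the truth than the margin of error.
    For simplicity the margin is computed from the true share.
    """
    rng = random.Random(seed)
    moe = margin_of_error(true_share, sample_size)
    misses = 0
    for _ in range(trials):
        supporters = sum(rng.random() < true_share for _ in range(sample_size))
        if abs(supporters / sample_size - true_share) > moe:
            misses += 1
    return misses / trials


# Hypothetical inputs: 43 percent true support, 600 respondents per poll.
moe = margin_of_error(0.43, 600)
miss_rate = share_of_polls_outside_margin(0.43, 600, trials=2000)
print(f"margin of error: about {moe * 100:.1f} points")
print(f"share of simulated polls outside the margin: {miss_rate:.1%}")
```

Under these assumptions, a sample of about 600 people gives a margin of roughly four points, and something close to one simulated poll in 20 lands outside it even though nothing about the underlying race has changed.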

That's not necessarily catastrophic for the polling firms or the political observers who rely on them. We're lucky to live in a time when there are websites (such as Pollster, Real Clear Politics, and others) that aggregate all these polls and average them, allowing us to observe trends and not be misled by outliers. But outliers are, by definition, different from the norm, which makes them newsworthy. So a great deal of ink was spilled over the past week about how these races were tightening up and ads and speeches were having big effects, when in reality basically nothing had changed.
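To see why averaging blunts an outlier, consider a minimal sketch. The leads below are made-up numbers chosen for illustration, not actual results from any race, and the aggregation sites use more elaborate methods than a plain average; the helper name is likewise hypothetical.

```python
from statistics import mean, median


def summarize_leads(leads):
    """Summarize a set of poll leads (in points) for one candidate.

    Positive numbers mean the candidate is ahead; negative, behind.
    """
    return {
        "average": mean(leads),
        "median": median(leads),
        "range": (min(leads), max(leads)),
    }


# Hypothetical leads from five recent polls; the fourth is the outlier.
hypothetical_leads = [2, 1, 3, -6, 2]
summary = summarize_leads(hypothetical_leads)
print(summary)
```

In this invented example, the single poll showing a six-point deficit pulls the average down but does not flip it, and the median barely moves at all. A headline built on that one poll would tell a very different story than the full set does.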


Recent polling in the Udall (blue) v. Gardner (red) Colorado Senate race. (Source: Denver Post)

The campaigns made hay of these results, as well. The Democrats, suddenly and allegedly trailing, saw this as an opportunity to recruit volunteers and solicit donations from complacent supporters. The Republicans, suddenly and allegedly leading, saw this as a chance to boost their backers' spirits and put out new solicitations for money and precinct workers. In these ways, polls can sometimes end up creating changes instead of detecting them.

But the watchword for serious political observers is to ignore individual polls and keep your eye on the averages. Outliers can be great for ginning up support and/or panic, but they're terrible for helping you predict the future.