Let's actually run through the exercise. Here, from the RealClearPolitics polling aggregator, are the seven Ohio polls that, as of this morning, had been published in November:
The total sample size is 7,568. That means the margin of error for the pooled sample (as conventionally calculated, with a 95-percent confidence level) is somewhere around 1 point. If you average the seven numbers together, you get an Obama lead of 2.9 points. But that's a crude average. If we're going to be precise, we should weight each poll according to its sample size, which gives us an average of 3.1 points.
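For anyone who wants to check the arithmetic, here is a minimal sketch of the two calculations. It assumes the textbook margin-of-error formula for a simple random sample with a candidate near 50 percent; the function names are mine, and the two-poll input at the bottom is invented purely to show how the weighting works, not taken from the actual polls.

```python
import math

def margin_of_error(n, z=1.96, p=0.5):
    """Conventional margin of error for one candidate's share, as a fraction.

    Assumes a simple random sample; z=1.96 corresponds to a 95-percent
    confidence level and p=0.5 is the worst case.
    """
    return z * math.sqrt(p * (1 - p) / n)

def weighted_average(leads, sizes):
    """Average of poll leads, each weighted by its poll's sample size."""
    return sum(lead * size for lead, size in zip(leads, sizes)) / sum(sizes)

# Pooled sample of 7,568 respondents, expressed in percentage points.
pooled_moe_points = 100 * margin_of_error(7568)
print(round(pooled_moe_points, 2))  # about 1.13

# Hypothetical example (NOT the real Ohio polls): a 2-point lead from
# 1,000 respondents and a 4-point lead from 3,000 respondents.
print(weighted_average([2.0, 4.0], [1000, 3000]))  # 3.5
```

The bigger poll pulls the weighted average toward its own number, which is why the weighted figure can differ from the crude one.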
So, collectively, polls conducted in Ohio recently have a (95 percent confidence level) margin of error of somewhere around 1 point and give Obama a lead of 3.1 points. That's no toss-up, even according to the loose definition of toss-up used by journalists who report on opinion polls. In fact, with a lead this big and a sample size this big, our confidence that Obama is ahead in the Ohio voting population at large is around 99 percent. Feel better, Obama supporters?
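Here is one rough way to get a confidence figure like that. This is a sketch under simplifying assumptions, not necessarily the calculation a professional pollster would use: it treats the pooled polls as a single simple random sample, treats the race as a two-way contest near 50-50 (so the standard error of the lead is about 1 divided by the square root of the sample size), and uses a normal approximation. The function name is mine.

```python
import math

def prob_ahead(lead_points, n):
    """Approximate probability that the leader is truly ahead.

    Assumes a simple random sample of size n and a two-candidate race
    near 50-50, so the standard error of the lead (in points) is
    roughly 100 / sqrt(n). Uses the normal distribution's CDF.
    """
    se_points = 100 / math.sqrt(n)
    z = lead_points / se_points
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# A 3.1-point lead in a pooled sample of 7,568.
p = prob_ahead(3.1, 7568)
print(p)  # a little above 0.99
```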
Not so fast! Here's what should shake your confidence in these polls:
All the statistical reasoning described above is premised on the assumption that the people sampled are a random sample of the population to begin with. Now, it's true that it's possible, in principle, to get a truly random sample. For example: If you have a jar full of red and blue jelly beans, all mixed up, and you're taking jelly beans out of the jar while blindfolded, you get a truly random sample. The larger your sample, the more likely it is to reflect the ratio of red to blue jelly beans in the jar as a whole -- and the exact likelihood can be accurately determined by the math alluded to above. In a situation like this, the probabilistic margins of error that pollsters calculate will be entirely reliable.
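The jelly bean case is easy to simulate. The sketch below assumes a very large jar (so each draw is effectively independent) in which 52 percent of the beans are blue; that share, the sample size of 1,500, and the function name are all made up for illustration. It draws many blindfolded samples and counts how often the sample's blue share lands within the conventional 95-percent margin of error of the truth.

```python
import math
import random

def coverage(true_share=0.52, n=1500, trials=2000, seed=0):
    """Fraction of truly random samples whose estimate falls within
    the conventional 95-percent margin of error of the true share."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    moe = 1.96 * math.sqrt(0.25 / n)
    hits = 0
    for _ in range(trials):
        # Draw n beans; each is blue with probability true_share.
        blue = sum(rng.random() < true_share for _ in range(n))
        if abs(blue / n - true_share) <= moe:
            hits += 1
    return hits / trials

rate = coverage()
print(rate)  # should land close to 0.95
```

When the sample really is random, the margin of error does what it says on the label: roughly 19 samples out of 20 fall inside it.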
But pollsters don't pick jelly beans out of a jar. They call jelly beans on the phone (or try to reach them online or, as in the case of the Dispatch poll, solicit responses by mail). And for all they know, blue jelly beans are less likely to be near a phone than red jelly beans -- or less likely to answer the phone, or less likely to agree to be polled after they answer the phone, or whatever.
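To see how much damage this can do, here is a small sketch of the expected result when the two colors respond at different rates. The response rates and the 52 percent true share are invented for illustration, and the function name is mine.

```python
def observed_share(true_share, resp_blue, resp_red):
    """Expected blue share among respondents when blue and red
    jelly beans agree to be polled at different rates."""
    blue = true_share * resp_blue
    red = (1 - true_share) * resp_red
    return blue / (blue + red)

# Hypothetical numbers: blue beans are 52 percent of the jar, but only
# 8 percent of them respond, versus 10 percent of red beans.
biased = observed_share(0.52, 0.08, 0.10)
print(round(biased, 3))  # 0.464

# With equal response rates the poll recovers the true share.
unbiased = observed_share(0.52, 0.10, 0.10)
print(round(unbiased, 3))  # 0.52
```

Notice that sample size appears nowhere in this calculation. A bigger sample shrinks the random error but does nothing about this kind of skew, so the true leader can look like the trailer no matter how many people you poll.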
This is a very challenging problem, in part because the technological landscape is changing so fast that it's hard for pollsters to use their experience from the last presidential election as a basis for refining their methodology. Among the things that presumably have changed since 2008: the number of people who have cell phones, the number who have abandoned land lines in favor of cell phones, the number who have caller ID and use it, the number who ignore calls from unknown parties, etc. And these kinds of things tend to vary by age, income level, ethnicity, etc. -- all of which correlate with which candidate a person will vote for. Pollsters can do things to try to correct for all of this, but the ground is shifting so fast that it's hard for them to know they're doing the right things.
So I'm sympathetic to people who say the polls can't be trusted. But if the polls can't be trusted, then the logic behind that Columbus Dispatch headline -- and the story underneath it, and the poll the story is based on -- starts to fall apart. The premise of doing a poll and writing about it, and putting it under a banner headline on your front page, is that polls in some meaningful sense can be trusted. And if polls can be trusted, there's no basis for calling Ohio a "toss-up" in the sense that we normally use that term.
[Postscript: I've assumed that the Dispatch pollsters, in calculating their 2.2-point margin of error, are following the most common convention and using a 95-percent confidence level. But the poll's fine print doesn't say. And my sense, from nosing around online, is that 2.2 is actually a bit low for a 95-percent margin of error when your sample size is 1,500 -- though, on the other hand, it sounds slightly high for a 90-percent margin of error. But, as I said, my math is a bit rusty, so I'll hope some whiz-kid commenter clarifies the situation for us. Anyway, the basic morals of this story hold regardless of how exactly the Dispatch is defining its margin of error. And, in any event, I'm pretty sure I'm on solid ground in saying that Obama's 2-point lead, given a sample size of 1,500, implies a probability of roughly 90 percent -- give or take a few points -- that he is ahead of Romney in the voting population at large (or would be, if the sample were truly random).]
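For what it's worth, the margin-of-error comparison in that postscript can be checked with the textbook formula. This sketch assumes a simple random sample of exactly 1,500 and a candidate near 50 percent; the z-values 1.96 and 1.645 are the standard ones for 95-percent and 90-percent confidence, and the function name is mine. How the Dispatch actually computed its figure is not something the poll's fine print tells us.

```python
import math

def moe_points(n, z):
    """Margin of error in percentage points for a share near 50 percent,
    assuming a simple random sample of size n."""
    return 100 * z * math.sqrt(0.25 / n)

moe_95 = moe_points(1500, 1.96)   # 95-percent confidence level
moe_90 = moe_points(1500, 1.645)  # 90-percent confidence level
print(round(moe_95, 2))  # 2.53
print(round(moe_90, 2))  # 2.12
```

So a reported 2.2 points does sit between the two conventions, a bit below the 95-percent figure and slightly above the 90-percent one.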