Methods: In Polling, Bigger Is Not Necessarily Better
Political campaigners can often be heard complaining that opinion polls do not reflect what they hear on the doorstep. Arguing that they have spoken to many more people than the 1,000 or so typically interviewed for a poll, they claim the polls must be biased or just plain wrong.
In Scotland the Radical Independence Campaign has carried out several mass canvasses in which its activists have contacted more than 5,000 households. After undecided voters are excluded, they report a majority for yes by around 60 percent to 40 percent against. This is almost the polar opposite of the picture presented by the polls. The most recent “poll of polls”, based on an average of the previous six polls published, puts yes support at 43 percent and no at 57 percent.
Lessons from America
So who is right? To understand why bigger isn’t always better, it’s worth telling the story of George Gallup and the Literary Digest magazine. In 1936 the Literary Digest carried out a straw poll of 2.4 million people to find out how they planned to vote in the U.S. presidential election. It confidently predicted that Alfred Landon, the Republican candidate, would win by some margin.
George Gallup’s American Institute of Public Opinion predicted that Franklin D Roosevelt would win, based on a much smaller sample of around 5,000. In the end, of course, Gallup was proved correct – Roosevelt was returned to the White House with 61 percent of the vote. And Gallup’s success was in part responsible for the more widespread adoption of modern opinion-polling techniques. The Literary Digest, meanwhile, went out of business shortly after.
The reason Gallup got it right and the Literary Digest got it wrong, in spite of its far bigger sample size, lay in the nature of the two samples. The Literary Digest primarily polled its own readers as well as people on automobile registration lists. As a result, its sample was heavily biased towards those on higher incomes. The response rate to the survey was also very low – more than 10 million mock ballot papers were sent out to achieve those 2.4 million responses.
On the other hand, Gallup set quotas for the number of interviews for different demographic categories like men and women, people on low and high incomes and so forth. Each quota was based on what was known about the actual profile of the population as a whole. This meant that the final sample was far more representative – it looked like the population whose views it was meant to reflect.
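The contrast between the two 1936 samples can be reproduced in a small simulation. The sketch below is illustrative only: the income split, the support levels within each income group and the sample sizes are assumptions chosen so that overall support comes to 61 percent, not historical data, and the function name simulate_poll is hypothetical.

```python
import random

# Illustrative assumptions, not historical data: 30% of voters are on higher
# incomes and 40% of them back Roosevelt; 70% of all other voters do.
SHARE_HIGH_INCOME = 0.30
SUPPORT_HIGH = 0.40
SUPPORT_LOW = 0.70
TRUE_SUPPORT = (SHARE_HIGH_INCOME * SUPPORT_HIGH
                + (1 - SHARE_HIGH_INCOME) * SUPPORT_LOW)  # about 0.61


def simulate_poll(n, share_high_income, rng):
    """Estimate Roosevelt support from n simulated respondents.

    share_high_income is the chance that any one respondent is on a higher
    income; setting it above SHARE_HIGH_INCOME mimics a biased mailing list.
    """
    votes = 0
    for _ in range(n):
        is_high_income = rng.random() < share_high_income
        support = SUPPORT_HIGH if is_high_income else SUPPORT_LOW
        votes += rng.random() < support  # True counts as 1
    return votes / n


rng = random.Random(1936)
# Huge sample, but 90% of respondents come from the higher-income group.
big_biased = simulate_poll(200_000, 0.90, rng)
# Small sample whose income mix matches the population.
small_representative = simulate_poll(1_000, SHARE_HIGH_INCOME, rng)

print(f"True support:                {TRUE_SUPPORT:.3f}")
print(f"Large biased sample:         {big_biased:.3f}")
print(f"Small representative sample: {small_representative:.3f}")
```

Under these assumptions the large sample settles close to 43 percent however many responses are added, because extra responses reduce random noise but do nothing about the bias in who is asked.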
Poll the other one
The gold standard of survey research is often held to be probability sampling (which ScotCen and our sister organization NatCen use on the Scottish and British Social Attitudes surveys). This method selects participants at random from a list that includes all (or almost all) those people your sample is meant to represent. This list is sometimes “stratified” or ordered to ensure that people with different characteristics are included in proportion to their prevalence in the larger population. By giving everyone a chance of being included and then selecting them at random, probability sampling arguably remains the best way of ensuring a representative sample.
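As a rough sketch of the idea, the code below draws a stratified random sample with proportional allocation from a made-up list. The frame, the urban/rural strata and the function name stratified_sample are all hypothetical, and real surveys may stratify differently (for example, by ordering the list and selecting at fixed intervals), so treat this as one simple variant and not a description of any particular survey.

```python
import random
from collections import defaultdict


def stratified_sample(frame, stratum_of, n, rng):
    """Draw about n units from frame, allocating to strata proportionally."""
    strata = defaultdict(list)
    for unit in frame:
        strata[stratum_of(unit)].append(unit)

    sample = []
    for units in strata.values():
        # Proportional allocation; rounding means the total can differ
        # slightly from n when the shares do not divide evenly.
        k = round(n * len(units) / len(frame))
        # Simple random sample, without replacement, within the stratum.
        sample.extend(rng.sample(units, k))
    return sample


# Hypothetical sampling frame: 6,000 urban and 4,000 rural addresses.
frame = [("urban", i) for i in range(6_000)] + [("rural", i) for i in range(4_000)]
sample = stratified_sample(frame, lambda unit: unit[0], 500, random.Random(0))

urban = sum(1 for unit in sample if unit[0] == "urban")
print(urban, len(sample) - urban)  # 300 200
```

Every address has the same chance of selection, and the 60:40 split in the frame is reproduced exactly in the sample instead of being left to chance.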
Other opinion polls currently being conducted in Scotland use methods such as quota sampling (TNS BMRB), random digit dialing (Ipsos MORI), and what might be termed stratified “volunteer” sampling (YouGov, Panelbase, Survation and ICM, who all conduct online polls).
As the name suggests, quota sampling involves having target numbers of respondents in particular demographic categories with a view to creating a representative sample of the overall population. Random digit dialing involves contacting (usually large numbers of) landline numbers on a random basis. Interviewees may then be selected to meet quotas to ensure the achieved sample includes people with specific characteristics to reflect the population as a whole.
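A minimal sketch of how quotas shape an achieved sample is given below. The age bands, the quota sizes, the 1-to-4 contact ratio and the function name fill_quotas are assumptions made for illustration; they are not taken from any of the polling companies named above.

```python
import random


def fill_quotas(contacts, quotas):
    """Accept contacts in arrival order until every quota cell is full."""
    remaining = dict(quotas)
    accepted = []
    for person in contacts:
        cell = person["group"]
        if remaining.get(cell, 0) > 0:
            accepted.append(person)
            remaining[cell] -= 1
        if not any(remaining.values()):
            break  # all quotas met, stop interviewing
    return accepted


rng = random.Random(0)
groups = ["under 35", "35 and over"]
# Hypothetical stream of contacts in which older people answer four times
# as often as younger people.
contacts = [{"group": rng.choices(groups, weights=[1, 4])[0]} for _ in range(5_000)]

# Hypothetical quotas reflecting a population that is 30% under 35.
quotas = {"under 35": 300, "35 and over": 700}
sample = fill_quotas(contacts, quotas)

under_35 = sum(1 for person in sample if person["group"] == "under 35")
print(len(sample), under_35)  # 1000 300
```

The quotas fix the age balance even though younger people are much harder to reach, but they say nothing about characteristics for which no quota has been set.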
Stratified volunteer samples are used for online surveys, whereby the sample is selected from a large database of people who have volunteered to take part. The issued sample is typically stratified to try to ensure a balance of respondents of different ages, genders and so on. Most surveys and polls then apply weighting to the achieved sample to try to correct any imbalance in the demographic profiles compared to the population as a whole.
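The weighting step can be sketched in a few lines. All of the figures below are hypothetical, as is the function name weighted_yes_share, and the sketch weights on a single characteristic (age group) where real polls usually weight on several at once.

```python
# Hypothetical figures for illustration only.
population_share = {"16-34": 0.30, "35-54": 0.35, "55+": 0.35}
# Achieved sample: (respondents, number saying yes) per age group.
achieved = {"16-34": (100, 55), "35-54": (300, 135), "55+": (600, 210)}


def weighted_yes_share(achieved, population_share):
    """Weight each age group back to its share of the population."""
    total_n = sum(n for n, _ in achieved.values())
    weighted_yes = 0.0
    weighted_n = 0.0
    for group, (n, yes) in achieved.items():
        # Weight = share in the population / share in the achieved sample.
        weight = population_share[group] / (n / total_n)
        weighted_yes += weight * yes
        weighted_n += weight * n
    return weighted_yes / weighted_n


unweighted = (sum(yes for _, yes in achieved.values())
              / sum(n for n, _ in achieved.values()))
weighted = weighted_yes_share(achieved, population_share)
print(f"unweighted yes: {unweighted:.3f}, weighted yes: {weighted:.3f}")
```

With these made-up numbers the yes share is 40.0 percent before weighting and 44.5 percent after, because the under-represented youngest group, which is the most favorable to yes, counts three times as heavily once weights are applied.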
It is true that none of these methods are immune from criticism. Probability samples suffer from lower response rates than they once achieved (although most surveys based on other kinds of samples do not quote any response rate at all). Quota samples might control for certain characteristics, such as age, gender and past vote, but can easily over- or under-represent characteristics for which quotas are not set – resulting, for example, in a more politically interested sample than the population as a whole.
Volunteer samples may diverge from the population as a whole even further: by definition, volunteer panels are made up of people who are more interested in taking part in surveys than people selected at random from the population. It is also the case that the polls for the referendum show far more variation than is typical for election polls, as my colleague John Curtice has discussed here and elsewhere before.
But whatever your views of the methods of a specific survey or poll, the Literary Digest/Gallup story clearly illustrates the merits of a more scientific approach to sampling and the dangers of assuming that simply by speaking to lots of people you will get an accurate measure of what a population as a whole is thinking.
Spoil on canvass
Information about the methods adopted by political canvassers is often thin on the ground. But canvassing typically involves sending as many partisan activists as are available to particular streets or areas and encouraging them to knock on as many doors as possible. The aim is usually to identify supporters who may need encouragement or help to get to a polling station on voting day.
Unlike pollsters, canvassers are not set quotas which would help them achieve a sample that is representative in terms of age, gender, working status, region and so on.
We might also ask whether people are likely to give a truthful answer to someone who may be wearing a badge or t-shirt that makes their own voting intentions clear. Interviewers for polling and survey companies, in contrast, are trained to be scrupulously neutral and to avoid giving anything away about their own views.
Mass canvasses may well be a useful tool to mobilize campaigners to get their messages across. But as a mechanism for gauging the state of opinion in the population as a whole, they are far less reliable. In interpreting their findings, we should remember that a badly designed large sample tells you far less than a well designed small sample. In other words, bigger is not always better.