Can We Trust Political Polls and Why Do We Still Use Nielsen Ratings?

The answers to these questions have a common origin that takes us back 80 years to 1936. Commercial consumer and social research was a new business then: AC Nielsen had been founded 13 years earlier (in 1936 it was about the same age as Facebook is now), and Gallup, Inc. was just one year old. As it turned out, events in 1936 would soon make Gallup front-page news.

Like 2016, 1936 was a general election year in the US and there were opinion polls. One organization that considered itself expert in this field was The Literary Digest, a magazine that had been in the polling business since 1916.

The Literary Digest was on a streak – it had correctly predicted the elections of Presidents Wilson (1916), Harding (1920), Coolidge (1924), Hoover (1928) and Roosevelt (1932). These results were based on a simple technique: it mailed out millions of postcards and tallied the returns. Big Data, early-20th-century style! Confident in the method after this impressive track record, the magazine repeated it in 1936 and predicted the winner would be Alf Landon, the Republican Party candidate. Many readers will probably now be thinking “who?”

Meanwhile, Dr. Gallup was approaching the problem in a different way. Instead of obtaining millions of returns, his organization selected and interviewed a smaller sample of 50,000 voters, designed to represent US voters of all types. The Gallup Poll predicted that FDR would be re-elected for a second term.

Gallup was right and The Literary Digest was wrong…badly wrong, as Roosevelt won 46 of the then 48 states. The magazine never really recovered – advertisers pulled out and it closed in 1938. This cover from November 1936 says it all.

But why did The Literary Digest get it so wrong? The problem was fairly obvious in hindsight. Although it had millions of returns, the magazine polled mailing lists comprising its own readers, registered automobile owners and telephone users. This was the Great Depression, and anyone who could afford a magazine subscription, a car or even a telephone was likely to be wealthier than average, and therefore more likely to vote Republican.

In short, The Literary Digest method was biased, proving that bigger isn’t always better. The Gallup Poll was designed to equally represent all types of voters and because of this, even though the sample size was much smaller, the results were much closer to the truth. Of course, with smaller sample sizes there is the possibility of sampling error (results varying by chance due to who is selected in the poll). In this case, the statistical margin of error on the 54% claiming to be FDR supporters was only around 1%.
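As a sanity check on that figure, the usual formula for the margin of error of a sample proportion is z·√(p(1−p)/n). A minimal sketch in Python – the 54% support level and the 50,000 sample size come from the text above; the z-value of 1.96 (for a 95% confidence level) is a standard statistical assumption, not something stated in the article:

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion p from n respondents."""
    return z * math.sqrt(p * (1 - p) / n)

# Gallup's 1936 poll: ~54% for FDR from a sample of 50,000
moe = margin_of_error(0.54, 50_000)
print(f"{moe:.2%}")  # prints 0.44% – comfortably under the "around 1%" cited
```

The formula also makes the trade-off concrete: quadrupling the sample only halves the margin of error, which is why a well-selected 50,000 beats a badly selected 2.4 million.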

The success of this design set the course for social and consumer research thereafter: representative samples, properly selected and projected to the population became the accepted way to provide intelligence about consumers. In the last decade, this approach has been challenged by the advent of big data sources and the lessons of 1936 have become relevant again.

And so, turning to TV audience measurement, we see a parallel with this discussion. Nielsen measures national linear TV audiences using a sample, a panel that is recruited to represent all US TV households and continuously updated to maintain its relevance. The current sample size is 35,000 homes containing about 100,000 persons. For more detail on the various elements of Nielsen’s TV measurement in the US see here.

In recent years, other insights into TV usage have become available via set-top box (STB) data and Smart TVs (the “Big Data solutions”). These insights are based on millions of boxes returning data. ComScore/Rentrak and TRA have emerged with audience measurement products using these data. And the owners of the data are creating advertising solutions that activate these data. So with millions of boxes and TVs providing data, why is Nielsen still the currency for linear TV?

There are a number of reasons, and the lesson of 1936 is an overarching reason: bigger isn’t always better. Some of the differences between the Nielsen approach and the Big Data solutions illustrate why:

  1. The STB and Smart TV data do not cover all US TV households equally. Hispanic households (especially Spanish-speaking ones), for example, are less likely to have cable or satellite subscriptions, so Univision and Telemundo audiences are likely to be under-represented in these data. And aggregators such as ComScore and TRA do not have access to all devices (e.g. Comcast and Time Warner data). Nielsen represents all TV households, regardless of technology.
  2. The Big Data solutions do not typically have complete data about household members – e.g. age/gender, race, origin, income. This information can be appended from other data sources, but those sources are not 100% complete or accurate. Because Nielsen recruits households in person, it gets a very complete and accurate record of household composition.
  3. The Big Data solutions reflect device activity, not persons’ viewing. Advertisers want to reach people, and with televisions, co-viewing on a device is still common. Nielsen provides persons’ audiences whereas STB and Smart TV solutions usually don’t – and where they do, there are often privacy concerns. In addition, Nielsen measures all TVs in the home, whereas Big Data solutions are restricted to devices with the appropriate permission and technology to return activity data for analysis.
  4. The device activity obtained from Big Data solutions may not reflect actual tuning or viewing. People often leave cable boxes on when they turn the TV off, so data from the box have to be modeled or edited to address this.

The comments above are not intended to disparage Big Data solutions but to illustrate why Nielsen remains the main currency for linear TV. Additionally, Nielsen has the incumbent advantage and is also MRC accredited, giving confidence to the users of the data.

It should be noted that for smaller audiences (eg niche networks and local TV audiences), Big Data solutions can provide more stable and usable insights than Nielsen panel data, where sample size limitations can result in sparse reported ratings. The industry is indeed using these data for those applications. The best measurement of TV audiences would be an integration of all available and relevant data – a hybrid of panel data to provide total coverage and persons’ audiences and Big Data to provide more granular detail than the economics of panels allow.

So as we approach November, can we trust political polls? The answer is a cautious “yes, but be aware of their limitations”. There are many reasons why polls may not predict the actual result of an election:

  1. Sample selection bias – as The Literary Digest found out in 1936.
  2. Response bias – people who choose to respond may be more predisposed to one party or the other than those who don’t.
  3. Measurement error – questions may lead respondents, or respondents may respond in a way that they feel is socially acceptable even if they don’t hold the “acceptable” view.
  4. Sampling Error – the smaller the sample, the higher the margin of error in the results.
  5. Real change between the time of the poll and the election. You won’t always find out the winner of a horse race by taking a photo 100 yards before the finish line.
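Points 1 and 4 can be illustrated with a toy simulation in the spirit of 1936 – every number below is invented: a large sample drawn from a skewed slice of the electorate misses badly, while a much smaller representative sample lands near the truth.

```python
import random

random.seed(1936)

# Hypothetical electorate of 100,000 voters: 55% back candidate A overall,
# but support is only ~30% among the wealthiest 20% (a Literary Digest-style skew).
wealthy = [1 if random.random() < 0.30 else 0 for _ in range(20_000)]
rest    = [1 if random.random() < 0.6125 else 0 for _ in range(80_000)]
electorate = wealthy + rest  # 0.2*0.30 + 0.8*0.6125 = 0.55 expected support

def share(sample):
    """Fraction of a sample supporting candidate A."""
    return sum(sample) / len(sample)

# Small representative sample vs. a sample 15x larger drawn only from the wealthy
representative = random.sample(electorate, 1_000)
biased = random.sample(wealthy, 15_000)

print(f"true:           {share(electorate):.1%}")
print(f"representative: {share(representative):.1%}")  # close to the truth
print(f"biased (big):   {share(biased):.1%}")          # far off despite 15x the size
```

Selection bias (point 1) does not shrink as the sample grows; only sampling error (point 4) does.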

Keep these limitations in mind when you hear poll results. And remember, politicians only talk about the polls that show them in a good light, while news shows and networks look for fluctuations in polls so that they can weave a story, and improve their audiences, as measured by Nielsen.
