Wednesday, October 23, 2019

Interpreting Election Polling: Pay Attention to Margins of Error

I have to confess--I had absolutely no doubt Clinton would win the 2016 presidential election. Despite months of working with datasets about state-level polling, despite teaching statistics and sociological research methods, and despite closely following several sites that monitor state-level polling, I had no doubts. Zero doubts. In hindsight, it is clear I was not wearing my statistician's hat when I made that prediction--I was relying on intuition and emotion.

Probability gave Clinton the odds to win (538's last estimate was 71-29 in Clinton's favor, one of the most conservative estimates among news and polling sites). However, as any gambler knows, the odds are just that--odds, chances, possibilities. If you flip a coin that's weighted to come up heads 71% of the time, it will still come up tails about 29% of the time. A 71% chance of winning is reasonably good odds, but clearly not a sure thing. And in the case of election polling, those odds can fluctuate from week to week based on current events and public mood. In fact, as of the end of September, 538 only gave Clinton 56-44 odds of winning.
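To make the weighted-coin idea concrete, here is a minimal sketch that "re-runs" an election many times with the favorite given a 71% chance of winning. The function name, the number of trials, and the random seed are my own illustrative choices, not anything 538 uses; their model is far more involved than a single coin flip.

```python
import random

def simulate_upsets(p_favorite=0.71, trials=100_000, seed=2016):
    """Return the share of simulated elections the underdog wins.

    Each trial is one flip of a coin weighted so that the favorite
    wins with probability p_favorite (an illustrative assumption).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    upsets = sum(1 for _ in range(trials) if rng.random() >= p_favorite)
    return upsets / trials

upset_share = simulate_upsets()
print(f"Underdog wins in {upset_share:.1%} of simulated elections")
```

The share should land close to 29%: roughly three out of every ten simulated elections go to the underdog, which is a long way from "never."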

Ironically, in my classes, as the campaign season heated up in late 2015 & early 2016, I warned students about the perils of ignoring the tedious details of polling results. I specifically showed the students a bar graph of a respected poll comparing Trump vs Clinton--it looked like Clinton was far ahead... until you added in the confidence intervals. In the first graph below, the blue bar is the Clinton estimate, and the red is Trump, showing results from a Michigan poll. The pollster is Public Policy Polling (538 gives them a B+ rating). This poll appears to show Clinton with a clear 5-point lead over Trump. However, the margin of error for this poll (once you read the fine print) is 3.2 percentage points. That means the Trump estimate, shown at 41%, could actually be 3.2 points above or below that, a range of 37.8-44.2% (with 95% confidence). Similarly, Clinton, shown at 46%, could be anywhere from 42.8-49.2%. Notice on the graph that the lower end of Clinton's error bar (42.8%) falls below the upper end of Trump's (44.2%). This is what we call overlapping confidence intervals. In practice, this means the candidates are in a statistical tie. So while I was teaching this to my students, I ignored my own advice. Clearly a case of "do as I say, not as I do."
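The arithmetic behind those ranges is simple enough to check yourself. Below is a rough sketch using the numbers from the Michigan poll above (Clinton 46%, Trump 41%, margin of error 3.2 points). The helper names `interval` and `overlaps` are mine, made up for illustration, and the overlap check is the same rule of thumb I describe above, not a formal significance test.

```python
def interval(estimate, margin_of_error):
    """Return the (low, high) range implied by a poll estimate and its margin of error."""
    return (estimate - margin_of_error, estimate + margin_of_error)

def overlaps(a, b):
    """True if two (low, high) ranges share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

MOE = 3.2  # margin of error in percentage points, from the poll's fine print

clinton = interval(46.0, MOE)
trump = interval(41.0, MOE)

print(f"Clinton: {clinton[0]:.1f}-{clinton[1]:.1f}%")
print(f"Trump:   {trump[0]:.1f}-{trump[1]:.1f}%")
print("Overlapping intervals?", overlaps(clinton, trump))
```

With these inputs the two ranges share the stretch from 42.8% to 44.2%, so the check reports an overlap even though the headline numbers are 5 points apart.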

Bring this idea to the national level. Many (especially Democrats, and clearly Clinton) believed Michigan, Wisconsin and Pennsylvania were a so-called "blue wall" that would definitely vote for a Democrat for president. Clearly this was not the case. In fact, their 46 electoral votes would have swung the election in Clinton's favor. Instead, they all went for Trump (although barely--none went for Trump by more than 1 percentage point, and across all three states combined, fewer than 78,000 voters swung the result to Trump). The graph below represents polling from just these three states (MI, WI, PA), limited to polls that 538 gives a B or higher quality rating, that were conducted in November, and that reported a margin of error. I include the raw data and have graphed the results with error bars. Some conspiracy theorists claimed the results were so far from the polling estimates that there must have been interference in the election. In fact, nearly all of these polls showed Clinton and Trump within the margin of error of each other. The only exception is one Wisconsin poll by PPP--while those error bars do not overlap, they are very close, which should make any skeptical poll watcher nervous. What does this tell us about 2020? When interpreting election polls, never ignore the confidence intervals (and don't let your gut feelings pull you away from the data).