Friday, July 1, 2016

Predicting the Presidential Election from Polling

The pundits are currently saying that national-level polling about the presidential election is unreliable this far out--the election is just over 4 months away. However, state-level polling can be useful. For the first part of my analysis below, I looked at state polling from the 2012 election between Obama and Romney, up to July 15, to see how well the results matched the actual election outcome in each state.

For my methodology, I decided to compare three polls per state. The polls had to be taken between May and July 15, they had to be of "likely voters," and the sample size had to be above 500. I used data from the site Real Clear Politics, which has polling data going back several elections for each state. In some cases, like Virginia, there were many polls conducted in that time frame, and more than 3 that polled only "likely voters." In those cases, I used the 3 polls closest to, but before, July 15. The one exception to these criteria is that I always included PPP's results as one of the 3 polls. According to a study by a Fordham political science professor, PPP had the best polling results for that election.
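The selection rules above can be sketched in a few lines of Python. This is only an illustration, not the way I actually pulled the numbers (I read them off Real Clear Politics by hand). The `Poll` record, its field names, and the `select_polls` function are all hypothetical, and the sketch assumes that the cutoff date itself is allowed and that the PPP poll is kept even when it would fail the other filters.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Poll:
    pollster: str      # e.g. "PPP" (hypothetical field names throughout)
    end_date: date     # date the poll was completed
    population: str    # "LV" = likely voters, "RV" = registered voters
    sample_size: int


def select_polls(polls, window_start, cutoff, max_polls=3, min_sample=500):
    """Pick up to max_polls polls for one state, newest first."""
    # Keep only polls completed inside the window (cutoff included: an assumption).
    in_window = [p for p in polls if window_start <= p.end_date <= cutoff]
    newest_first = sorted(in_window, key=lambda p: p.end_date, reverse=True)

    # Assumption: the most recent PPP poll in the window is always kept,
    # even if it would fail the likely-voter or sample-size filters.
    ppp = [p for p in newest_first if p.pollster == "PPP"][:1]

    # Remaining slots go to likely-voter polls with a sample above min_sample.
    others = [
        p for p in newest_first
        if p.pollster != "PPP"
        and p.population == "LV"
        and p.sample_size > min_sample
    ]
    return ppp + others[: max_polls - len(ppp)]
```

For 2012, the call would look like `select_polls(state_polls, date(2012, 5, 1), date(2012, 7, 15))`, where `state_polls` is whatever list of polls was collected for one state.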

For the 2012 election, there were only 15 states where the difference between Obama and Romney was less than 10 percentage points--I considered only those 15 states in this analysis. As can be seen from the table, in 2012, Obama won 11 of those 15 states, and Romney won only 4. The 3rd column, labelled "PPP Poll: Date," gives the date of the listed PPP poll, followed by the results for Obama and Romney and the difference between them. If Obama polled higher, the "Obama-Romney Difference" column is blue. If Romney polled higher, it's red. To the right of that is the second poll used, with the date the poll was completed and its results. The rightmost section holds the 3rd poll and its results. Blanks mean there were not enough polls meeting my criteria to fill the list, so some states (Nevada and Minnesota) have only the PPP poll, and two others (Missouri and Georgia) have only 2 polls.

Using this methodology, looking at the 3 polls for each of these 15 states, whenever at least two polls agreed on a winner, they did in fact correctly predict the winner for that state, even as early as the May/June/July polling. On its own, the pre-July 15 polling by PPP got only 2 of these 15 states wrong: Missouri and North Carolina. Both were within the margin of error, so they were statistical ties (for simplicity of presentation, I did not include margin of error in the table). Using this as a guide, I propose that this methodology is reasonably useful for predicting the 2016 election.
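The "at least two polls agree" rule is simple enough to write down as a small function. This is a minimal sketch: the name `predict_winner` and the convention of passing each poll as a Democrat-minus-Republican difference are my own choices for illustration.

```python
def predict_winner(margins):
    """Call a state from poll margins (Democrat minus Republican, in points).

    Returns "D" or "R" when at least two polls agree on the leader and they
    outnumber the polls pointing the other way; otherwise returns None.
    """
    dem_leads = sum(1 for m in margins if m > 0)
    rep_leads = sum(1 for m in margins if m < 0)
    if dem_leads >= 2 and dem_leads > rep_leads:
        return "D"
    if rep_leads >= 2 and rep_leads > dem_leads:
        return "R"
    # Exact ties count for neither side; states with a single poll
    # cannot meet the two-poll threshold and get no call.
    return None
```

Under this rule a state with only one qualifying poll gets no call at all, which is why thin polling leaves some states undecided.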

I followed the same methodology described above to collect the 2016 polling data available so far from Real Clear Politics. There aren't nearly as many state-level polls this year as in 2012. This could partly be because my 2012 method allowed polls up through July 15, and many of the above polls were, in fact, from early July--I am currently writing this on July 5th, which may explain the relative lack of polls. The table below shows the results.

To win the presidency, a candidate must reach 270 electoral votes. If we assume that Clinton will win all of the 15 states (and DC) that Obama won by more than 10 points, that gives her 191 electoral votes. If we assume that Trump will win all of the 20 states that Romney won by more than 10 points, that gives him 154 electoral votes. If we then look at the 2016 polling data, and award any state where at least 2 polls agree on a candidate to that candidate, then so far one state goes to Trump (GA) and 4 states go to Clinton (OH, NH, IA, and WI). That puts Clinton at 229 electoral votes, 41 short of what she needs, and Trump at 170, 100 short of what he needs.
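The arithmetic behind those totals can be checked with a short script. The 191 and 154 starting points are the totals given above. The per-state electoral votes are the 2012/2016 apportionment figures, which I have added here since they do not appear in the tables, and the variable names are just for illustration.

```python
# Electoral votes under the 2012/2016 apportionment (added for this check).
EV = {"GA": 16, "OH": 18, "NH": 4, "IA": 6, "WI": 10}
TO_WIN = 270

clinton_safe = 191  # states (and DC) Obama won by more than 10 points
trump_safe = 154    # states Romney won by more than 10 points

# States where at least 2 of the 2016 polls agree on a leader.
clinton = clinton_safe + sum(EV[s] for s in ("OH", "NH", "IA", "WI"))
trump = trump_safe + EV["GA"]

print(clinton, TO_WIN - clinton)  # 229 41
print(trump, TO_WIN - trump)      # 170 100
```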

Let's assume that MO & AZ go to Trump (Obama lost those by more than 9% in 2012), and MI & MN go to Clinton (Obama won those by more than 7.5%). That puts Clinton at 255 and Trump at 191. In this scenario, the only real "battleground states" left are NC, FL, VA, CO, PA & NV. If Clinton wins either PA or FL, and Trump wins all of the other states, then Clinton still wins the election. Likewise, if Clinton wins VA+NV or VA+CO, she wins the election. As of July 5, Nate Silver's FiveThirtyEight is predicting that Clinton will win every single one of those states (MI, MN, NV, PA, CO, VA, FL & NC), with Trump winning only MO & AZ.
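Each of these paths can be verified the same way. The sketch below starts from the 255 and 191 totals in this scenario and takes the six remaining states to be NC, FL, VA, CO, PA and NV (NV is part of the VA+NV path). The electoral vote counts are the 2012/2016 apportionment figures, and the function names are made up for the example.

```python
# Remaining battleground states and their electoral votes (2012/2016 apportionment).
EV = {"NC": 15, "FL": 29, "VA": 13, "CO": 9, "PA": 20, "NV": 6}
TO_WIN = 270
CLINTON_BASE = 255  # after adding MI and MN
TRUMP_BASE = 191    # after adding MO and AZ

# Sanity check: the two bases plus the open states cover all 538 votes.
assert CLINTON_BASE + TRUMP_BASE + sum(EV.values()) == 538


def clinton_total(states_won):
    """Clinton's electoral votes if she wins exactly these open states."""
    return CLINTON_BASE + sum(EV[s] for s in states_won)


def clinton_wins(states_won):
    return clinton_total(states_won) >= TO_WIN


for path in (["PA"], ["FL"], ["VA", "NV"], ["VA", "CO"]):
    print("+".join(path), clinton_total(path), clinton_wins(path))
# PA 275 True
# FL 284 True
# VA+NV 274 True
# VA+CO 277 True
```

The same `clinton_wins` helper can be pointed at any other combination of the six states to test further scenarios.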

What surprises me is that this runs against an earlier analysis of mine, in which I showed that since WWII, the US likes to switch its presidential parties every 8 years, with the only exceptions being the long Reagan-Bush GOP tenure and the short Carter Democratic tenure. I also noted that those years had unique economic situations--unusually high/low GDP growth and unemployment rates--that helped to explain these departures from typical election patterns. In our present case, pundits are telling us that dramatic demographic shifts are giving Democrats an advantage this year. But regardless, as we have seen with the success of the Trump candidacy, it is dangerous to predict anything political this year.


Correction: In the first version of this post, I had the incorrect values for the Georgia polls, showing that Clinton was predicted to win there. This has been fixed, and the relevant analysis corrected.
