Did data science predict the US election outcome?

As the drama of the US presidential race winds down for another four years, the electoral forecasts have kept updating as results trickle in from the last of the states. Our Principal Data Scientist, Dr James McKeone, shares his thoughts on the subject.

There have been two notable forecast models in the 2020 election race: the FiveThirtyEight 2020 election forecast led by Nate Silver, which gave Biden an 89% probability of winning, and The Economist’s US presidential forecast, built by the modelling team led by Andrew Gelman and Elliot Morris, which put Biden’s probability of winning at 97%.

Having never followed a US election before, and not realising all that comes into play with the Electoral College votes, the state-by-state differences in practice and the media circus, the outcome didn’t feel so certain for Joe Biden as the votes came in.

As there are only two possible outcomes to the US presidential race, it’s much easier to apply a pass/fail mark to a forecast: on the balance of probabilities it either predicted the correct result or it didn’t. That kind of scoring isn’t so obvious for other predictions we rely on widely, such as weather or economic forecasts. For anyone seeking a lesson in interpreting probability and understanding uncertainty in forecasts, Andrew Gelman’s blog is a masterclass in forecast calibration and statistical reasoning from the Bayesian perspective.
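
To make ‘calibration’ a little more concrete, here is a minimal sketch, not taken from either forecasting team and using made-up placeholder numbers, of the basic idea: bin the predicted probabilities and compare each bin’s average forecast with the observed frequency of the event.

```python
# Minimal calibration check (illustrative only): bin predicted win
# probabilities and compare each bin's average forecast with the observed
# frequency of wins. Well-calibrated forecasts sit close to the diagonal.
import numpy as np

predicted = np.array([0.92, 0.89, 0.75, 0.60, 0.55, 0.30, 0.15, 0.08])  # forecast P(win)
observed = np.array([1, 1, 1, 0, 1, 0, 0, 0])                           # actual outcomes

bins = np.linspace(0.0, 1.0, 6)             # five equal-width probability bins
bin_index = np.digitize(predicted, bins) - 1

for b in range(len(bins) - 1):
    mask = bin_index == b
    if mask.any():
        print(f"forecast {predicted[mask].mean():.2f} "
              f"vs observed {observed[mask].mean():.2f} (n={mask.sum()})")
```

The catch, of course, is that a single presidential election provides only one realised outcome per forecast, which is part of why grading these models is so contentious.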

It’s interesting to see the post-election analysis from both the FiveThirtyEight and The Economist teams now that the majority of the results have come in. For instance, it seems that, as was the case in 2016, the pre-election polling data was potentially biased: see the discussion of the ‘messed up polls’, Biden’s predicted win and a post-election update from Gelman’s team, together with comments on their final election update, Biden’s victory and the exit polls from Silver’s team.

This questioning of results, made even before the dust has settled on vote counts for all states, is a critical part of true forecast modelling that is so often glossed over in industry applications of data science: the model is not only tested out-of-sample but assessed from its most fundamental elements:

  • Model specification
  • Data input
  • Interpretability by the end-user

In my experience, the careful review of these three areas is what separates an out-of-the-box, elementary model built to get an answer from a model built to be built upon: a living forecast model that is built and re-built, torn down and re-fit, with results presented in ways that the user cares about and understands.
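
As a rough illustration of that out-of-sample assessment, the sketch below, my own assumed workflow with illustrative numbers rather than real forecast data, scores held-out forecasts with the Brier score, a standard measure of probabilistic forecast accuracy.

```python
# Score a probabilistic forecast out-of-sample with the Brier score
# (mean squared error between forecast probabilities and 0/1 outcomes).
import numpy as np

def brier_score(prob, outcome):
    """Lower is better; 0 is a perfect forecast."""
    prob = np.asarray(prob, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    return float(np.mean((prob - outcome) ** 2))

# Illustrative placeholder forecasts and results for a handful of states.
forecast_prob = [0.89, 0.97, 0.65, 0.40, 0.10]
actual_result = [1, 1, 1, 0, 0]

print(f"Brier score: {brier_score(forecast_prob, actual_result):.3f}")
```

A forecast that says 0.5 for every race scores 0.25, so a living model that is re-fit as new data arrives should beat that benchmark, with the gap reported in terms the end-user understands.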
