Did data science predict the US election outcome?

As the drama of the US presidential race winds down for another four years, the electoral forecasts have kept updating as results trickle in from the last of the states. Our Principal Data Scientist, Dr James McKeone, shares his thoughts on the subject.

There have been two notable forecast models in the 2020 election race: the FiveThirtyEight 2020 election forecast led by Nate Silver, which gave Biden an 89% probability of winning, and The Economist’s US presidential forecast, built by the modelling team led by Andrew Gelman and Elliot Morris, which put Biden’s probability of winning at 97%.

Having never followed a US election before, and not realising all that comes into play with the Electoral College votes, the state-by-state differences in practice and the media circus, the outcome didn’t feel so certain for Joe Biden as the votes came in.

As there are only two possible outcomes to the US presidential race, it’s much easier to apply a pass/fail mark to a forecast: on the balance of probabilities it either predicted the correct result or it didn’t. That kind of scoring isn’t so obvious for other predictions we rely on widely, such as weather or economic forecasts. For anyone seeking a lesson in interpreting probability and understanding uncertainty in forecasts, Andrew Gelman’s blog is a masterclass in forecast calibration and statistical reasoning from the Bayesian perspective.
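
To make ‘calibration’ a little more concrete, here is a minimal sketch, not taken from either forecasting team and using made-up placeholder numbers, of the basic idea: bin the predicted probabilities and compare each bin’s average forecast with the observed frequency of the event.

```python
# Minimal calibration check (illustrative only): bin predicted win
# probabilities and compare each bin's average forecast with the observed
# frequency of wins. Well-calibrated forecasts sit close to the diagonal.
import numpy as np

predicted = np.array([0.92, 0.89, 0.75, 0.60, 0.55, 0.30, 0.15, 0.08])  # forecast P(win)
observed = np.array([1, 1, 1, 0, 1, 0, 0, 0])                           # actual outcomes

bins = np.linspace(0.0, 1.0, 6)             # five equal-width probability bins
bin_index = np.digitize(predicted, bins) - 1

for b in range(len(bins) - 1):
    mask = bin_index == b
    if mask.any():
        print(f"forecast {predicted[mask].mean():.2f} "
              f"vs observed {observed[mask].mean():.2f} (n={mask.sum()})")
```

The catch, of course, is that a single presidential election provides only one realised outcome per forecast, which is part of why grading these models is so contentious.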

It’s interesting to see the post-election analysis from both the FiveThirtyEight and The Economist teams now that the majority of the results have come in. For instance, it seems that, as was the case in 2016, the pre-election polling data was potentially biased: see the discussion of the ‘messed up polls’, Biden’s predicted win and a post-election update from Gelman’s team, together with comments on their final election update, Biden’s victory and the exit polls from Silver’s team.

This questioning of results, made even before the dust has settled on vote counts for all states, is a critical part of true forecast modelling that is so often glossed over in industry applications of data science: the model is not only tested out-of-sample but assessed from its most fundamental elements:

  • Model specification
  • Data input
  • Interpretability by the end-user

In my experience, the careful review of these three areas is what separates an out-of-the-box, elementary model built to get an answer from a model built to be built upon: a living forecast model that is built and re-built, torn down and re-fit, with results presented in ways that the user cares about and understands.
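
As a rough illustration of that out-of-sample assessment, the sketch below, my own assumed workflow with illustrative numbers rather than real forecast data, scores held-out forecasts with the Brier score, a standard measure of probabilistic forecast accuracy.

```python
# Score a probabilistic forecast out-of-sample with the Brier score
# (mean squared error between forecast probabilities and 0/1 outcomes).
import numpy as np

def brier_score(prob, outcome):
    """Lower is better; 0 is a perfect forecast."""
    prob = np.asarray(prob, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    return float(np.mean((prob - outcome) ** 2))

# Illustrative placeholder forecasts and results for a handful of states.
forecast_prob = [0.89, 0.97, 0.65, 0.40, 0.10]
actual_result = [1, 1, 1, 0, 0]

print(f"Brier score: {brier_score(forecast_prob, actual_result):.3f}")
```

A forecast that says 0.5 for every race scores 0.25, so a living model that is re-fit as new data arrives should beat that benchmark, with the gap reported in terms the end-user understands.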
