Innovations in poll aggregation and election forecasting

Leveraging more poll-level information and the fundamentals

G. Elliott Morris
Data journalist
The Economist

October 19, 2020
Prepared for a guest lecture to Charles Stewart’s class, MIT

1 / 22

2020 presidential election forecast*



*as of October 26 at 12:38 PM

2 / 22

Our model

1. National economic + political fundamentals

2. Decompose into state-level priors

3. Polls

Uncertainty is propagated through every stage of the model and incorporated via MCMC sampling in step 3.

3 / 22

National Fundamentals

4 / 22

What fundamentals?

i) Index of economic growth (1940 - 2016)

  • eight different variables, each scaled to measure standard deviations from average annual growth

ii) Presidential approval (1948 - 2016)

iii) Polarization (1948 - 2016)

  • measured as the share of swing voters in the electorate, per the ANES, and interacted with economic growth

iv) Whether an incumbent is on the ballot
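
As a rough illustration of the scaling described for the economic index, the sketch below converts each indicator to standard deviations from its own average and then averages across indicators. The averaging step, the function names, and the toy numbers are assumptions made for illustration; the slides do not say how the eight variables are combined.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Express each value as standard deviations from the series average."""
    mu = mean(values)
    sd = pstdev(values)
    return [(v - mu) / sd for v in values]


def growth_index(series_by_variable):
    """Average the per-variable z-scores year by year (assumed aggregation)."""
    standardized = [z_scores(series) for series in series_by_variable.values()]
    return [mean(year_values) for year_values in zip(*standardized)]


# Made-up annual growth rates for two hypothetical indicators
toy = {
    "real_income_growth": [1.0, 2.0, 3.0],
    "jobs_growth": [0.0, 2.0, 4.0],
}
index = growth_index(toy)
```

Because every indicator is put on the same standard-deviation scale first, variables measured in different units contribute equally to the index.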

5 / 22


National fundamentals

Model formula:

vote ~ incumbent_running:economic_growth:polarization + approval

Training

Model trained on 1948-2016 using elastic net regression with leave-one-out cross-validation

RMSE = 2.6 percentage points on two-party Democratic vote share
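
To make the leave-one-out idea concrete, here is a minimal sketch of how an out-of-sample RMSE can be computed: each election is held out in turn, the model is refit on the rest, and the held-out error is recorded. It swaps in a one-predictor least-squares line for the elastic net, and the function names and data are hypothetical, so it shows the validation loop only, not the model on the slide.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def loocv_rmse(xs, ys):
    """Hold out each observation once, refit, and score the held-out point."""
    squared_errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        intercept, slope = fit_line(train_x, train_y)
        prediction = intercept + slope * xs[i]
        squared_errors.append((ys[i] - prediction) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up net approval ratings and incumbent-party vote shares
approval = [-20.0, -5.0, 0.0, 10.0, 25.0]
vote = [46.0, 49.5, 50.0, 53.0, 55.0]
rmse = loocv_rmse(approval, vote)
```

With only a handful of elections to learn from, scoring each one with a model that never saw it gives a more honest error estimate than the in-sample fit.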

9 / 22

State-level prior

10 / 22

State-level prior

i) Train a model to predict the Democratic share of the vote in a state relative to the national vote, 1948-2016

  • Variables are: lean in the last election, lean two elections ago, home state effects * state size, conditional on the national vote in the state

ii) Use the covariates to make predictions for 2020, conditional on the national fundamentals prediction for every day

iii) Simulate state-level outcomes to extract a mean and standard deviation

  • Propagates uncertainty from both the LOOCV RMSE of the national model and the state-level model
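
The simulation step can be sketched as follows, assuming independent normal errors added on the vote-share scale. The 0.026 national error matches the 2.6-point RMSE quoted earlier; the state lean, the state-level error, and the function name are hypothetical values chosen for illustration.

```python
import random
from statistics import mean, stdev


def simulate_state_prior(national_mean, national_rmse, state_lean, state_rmse,
                         n_sims=20000, seed=1):
    """Draw a national vote share, add a state lean draw, and summarize."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_sims):
        national = rng.gauss(national_mean, national_rmse)  # national model error
        lean = rng.gauss(state_lean, state_rmse)            # state model error
        draws.append(national + lean)
    return mean(draws), stdev(draws)


# Hypothetical state that leans 3 points away from a 52% national prediction
prior_mean, prior_sd = simulate_state_prior(0.52, 0.026, -0.03, 0.03)
```

Under these assumptions the resulting standard deviation is close to sqrt(0.026^2 + 0.03^2), which is how both sources of error end up in the state-level prior.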
11 / 22

Pooling the polls

12 / 22

It's just a trend through points...

13 / 22

(...but with some fancy extra stuff)

// Latent state-level vote on election day T: prior plus correlated noise
mu_b[:, T] = cholesky_ss_cov_mu_b_T * raw_mu_b_T + mu_b_prior;
// Random walk running backward in time from election day
for (i in 1:(T - 1))
  mu_b[:, T - i] = cholesky_ss_cov_mu_b_walk * raw_mu_b[:, T - i] + mu_b[:, T + 1 - i];
national_mu_b_average = transpose(mu_b) * state_weights;
// Scaled adjustments for pollster (mu_c), mode (mu_m), and population (mu_pop)
mu_c = raw_mu_c * sigma_c;
mu_m = raw_mu_m * sigma_m;
mu_pop = raw_mu_pop * sigma_pop;
// AR(1) bias term, applied below to polls flagged unadjusted_state
e_bias[1] = raw_e_bias[1] * sigma_e_bias;
sigma_rho = sqrt(1 - square(rho_e_bias)) * sigma_e_bias;
for (t in 2:T)
  e_bias[t] = mu_e_bias + rho_e_bias * (e_bias[t - 1] - mu_e_bias) + raw_e_bias[t] * sigma_rho;
//*** fill pi_democrat
for (i in 1:N_state_polls) {
  logit_pi_democrat_state[i] =
    mu_b[state[i], day_state[i]] +
    mu_c[poll_state[i]] +
    mu_m[poll_mode_state[i]] +
    mu_pop[poll_pop_state[i]] +
    unadjusted_state[i] * e_bias[day_state[i]] +
    raw_measure_noise_state[i] * sigma_measure_noise_state +
    polling_bias[state[i]];
}
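
One detail in the snippet above is worth unpacking: the innovation scale `sigma_rho` is set to `sqrt(1 - rho^2) * sigma` so that the AR(1) term `e_bias` keeps the same marginal standard deviation on every day. The Python sketch below checks this with the variance recursion, assuming the `raw_e_bias` draws are independent standard normals (their priors are not shown on the slide); the function name and the example values are hypothetical.

```python
import math


def ar1_marginal_sds(rho, sigma, n_days):
    """Marginal sd of an AR(1) term on each day, via the variance recursion."""
    sigma_rho = math.sqrt(1 - rho ** 2) * sigma  # innovation sd, as in the Stan code
    variances = [sigma ** 2]                     # day 1: raw draw scaled by sigma
    for _ in range(2, n_days + 1):
        # Var[e_t] = rho^2 * Var[e_{t-1}] + sigma_rho^2
        variances.append(rho ** 2 * variances[-1] + sigma_rho ** 2)
    return [math.sqrt(v) for v in variances]


# Hypothetical persistence and scale
sds = ar1_marginal_sds(rho=0.7, sigma=0.05, n_days=100)
```

Every entry of `sds` equals `sigma` up to floating-point error, so the size of the bias term does not drift as the campaign goes on.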
14 / 22

Poll-level model

i. Latent state-level vote shares evolve as a random walk over time

  • Pooling toward the state-level fundamentals more as we are further out from election day

ii. Polls are observations with measurement error that are debiased on the basis of:

  • Pollster firm (so-called "house effects")
  • Poll mode
  • Poll population

iii. Correcting for partisan non-response

  • Whether a pollster weights by party registration or past vote
  • Incorporated as a residual AR process
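
Read together, the three parts describe an additive model on the logit scale: a poll's expected Democratic share is the latent state value shifted by pollster, mode, and population adjustments, plus a non-response term for polls that do not weight by party or past vote. The sketch below is a simplified, hypothetical rendering of that sum; it leaves out measurement noise and the state-level polling bias, and all argument names and values are illustrative.

```python
import math


def inv_logit(x):
    """Map a logit-scale value back to a share between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def expected_poll_share(latent_logit, house=0.0, mode=0.0, population=0.0,
                        nonresponse=0.0, unadjusted=False):
    """Expected Democratic share in one poll under the additive logit model."""
    logit_share = latent_logit + house + mode + population
    if unadjusted:
        # Only polls without party or past-vote weighting get this term
        logit_share += nonresponse
    return inv_logit(logit_share)


# Hypothetical poll: true state value at 50%, pollster leans Democratic
share = expected_poll_share(0.0, house=0.08, nonresponse=0.05, unadjusted=True)
```

Working in the other direction, fitting the model amounts to estimating these shifts from many polls and subtracting them to recover the latent state value.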
15 / 22

Debiased predictions

Notable improvements from correcting for partisan non-response and other weighting issues

16 / 22


Tying it all together

18 / 22

Tying it all together

1. 2016 election-day forecast:

19 / 22

Tying it all together

2. 2020 forecast*:

*As of October 26th, 2020

20 / 22

Q&A

21 / 22

Thank you!



Website: gelliottmorris.com

Email: elliott@thecrosstab.com

Twitter: @gelliottmorris




These slides were made with the xaringan package for R from Yihui Xie. They are available online at https://www.gelliottmorris.com/slides/2020-10-26-mit/

22 / 22
