class: left, top, title-slide

# Statistical Models of Presidential Elections

## Bayesian inference using polls and the fundamentals

### G. Elliott Morris
Data journalist
The Economist
### October 9, 2020
Prepared for a talk at the IU-Bloomington Workshop on Methods

---

# 2020 presidential election forecast*

<br>

<img src="figures/economist_forecast_headline.png" width="100%" />

<br>

_*as of October 8 at 7:05 PM_

---

# Our model

### 1. National economic + political fundamentals

### 2. Decompose into state-level priors

### 3. Polls

Uncertainty is propagated throughout the models, incorporated via MCMC sampling in step 3.

---

class: center, inverse, middle

# National Fundamentals

---

# What fundamentals?

#### i) Index of economic growth (1940 - 2016)

- eight variables, each scaled to standard deviations from average annual growth

#### ii) Presidential approval (1948 - 2016)

#### iii) Polarization (1948 - 2016)

- measured as the share of swing voters in the electorate, per the ANES, and interacted with economic growth

#### iv) Whether an incumbent is on the ballot

---

<img src="figures/fundamentals_economy.png" width="80%" />

---

<img src="figures/fundamental_approval.png" width="80%" />

---

<img src="figures/fundamentals_with_incumbency.png" width="100%" />

---

# National fundamentals

### Model formula:

`vote ~ incumbent_running:economic_growth:polarization + approval`

### Training

Model trained on 1948-2016 using elastic net regression with leave-one-out cross-validation

<img src="figures/train-validate-test.png" width="90%" />

RMSE = 2.6 percentage points on two-party Democratic vote share

---

class: center, inverse, middle

# State-level prior

---

# State-level prior

#### i) Train a model to predict the Democratic share of the vote in a state relative to the national vote, 1948-2016

* Variables are: lean in the last election, lean two elections ago, home state effects * state size, conditional on the national vote in the state

#### ii) Use the covariates to make predictions for 2020, _conditional on the national fundamentals prediction for every day_

#### iii) Simulate state-level outcomes to extract a mean and standard deviation

* Propagates uncertainty from the LOOCV RMSE of both the national model and the state-level model

---

class: center, inverse, middle

# Pooling the polls

---

# It's just a trend through points...

<img src="figures/state_space_model.png" width="90%" />

---

# (...but with some fancy extra stuff)

```
// latent state-level trend (logit scale): anchor on election day T, then walk backward
mu_b[:,T] = cholesky_ss_cov_mu_b_T * raw_mu_b_T + mu_b_prior;
for (i in 1:(T-1))
  mu_b[:, T - i] = cholesky_ss_cov_mu_b_walk * raw_mu_b[:, T - i] + mu_b[:, T + 1 - i];
national_mu_b_average = transpose(mu_b) * state_weights;

// poll-level adjustment terms, scaled from standardized draws
mu_c = raw_mu_c * sigma_c;
mu_m = raw_mu_m * sigma_m;
mu_pop = raw_mu_pop * sigma_pop;

// autoregressive bias term applied to unadjusted polls
e_bias[1] = raw_e_bias[1] * sigma_e_bias;
sigma_rho = sqrt(1-square(rho_e_bias)) * sigma_e_bias;
for (t in 2:T)
  e_bias[t] = mu_e_bias + rho_e_bias * (e_bias[t - 1] - mu_e_bias) + raw_e_bias[t] * sigma_rho;

//*** fill pi_democrat
for (i in 1:N_state_polls){
  logit_pi_democrat_state[i] =
    mu_b[state[i], day_state[i]] +
    mu_c[poll_state[i]] +
    mu_m[poll_mode_state[i]] +
    mu_pop[poll_pop_state[i]] +
    unadjusted_state[i] * e_bias[day_state[i]] +
    raw_measure_noise_state[i] * sigma_measure_noise_state +
    polling_bias[state[i]];
}
```

---

# Poll-level model

--

#### i. Latent state-level vote shares evolve as a random walk over time

* Pooled more strongly toward the state-level fundamentals the further we are from election day

--

#### ii. Polls are observations with measurement error that are debiased on the basis of:

* Pollster firm (so-called "house effects")
* Poll mode
* Poll population

--

#### iii. Correcting for partisan non-response

* Whether a pollster weights by party registration or past vote
* Incorporated as a residual AR process

---

class: center, inverse, middle

# Tying it all together

---

# Tying it all together

## 1. 2016 election-day forecast:

<img src="figures/trends_2016.png" width="80%" />

---

# Tying it all together

## 2. 2020 forecast*:

<img src="figures/trends_2020.png" width="80%" />

_*As of October 9th, 2020_

---

class: center, inverse, middle

# Q&A

---

class: center

# Thank you!

<br>
<br>

#### Website: [gelliottmorris.com](https://www.gelliottmorris.com)

#### Email: [elliott@thecrosstab.com](mailto:elliott@thecrosstab.com)

#### Twitter: [@gelliottmorris](http://www.twitter.com/gelliottmorris)

<br>
<br>

<hr>

_These slides were made with the `xaringan` package for R from Yihui Xie. They are available online at https://www.gelliottmorris.com/slides/2020-10-09-indiana-bloomington-methods-workshop/_
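---

# Appendix: a sketch of the latent trend

The reverse random walk and the autoregressive bias term in the Stan snippet on the "fancy extra stuff" slide can be sketched in NumPy. This is a simplified illustration, not the model itself: it follows a single state, so the Cholesky factors become scalar standard deviations, and the function names and all numeric values (`sigma_T`, `sigma_walk`, `rho`, and so on) are assumptions chosen for the example.

```python
import numpy as np


def backward_random_walk(mu_prior, sigma_T, sigma_walk, T, rng):
    """One-state analogue of the mu_b recursion: anchor on the last day, walk backward.

    All arguments are hypothetical stand-ins for the Stan quantities.
    """
    raw = rng.standard_normal(T)  # standardized draws, like raw_mu_b
    mu = np.empty(T)
    mu[T - 1] = mu_prior + sigma_T * raw[T - 1]  # election-day value around the prior
    for i in range(1, T):
        # each earlier day is the following day plus a random-walk step
        mu[T - 1 - i] = mu[T - i] + sigma_walk * raw[T - 1 - i]
    return mu


def ar1_bias(mu_e, rho, sigma_e, T, rng):
    """AR(1) series built the same way as e_bias in the Stan snippet."""
    raw = rng.standard_normal(T)
    sigma_rho = np.sqrt(1.0 - rho ** 2) * sigma_e  # innovation scale
    e = np.empty(T)
    e[0] = raw[0] * sigma_e
    for t in range(1, T):
        e[t] = mu_e + rho * (e[t - 1] - mu_e) + raw[t] * sigma_rho
    return e


rng = np.random.default_rng(0)
T = 100  # assumed number of days in the campaign window

# Simulate many prior paths to see how the spread changes over time.
draws = np.array(
    [backward_random_walk(0.1, 0.05, 0.01, T, rng) for _ in range(2000)]
)
sd_by_day = draws.std(axis=0)
bias_path = ar1_bias(0.0, 0.7, 0.04, T, rng)

print("prior sd on first day:   ", round(float(sd_by_day[0]), 3))
print("prior sd on election day:", round(float(sd_by_day[-1]), 3))
```

In this sketch the prior spread is smallest on election day and widens with each step backward, because every earlier day adds one more random-walk increment to the election-day draw.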