class: left, top, title-slide

# The Soup Principle

## How election forecasting models work (and why they don’t)
### G. Elliott Morris | December 3, 2021

---

<img src="figures/cover.jpg" width="50%" />

---

<img src="figures/tomato.jpeg" width="80%" />

---

# Why polls? Why forecasting?

--

### 1. Journalism

--

### 1a. Attention

--

### 1b. Truth

--

### 2. Methods training

--

### 3. It's fun!

---

# So let's talk about...

--

### 1. How polls work

--

### 2. How forecasts work

--

### 3. And why they fail

---

class: center, inverse, middle

# The soup principle and the polls

---

# The first polls

<img src="figures/polls_street.jpg" width="80%" />

Straw polls -> real polls

---

# The first ("scientific") polls

### - Conducted face-to-face

--

### - Used demographic quotas for representativeness
- Race, gender, age, geography, class

--

### - Beat straw polls in accuracy (1936)
- By shrinking bias from nonresponse

--

### - But fell short of true survey science (1948)

---

# Polls 2.0

### - SSRC says: area sampling

--

<img src="figures/houston.png" width="60%" />

---

# Polls 2.0

### - SSRC says: area sampling

### - Gallup implements some partisan controls
- Strata are groups of precincts by 1948 vote choice

--

### - Use rough quotas within geography

--

### - Preserve interviewer bias

---

# Polls 3.0

<img src="figures/phone.jpeg" width="70%" />

--

### Technological change -> better methods

---

# Polls 3.0

### - True random sampling (for people with phones)

### - Response rates above 70 or 80%

### - Rarer instances of severe nonresponse bias

### - Cheaper to conduct = news orgs poll (CBS, NYT)

---

### Technological change -> worse methods?

<img src="figures/pew_response_rate.jpg" width="60%" />

---

<img src="figures/tomato.jpeg" width="80%" />

---

## The soup principle (in theory)

<img src="figures/pew_soup.png" width="100%" />

_Image credit: Pew Research Center_

---

.center[
## But what if the people you sample don't represent the population?
]

--

#### - People could be very dissimilar by group, meaning small deviations in sample demographics cause big errors (sampling error)

--

#### - Or the people who respond to the poll could be systematically different from the people who don't (response error)

--

#### - Or your list of potential respondents could be missing people (coverage error)

--

*Polls can also go wrong if they have bad question wording, a fourth type of survey error called "measurement error"
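---

## Sampling error vs. response error, simulated

To make the distinction concrete, here is a minimal R sketch (not from the talk; the population, vote split, and response rates are invented). Sampling error shrinks as the sample grows; response error does not.

```
set.seed(42)

# Invented population of 1 million voters, 52% support the Democrat
pop <- rbinom(1e6, 1, 0.52)

poll_estimate <- function(n, biased = FALSE) {
  if (biased) {
    # Response error: supporters respond at twice the rate of opponents
    responds <- runif(length(pop)) < ifelse(pop == 1, 0.10, 0.05)
    mean(sample(pop[responds], n))
  } else {
    # True random sample: only sampling error remains
    mean(sample(pop, n))
  }
}

for (n in c(500, 5000, 50000)) {
  cat("n =", n,
      "| random:", round(poll_estimate(n), 3),
      "| biased:", round(poll_estimate(n, biased = TRUE), 3), "\n")
}
```

The random poll converges on 0.52 as n grows; the biased poll stays near 0.68 no matter how large it gets, because the bias is baked into who responds.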
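---

# Option A, sketched

A minimal raking sketch with the `survey` package. This is illustrative only: the poll data and the population margins below are made up, and production polls rake on more variables.

```
library(survey)

# Hypothetical unweighted poll that over-samples men
poll <- data.frame(
  gender = sample(c("M", "F"), 1000, replace = TRUE, prob = c(0.60, 0.40)),
  educ   = sample(c("college", "no_college"), 1000, replace = TRUE)
)

des <- svydesign(ids = ~1, weights = rep(1, nrow(poll)), data = poll)

# Invented population margins to rake toward (totals must match)
gender_pop <- data.frame(gender = c("M", "F"), Freq = c(480, 520))
educ_pop   <- data.frame(educ = c("college", "no_college"), Freq = c(350, 650))

raked <- rake(des,
              sample.margins     = list(~gender, ~educ),
              population.margins = list(gender_pop, educ_pop))

summary(weights(raked))  # weighted sample now matches both margins
```

Raking iterates between the margins until the weighted sample matches all of them at once, which is why it is also called iterative proportional fitting.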
---

# Option B: Modeling

<img src="figures/mrp.jpg" width="100%" />

---

class: center, middle

# But the traditional adjustments aren't enough...

---

# 2016: Education weighting

<img src="figures/weighting_education.jpg" width="100%" />

---

# 2020: Partisan nonresponse

<img src="figures/gq_rs_polls.jpg" width="90%" />

---

class: center, middle

# But the traditional adjustments aren't enough...

--

- Race, age, gender, and region are most common

--

- Education, interactions, partisanship are harder, but increasingly necessary

---

# The future of polling?

### 1. More weighting variables

### 2. More online and off-phone data collection (SMS, mail)

### 3. Mixed samples

#### All in the pursuit of getting representative (and politically balanced) samples _before_ the adjustment stage

---

class: center

# Big lesson 1:

--

## Violating the soup principle = unrepresentative polls

--

# Big lesson 2:

--

## ... so what does violating it do to election forecasting models?

---

class: center, inverse, middle

# The soup principle and election forecasts

---

# 2020 presidential election forecast

<img src="figures/economist_forecast_headline.png" width="100%" />

---

# What goes into the model?

### 1. National economic + political fundamentals

### 2. Decompose into state-level priors

### 3. Polls

Uncertainty is propagated throughout the models, incorporated via MCMC sampling in step 3.

---

# National fundamentals?

#### i) Index of economic growth (1940-2016)
- eight different variables, each scaled to measure standard deviations from average annual growth

#### ii) Presidential approval (1948-2016)

#### iii) Polarization (1948-2016)
- measured as the share of swing voters in the electorate, per the ANES, and interacted with economic growth

#### iv) Whether an incumbent is on the ballot

---

<img src="figures/fundamentals_economy.png" width="80%" />

---

<img src="figures/fundamental_approval.png" width="80%" />

---

<img src="figures/fundamentals_with_incumbency.png" width="100%" />

---

# Modeling the fundamentals

### Model formula:

`vote ~ incumbent_running:economic_growth:polarization + approval`

### Training

Model trained on 1948-2016 using elastic net regression with leave-one-out cross-validation

<img src="figures/train-validate-test.png" width="90%" />

RMSE = 2.6 percentage points on two-party Democratic vote share
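---

# Modeling the fundamentals, sketched

A minimal sketch of that training setup using `glmnet`: elastic net, with leave-one-out cross-validation obtained by setting the number of folds to the number of elections. The data below is a random stand-in, and the real model interacts incumbency, growth, and polarization rather than entering them separately.

```
library(glmnet)

# Toy stand-in: 18 postwar elections, four fundamentals
set.seed(1948)
X <- matrix(rnorm(18 * 4), nrow = 18,
            dimnames = list(NULL, c("growth", "approval",
                                    "polarization", "incumbent_running")))
y <- 0.5 + 0.02 * X[, "growth"] + rnorm(18, sd = 0.02)  # two-party vote share

# alpha between 0 (ridge) and 1 (lasso) gives the elastic net penalty;
# nfolds = number of rows makes the cross-validation leave-one-out
fit <- cv.glmnet(X, y, alpha = 0.5, nfolds = nrow(X), grouped = FALSE)

sqrt(min(fit$cvm))  # LOOCV root-mean-squared error at the best lambda
```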
---

# The model is a federalist

#### i) Train a model to predict the Democratic share of the vote in a state relative to the national vote, 1948-2016

* Variables are: lean in the last election, lean two elections ago, home state effects, state size
* Conditional on the national vote in the state

#### ii) Use the covariates to make predictions for 2020, _conditional on the national fundamentals prediction for every day_

#### iii) Simulate state-level outcomes to extract a mean and standard deviation

* Propagates uncertainty both from the LOOCV RMSE of the national model and the state-level model

---

class: center, inverse, middle

# That's the baseline

--

# Now, we add the polls

---

# Just a trend through points...

This can be done with any number of packages for R or other statistical languages

--

<img src="figures/state_space_model.png" width="90%" />

---

# (...but with some fancy extra stuff)

```
// Latent Dem support by state (mu_b): anchored on election day T,
// then a correlated random walk backward through time
mu_b[:, T] = cholesky_ss_cov_mu_b_T * raw_mu_b_T + mu_b_prior;
for (i in 1:(T - 1))
  mu_b[:, T - i] = cholesky_ss_cov_mu_b_walk * raw_mu_b[:, T - i] + mu_b[:, T + 1 - i];

// National support is a population-weighted average of the states
national_mu_b_average = transpose(mu_b) * state_weights;

// Non-centered pollster ("house"), mode, and survey-population effects
mu_c = raw_mu_c * sigma_c;
mu_m = raw_mu_m * sigma_m;
mu_pop = raw_mu_pop * sigma_pop;

// Partisan nonresponse bias: AR(1) process over days, applied only
// to polls that don't adjust for partisanship
e_bias[1] = raw_e_bias[1] * sigma_e_bias;
sigma_rho = sqrt(1 - square(rho_e_bias)) * sigma_e_bias;
for (t in 2:T)
  e_bias[t] = mu_e_bias + rho_e_bias * (e_bias[t - 1] - mu_e_bias) + raw_e_bias[t] * sigma_rho;

//*** fill pi_democrat: each state poll = latent support that day, plus
// house, mode, population, nonresponse, and noise terms (logit scale)
for (i in 1:N_state_polls) {
  logit_pi_democrat_state[i] = mu_b[state[i], day_state[i]] +
    mu_c[poll_state[i]] + mu_m[poll_mode_state[i]] + mu_pop[poll_pop_state[i]] +
    unadjusted_state[i] * e_bias[day_state[i]] +
    raw_measure_noise_state[i] * sigma_measure_noise_state +
    polling_bias[state[i]];
}
```

---

# Poll-level model

--

#### i. Latent state-level vote shares evolve as a random walk over time

* Pooling toward the state-level fundamentals more as we are further out from election day

--

#### ii. Polls are observations with measurement error that are debiased on the basis of:

* Pollster firm (so-called "house effects")
* Poll mode
* Poll population

--

#### iii. Correcting for partisan nonresponse

* Whether a pollster weights by party registration or past vote
* (Incorporated as a residual AR process)

---

# Debiased predictions

#### Notable improvements from controlling for partisan nonresponse and other weighting issues

<img src="figures/states-vs-results.png" width="80%" />

---

# Debiased predictions

Notable improvements* from controlling for partisan nonresponse and other weighting issues

<img src="figures/state_briers_2016.png" width="60%" />

*In 2016, but not 2020

---

class: center, inverse, middle

# Back to the soup....

---

<img src="figures/tomato.jpeg" width="85%" />

---

<img src="figures/tomato.jpeg" width="50%" />

#### If polls are biased, the aggregation model cannot remove the biases. It can only explore them

--

#### Yet this means violation of random sampling also violates the statistical theory underpinning election models!

--

#### And means we must add ways to incorporate the extra uncertainty from bias. (Especially if partisan nonresponse is getting worse.)

---

# Some ideas...

### 1. Add extra model error for uncertainty. But how?

--

### 2. Attempt to debias polls (but that adds uncertainty too)

--

### 3. Only use polls we trust? (But how do we measure trust? Not necessarily a robust solution.)

---

### 4. Communication: Show forecasts for different scenarios of poll error

<img src="figures/conditional_forecast_error_2020.png" width="80%" />

Ultimately, we're still figuring out the answer....

---

class: center

.pull-left[
<img src="figures/tomato.jpeg" width="85%" />
]

.pull-right[
<img src="figures/minestrone.jpg" width="85%" />
]

<img src="figures/pew_soup.png" width="70%" />

---

.pull-left[
<img src="figures/cover.jpg" width="90%" />
]

.pull-right[

# Thank you!

Book comes out July 12, 2022

<br>
<br>

**Website: [gelliottmorris.com](https://www.gelliottmorris.com)**

**Twitter: [@gelliottmorris](http://www.twitter.com/gelliottmorris)**

### Questions?

]

---

_These slides were made using the `xaringan` package for R. They are available online at https://www.gelliottmorris.com/slides/_