What’s the matter with polling?

class: left, top, title-slide

# What’s the matter with polling?
## From Strength in Numbers: How Polls Work + Why We Need Them
### G. Elliott Morris | Oct 18 2022 | Pittsburgh, PA

---

# The "soup principle"
<img src="figures/tomato.jpeg" width="60%" />

---
class: center, inverse, middle

# The first polls

---

# "Straw" polls

---

# The first ("scientific") polls

### - Conducted face-to-face
--

### - Used demographic quotas for representativeness
  - Race, gender, age, geography

### - Beat straw polls in accuracy (1936) 
  - By shrinking bias from demographic nonresponse

---
# The first ("scientific") polls

### - But fell short of true survey science (1948)

---

# Polls 2.0

### - SSRC says: area sampling
--

---

# Polls 2.0

### - SSRC says: area sampling

### - Gallup implements some partisan controls
  - Strata are groups of precincts by 1948 vote choice

### - Use rough quotas within geography
--

### - But this preserves interviewer bias
--

---

# Polls 3.0

### Technological change ->  better methods

---

# Polls 3.0

### - 1970s: true random sampling (for people with phones)
### - Response rates above 70-80%
### - Rarer instances of severe nonresponse bias
### - Cheaper to conduct = many news orgs poll (CBS, NYT)
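
One way to see why true random sampling mattered: under simple random sampling, the approximate 95% margin of error for a proportion depends on the sample size, not on the size of the population. A minimal Python sketch of that textbook formula follows; the function name and the 50% support figure are illustrative assumptions, not from the talk.

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion from a simple random sample."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical sample sizes; support assumed to be 50% (the worst case for variance).
for n in (100, 1_000, 10_000):
    print(n, round(100 * margin_of_error(0.5, n), 1), "points")
```

The formula covers sampling error only. It says nothing about who declines to answer, which is why the high response rates on this slide matter.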

---

_Source: American Association of Public Opinion Research_

---
# The soup principle: satisfied?

_Source: Pew Research Center_

---
# The soup principle: satisfied?

### 1. RDD polls are representative (at high response)
### 2. Availability of many different surveys allows for an extra layer of aggregation to control for choices made by individual researchers
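
The aggregation point can be illustrated with a small simulation. This is a sketch under stated assumptions: 20 hypothetical firms, each with its own house effect drawn from a normal distribution, and a made-up true support of 52%. None of these numbers come from the talk.

```python
import random
import statistics

def simulate_poll(true_share, n, house_effect, rng):
    """One poll: n yes/no draws around the true share shifted by a firm's house effect."""
    p = true_share + house_effect
    return sum(rng.random() < p for _ in range(n)) / n

rng = random.Random(2022)
TRUE_SHARE = 0.52                                        # assumed, for illustration
house_effects = [rng.gauss(0, 0.02) for _ in range(20)]  # hypothetical firms
polls = [simulate_poll(TRUE_SHARE, 800, h, rng) for h in house_effects]

average = statistics.mean(polls)
worst_single = max(abs(p - TRUE_SHARE) for p in polls)
print(f"average error: {abs(average - TRUE_SHARE):.3f}")
print(f"worst single-poll error: {worst_single:.3f}")
```

Averaging helps here because the simulated house effects are centered on zero. An error shared by every firm would pass straight through the average.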

---
class: center, inverse, middle

# = perfect polls forever,

--
# ...right?

---

### Technological change ->  worse methods?

_Source: Pew Research Center_
---

### Polarized voting -> harder sampling

_Source: Webster & Abramowitz 2017_

---

.center[
## But what if the people you sample don't represent the population?
]
--

#### - People could be very dissimilar by group, meaning small deviations in sample demographics cause big errors (sampling error)
--

#### - Or the people who respond to the poll could be systematically different from the people who don't (nonresponse error)
--

#### - Or your list of potential respondents could be missing people (coverage error)
--

&nbsp;

*Polls can also go wrong if they have bad question wording, a fourth type of survey error called "measurement error."
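
To make the second kind of error concrete, here is a small Python sketch of what differential response does to a poll. The response rates and the 50% true share are hypothetical, chosen only to show the mechanism.

```python
def expected_respondent_share(true_share, rate_supporters, rate_others):
    """Expected share of a candidate's supporters among people who actually respond."""
    responding_supporters = true_share * rate_supporters
    responding_others = (1 - true_share) * rate_others
    return responding_supporters / (responding_supporters + responding_others)

# Hypothetical: the candidate truly has 50%, but supporters respond at 5% vs. 6% for others.
biased = expected_respondent_share(0.50, 0.05, 0.06)
print(f"true share 50.0%, expected poll share {100 * biased:.1f}%")
```

Unlike sampling error, this gap is an expectation, so it does not shrink when the sample gets bigger.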

---

## The soup principle in theory

_Source: Pew Research Center_
---

## The soup principle in practice
<img src="figures/minestrone.jpg" width="60%" />

---
class: center, middle

# Polls today...

&nbsp;

--
#### - Declining response rates + Internet = innovations in polling online, but they don't use random sampling

#### - Traditional RDD and even RBS polls don't have a true random sample (since response rates are too low)

#### - And because of nonresponse

---

## So, to satisfy the soup principle...

### Pollsters use statistical algorithms to ensure their samples match the population on different demographic targets

- Race, age, gender, and region are most common

- Can use weighting (raking) or modeling (MRP), with various tradeoffs

.pull-left[
<img src="figures/raking.jpg" width="100%" />
]
.pull-right[

<img src="figures/mrp.jpg" width="100%" />
]
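
A minimal sketch of raking (iterative proportional fitting) in plain Python, to show the mechanics. The sample, the variable names, and the population targets are all hypothetical, and a production pollster would use an established survey package rather than this loop.

```python
def rake(rows, targets, n_iter=50):
    """Adjust weights so weighted sample margins match population targets.

    rows: list of dicts mapping variable -> category for each respondent.
    targets: {variable: {category: population share}}.
    """
    n = len(rows)
    weights = [1.0] * n
    for _ in range(n_iter):
        for var, shares in targets.items():
            # Current weighted total in each category of this variable.
            totals = {}
            for row, w in zip(rows, weights):
                totals[row[var]] = totals.get(row[var], 0.0) + w
            # Rescale so each category's weighted total hits its target.
            weights = [w * shares[row[var]] * n / totals[row[var]]
                       for row, w in zip(rows, weights)]
    return weights

def weighted_share(rows, weights, var, category):
    """Weighted share of respondents falling in one category."""
    return sum(w for row, w in zip(rows, weights) if row[var] == category) / sum(weights)

# Hypothetical sample of 100 that over-represents college graduates and older people.
rows = ([{"education": "college", "age": "under50"}] * 30
        + [{"education": "college", "age": "50plus"}] * 30
        + [{"education": "noncollege", "age": "under50"}] * 10
        + [{"education": "noncollege", "age": "50plus"}] * 30)
targets = {"education": {"college": 0.4, "noncollege": 0.6},
           "age": {"under50": 0.5, "50plus": 0.5}}

weights = rake(rows, targets)
print(round(weighted_share(rows, weights, "education", "college"), 3))
print(round(weighted_share(rows, weights, "age", "under50"), 3))
```

Raking matches the margins one variable at a time. It does not guarantee that the joint distribution (for example, young non-college respondents) is right, which is one of the tradeoffs against model-based approaches.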

---
# These adjustments make polls pretty good!

---
class: center, middle

# But they aren't _representative_, per the theory of sampling

# ...and in close races, the adjustments aren't enough:

---

class: inverse, center, middle

# Two examples:

---

# 2016: Education weighting

---

# 2020: Partisan nonresponse

---

# 2020: Partisan nonresponse

--
- ### Problem reaching Trump voters overall

--
- ### And _within_ demographic groups

--
- ### Something you cannot fix with weighting

--
  - #### Pollsters can adjust for past vote, but the electorate changes, and certain _types_ of voters may not respond to surveys
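
A sketch of why weighting cannot fix nonresponse _within_ groups. All numbers are hypothetical: two education groups, and Trump voters who respond at a slightly lower rate than everyone else inside each group. The poll is then weighted so education matches the population exactly.

```python
def respondent_share(true_share, rate_supporters, rate_others):
    """Expected share of supporters among respondents, given unequal response rates."""
    supporters = true_share * rate_supporters
    others = (1 - true_share) * rate_others
    return supporters / (supporters + others)

# Hypothetical population: share of adults and true Trump share in each group.
groups = {
    "college": {"pop_share": 0.4, "trump_share": 0.40},
    "non_college": {"pop_share": 0.6, "trump_share": 0.55},
}
RATE_TRUMP, RATE_OTHER = 0.04, 0.05  # assumed response rates within every group

truth = sum(g["pop_share"] * g["trump_share"] for g in groups.values())

# Weight each group to its true population share (perfect education weighting).
weighted_estimate = sum(
    g["pop_share"] * respondent_share(g["trump_share"], RATE_TRUMP, RATE_OTHER)
    for g in groups.values()
)
print(f"truth {100 * truth:.1f}%, education-weighted poll {100 * weighted_estimate:.1f}%")
```

The education margins are exactly right after weighting, yet the estimate is still several points low, because the respondents inside each group are not representative of that group.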
  
---
class: center, middle

# So what are we left with?

---

# So what are we left with?

### 1. Traditional polls that oscillate wildly due to intensive weighting

### 2. New "model-based" methods which trade lower variance for higher (potential) bias

### 3. Lower response rates increase chance of big misses across firms
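
One common way to quantify the cost of intensive weighting in point 1 is Kish's approximate effective sample size, which shrinks as weights become more unequal. The weights below are hypothetical.

```python
def kish_effective_n(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)

equal = [1.0] * 1000
# Hypothetical: 100 hard-to-reach respondents are weighted up by a factor of 5.
unequal = [1.0] * 900 + [5.0] * 100

print(round(kish_effective_n(equal)))    # unweighted sample keeps its full size
print(round(kish_effective_n(unequal)))  # heavy weights cut the effective size
```

A smaller effective sample means a wider margin of error, so a heavily weighted poll of 1,000 can bounce around like a much smaller one.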
    
---
class: center, middle, inverse

# Polls (and soup?) in 2022

## A few ways forward:

---

# Making polls work again

--
### 1. More weighting variables (NYT)

--
### 2. More online and off-phone data collection (SMS, mail)

--
### 3. Mixed samples (private pollsters)

--
### In the pursuit of getting representative (and politically balanced) samples _before and after_ the adjustment stage

---
class: center, middle

### In the pursuit of getting representative (and politically balanced) samples _before and after_ the adjustment stage

### To satisfy the soup principle

---
class: center, middle

# What about aggregation?

### Forecasters have a few tricks up our sleeves:

---
class: center, middle, inverse

# How forecasts work

---

# What goes into the model?

### 1. National economic + political fundamentals

### 2. Decompose into state-level priors

### 3. Add the (average of) polls

---
# 1. National fundamentals?

### i) Index of economic growth (1940 - 2016)
- eight different variables, scaled to measure standard deviations from average annual growth

### ii) Presidential approval (1948 - 2016)

### iii) Polarization (1948 - 2016)
- measured as the share of swing voters in the electorate, per the ANES, and interacted with economic growth

### iv) Whether an incumbent is on the ballot

---

# 2. The model is a federalist

#### i) Train a model to predict the Democratic share of the vote in a state relative to the national vote, 1948-2016
* Variables are: lean in the last election, lean two elections ago, and home-state effects interacted with state size, conditional on the national vote in the state

#### ii) Use the covariates to make predictions for 2020, _conditional on the national fundamentals prediction for every day_

#### iii) Simulate state-level outcomes to extract a mean and standard deviation
* Propagates uncertainty both from the LOOCV RMSE of the national model and the state-level model
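
The simulation step can be sketched in a few lines of Python. This is an illustration, not the forecast's actual code: the two standard deviations stand in for the LOOCV RMSEs of the national and state-level models, and every number is hypothetical.

```python
import random
import statistics

def simulate_state_share(national_mean, national_sd, state_lean, state_sd, n_sims, rng):
    """Draw state vote shares by stacking national error and state-level error."""
    draws = []
    for _ in range(n_sims):
        national = rng.gauss(national_mean, national_sd)   # national model uncertainty
        state_error = rng.gauss(0.0, state_sd)             # state model uncertainty
        draws.append(national + state_lean + state_error)
    return statistics.mean(draws), statistics.stdev(draws)

rng = random.Random(0)
# Hypothetical inputs: national Democratic share 52% (sd 3 points), state leans +3 (sd 2 points).
mean, sd = simulate_state_share(0.52, 0.03, 0.03, 0.02, 20_000, rng)
print(f"state prior: mean {mean:.3f}, sd {sd:.3f}")
```

Because the two error sources are drawn independently here, the simulated standard deviation lands near the square root of the summed variances, wider than either source alone.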

---
class: center, inverse, middle

# That's the baseline

# Now, we add the polls

---

# 3. Add the (average of) polls

- Just a trend through points...
- Can be done with any number of packages for R or other statistical languages
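
As an illustration of "a trend through points", here is a Gaussian-kernel weighted average in plain Python. The poll dates and values are made up, the bandwidth is an arbitrary assumption, and the kernel smoother is a stand-in for whatever trend method a given package uses.

```python
import math

def kernel_trend(days, values, at_day, bandwidth=7.0):
    """Weighted average of poll values, with weights decaying by distance from at_day."""
    weights = [math.exp(-0.5 * ((d - at_day) / bandwidth) ** 2) for d in days]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

# Hypothetical polls: day of the campaign and Democratic two-party share.
days = [1, 3, 4, 8, 10, 15]
shares = [0.50, 0.52, 0.49, 0.53, 0.51, 0.54]

for day in (1, 8, 15):
    print(day, round(kernel_trend(days, shares, day), 3))
```

A smaller bandwidth makes the trend chase individual polls, and a larger one makes it slower to register real movement.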

---

# 3. Add the (average of) polls
### (...but with some fancy extra stuff)

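```stan
// Excerpt from the model's transformed parameters; declarations are not shown.
// Latent state-level support (logit scale): anchor the final day T at the prior...
mu_b[:,T] = cholesky_ss_cov_mu_b_T * raw_mu_b_T + mu_b_prior;
// ...then take correlated random-walk steps backward from day T to day 1.
for (i in 1:(T-1)) mu_b[:, T - i] = cholesky_ss_cov_mu_b_walk * raw_mu_b[:, T - i] + mu_b[:, T + 1 - i];
national_mu_b_average = transpose(mu_b) * state_weights;
// Scale the raw effects by their standard deviations.
mu_c = raw_mu_c * sigma_c;
mu_m = raw_mu_m * sigma_m;
mu_pop = raw_mu_pop * sigma_pop;
// e_bias follows an AR(1) process around mu_e_bias with autocorrelation rho_e_bias.
e_bias[1] = raw_e_bias[1] * sigma_e_bias;
sigma_rho = sqrt(1-square(rho_e_bias)) * sigma_e_bias;
for (t in 2:T) e_bias[t] = mu_e_bias + rho_e_bias * (e_bias[t - 1] - mu_e_bias) + raw_e_bias[t] * sigma_rho;
//*** fill pi_democrat
// Each state poll's expected logit share: latent support plus additive adjustments.
for (i in 1:N_state_polls){
  logit_pi_democrat_state[i] = 
    mu_b[state[i], day_state[i]] + 
    mu_c[poll_state[i]] + 
    mu_m[poll_mode_state[i]] +      // poll mode
    mu_pop[poll_pop_state[i]] +     // poll population
    unadjusted_state[i] * e_bias[day_state[i]] +   // applies only when unadjusted_state[i] is nonzero
    raw_measure_noise_state[i] * sigma_measure_noise_state + 
    polling_bias[state[i]];
}
```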
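
The `e_bias` lines in the excerpt define an AR(1) process. Scaling the innovations by `sqrt(1 - rho^2) * sigma` is the standard way to keep the long-run standard deviation of the process equal to `sigma`. A Python translation that checks this, assuming `mu_e_bias = 0`, standard-normal `raw_e_bias` draws (the priors are not shown in the excerpt), and hypothetical parameter values:

```python
import math
import random
import statistics

def simulate_e_bias(T, mu, rho, sigma, rng):
    """Mirror the Stan lines: first value scaled by sigma, then AR(1) steps."""
    sigma_rho = math.sqrt(1 - rho ** 2) * sigma   # innovation sd, as in the excerpt
    e = [rng.gauss(0, 1) * sigma]                 # e_bias[1] = raw_e_bias[1] * sigma_e_bias
    for _ in range(1, T):
        e.append(mu + rho * (e[-1] - mu) + rng.gauss(0, 1) * sigma_rho)
    return e

def stationary_sd(rho, innovation_sd):
    """Long-run sd of an AR(1) process with the given innovation sd."""
    return innovation_sd / math.sqrt(1 - rho ** 2)

RHO, SIGMA = 0.7, 0.04   # hypothetical values, not the model's estimates
implied = stationary_sd(RHO, math.sqrt(1 - RHO ** 2) * SIGMA)
path = simulate_e_bias(50_000, 0.0, RHO, SIGMA, random.Random(1))
print(f"implied sd {implied:.4f}, simulated sd {statistics.pstdev(path):.4f}")
```

In practice this means `sigma_e_bias` can be read directly as the typical size of the bias term on any given day, regardless of how persistent the process is.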

---

# 3. Add the (average of) polls

#### i. Latent state-level vote shares evolve as a random walk over time
* "Walks" toward the state-level fundamentals more the further we are from election day

#### ii. Polls are observations with measurement error that are debiased on the basis of:
* Pollster firm (so-called "house effects")
* Poll mode
* Poll population
* Bias in previous elections

#### iii. Correcting for partisan non-response
* Whether a pollster weights by party registration or past vote
* Adjusts for biases that remain AFTER removing the other biases
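
The debiasing in point ii works on the logit scale, where each effect is an additive shift (the Stan variable `logit_pi_democrat_state` reflects this). A minimal Python sketch with hypothetical effect sizes; the function names are illustrative and not part of the model code:

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def inv_logit(x):
    return 1 / (1 + math.exp(-x))

def expected_poll_share(latent_share, house=0.0, mode=0.0, population=0.0):
    """What a poll is expected to show, given latent support and additive logit effects."""
    return inv_logit(logit(latent_share) + house + mode + population)

def debias_poll(observed_share, house=0.0, mode=0.0, population=0.0):
    """Remove estimated effects from an observed poll share."""
    return inv_logit(logit(observed_share) - house - mode - population)

# Hypothetical effects on the logit scale for one pollster, mode, and population.
observed = expected_poll_share(0.52, house=0.05, mode=-0.02, population=0.01)
print(round(observed, 4), round(debias_poll(observed, 0.05, -0.02, 0.01), 4))
```

In the full model the effects are estimated jointly with the latent trend rather than plugged in, so this shows only the arithmetic of the adjustment.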

---

# 3. Add the (average of) polls
#### Notable improvements from correcting for partisan non-response (and other?) issues

---
class: center, middle

# In 2016...

## ... But not 2020

---

# One more lesson:

### 1. Traditional polls that oscillate wildly due to intensive weighting

### 2. New "model-based" methods which trade lower variance for higher (potential) bias

### 3. Lower response rates increase chance of big misses across firms
    
--

### 4. Aggregation is not a magic bullet

---

# 4. Aggregation is not a magic bullet

### What may be more useful than forecasting...

---
class: center, middle

# Conditional forecasting!

---

## Conditional forecasting:

--
.pull-left[
## 1. Debias polls

<img src="figures/conditional_forecasting_one.png" width="80%" />
]

.pull-right[
## 2. Rerun simulations

<img src="figures/conditional_forecasting_two.png" width="80%" />
]

---

# 2. Rerun simulations
<img src="figures/conditional_forecasting_two.png" width="80%" />

---

# 2. Rerun simulations

### Advantage: leaves readers with a much clearer picture of possibilities for election outcomes _if past patterns of bias aren't predictive of bias_ (2016, 2020)
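
A stripped-down illustration of the idea in Python. Instead of rerunning a full simulation, this sketch uses a normal approximation for a single race and asks how the win probability changes under different assumed poll biases. The forecast mean, its standard deviation, and the bias scenarios are all hypothetical.

```python
import math

def win_probability(mean_share, sd, assumed_poll_bias=0.0):
    """P(two-party share > 50%) if polls overstate the candidate by assumed_poll_bias."""
    z = (mean_share - assumed_poll_bias - 0.5) / sd
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))  # standard normal CDF

# Hypothetical race: forecast mean 52%, sd 3 points.
for bias in (0.0, 0.01, 0.02, 0.03):
    print(f"if polls are off by {100 * bias:.0f} pts: "
          f"win probability {win_probability(0.52, 0.03, bias):.2f}")
```

Reporting the whole table, rather than one headline number, shows readers how much the conclusion depends on the polls being unbiased.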

---
class: center, middle, inverse

# Further questions:

---
# What if that doesn't work?

### 2022 is a critical test: do surveys get better, stay the same, or get worse?
### What if the data-generating process (DGP) remains biased?
### What if the quality of the average poll continues to fall?

---

### Can we trust polls to be precise in close elections?
### If not, what are they good for?

---
class: center, middle

# How Polls Work and Why We Need Them

---

.pull-left[
<img src="figures/cover.jpg" width="90%" />
]

.pull-right[

# Thank you!

### _STRENGTH IN NUMBERS_ is now available.

**Website: [gelliottmorris.com](https://www.gelliottmorris.com)**

**Twitter: [@gelliottmorris](http://www.twitter.com/gelliottmorris)**

### Questions?

]

---

_These slides were made using the `xaringan` package for R. They are available online at https://www.gelliottmorris.com/slides/_