class: left, top, title-slide

# Bayesian data analysis and political journalism
## (Accurate) storytelling with data
### G. Elliott Morris
### Mar 13, 2023 | Ithaca/New York, NY

---
class: center, middle

.pull-left[
<img src="figures/headline-1.png" width="2069" />
<img src="figures/headline-2.png" width="1725" />
<img src="figures/headline-3.png" width="2371" />
]

.pull-right[
<img src="figures/headline-4.png" width="1453" />
<img src="figures/headline-5.png" width="1813" />
<img src="figures/headline-6.png" width="1813" />
]

???

- We're going to go through a few case studies of my work, including: {click through}

---

# Goals in data journalism:

--

### 1. Analyze a subject that is newsy or noteworthy

- Elections, voting, specific policy votes

--

### 2. In a way that is novel or visually striking

- E.g., use multilevel regression and poststratification to predict support for abortion rights in each state

--

### 3. Or in a way that beats the competition

- Election forecasts that are fully Bayesian

???

- But before that, let's talk about our goals. There are three:
- 1. To analyze a subject that is newsy or noteworthy
- 2. To do that in a way that is novel or visually striking (or, sometimes, just newsy)
- 3. Or in a way that beats the competition

---

# Goals in social science:

### Similar to the goals in data journalism!

--

### 1. Identify a phenomenon

- Maybe it is a gap in the literature or a new development

--

### 2. Measure it

- E.g., with a survey

--

### 3. Explain it

- Often involves some level of modeling or prediction
- E.g., a randomized experiment or regression

???

- The goals of data journalism are not so different from the goals of social science
- Adapted here from the book Data Analysis for Social Science by Elena Llaudet and Kosuke Imai
- So I hope the presentation can be helpful for students as they decide whether they want to go into journalism, an industry with the distinguished characteristic of offering only marginal returns over becoming an academic

---
class: center, middle, inverse

# Case study one: wacky polls

--

## Method: weighting survey data

---

## 1. Weighting survey data

<img src="figures/headline-5.png" width="1813" />

---

## 1. Weighting survey data

--

.pull-left[
<img src="figures/center_street_poll.png" width="100%" />
]

--

.pull-right[
- Story: An outlier poll getting a lot of attention. Fishy results, shady firm.
- Novelty: Asked the pollster for their data, re-weighted it to generate corrected results.
- Explanation: Pollsters were repeating the mistakes of 2016, not weighting data by education
- The explanation is visually simple
]

???

- Explain process for story
- While not a story where we needed Bayesian analysis, it could have taught us more:

---

## 1. Weighting survey data

#### Potential Bayesian iteration?

- Train a regression model to predict voting ("target estimation")
- Use features with large coefficients to weight the poll

#### Drawback?

- Of marginal utility journalistically

---
class: center, middle, inverse

# Case study two: hypothetical elections, demographic patterns, uncertainty

--

## Method: multilevel regression and post-stratification (MRP)

---

# 2. Mister-P

Answer questions like:

- What would happen if everyone in America voted?
- Which groups support Joe Biden more than Hillary Clinton?
- How can we better measure uncertainty in polling?

---

## 2. Mister-P: If everyone voted

### Guiding questions

--

#### 1. How many Democrats and Republicans are there?

Given data constraints, we're really asking: how many Clinton and Trump voters are there?

--

#### 2. How are they distributed geographically?

The answer lets us assign Electoral College votes.

---

## 2. Mister-P: If everyone voted

### Data

--

#### 1. Cooperative Congressional Election Study (CCES): A survey of 64,000 Americans

Includes demographic data and 2016 vote choice for 40,000+ validated voters

--

#### 2. American Community Survey (ACS): A Census Bureau survey of 175,000 Americans

Includes the same demographic data as the CCES

380,000 "cells"

---

## 2. Mister-P: If everyone voted

### Method

--

#### 1. Train a predictive model on CCES data

- Multilevel logistic regression
- Predict vote choice with: age, gender, race, education, region, and interactions between them

--

#### 2. Use the model to predict voting habits for every eligible American

Via "post-stratification" on the ACS

---

## 2. Mister-P: If everyone voted

### ACS post-stratification

--

#### 1. Each "type" of person gets their own "cell":

- One cell for white men ages 18-30 without college degrees who live in the Northeast
- Another for non-white men ages 18-30 without college degrees who live in the Northeast
- etc.

--

#### 2. We know how many voters in each "cell" live in each state

--

#### 3. So we can say that x% and y% of each "cell" vote for Clinton or Trump, then add up

- For example, a Latina woman aged 18-30 with a college degree in Texas is 85% likely to vote for a Democrat for president, and there are 20k of them

---

## 2. Mister-P: If everyone voted

<img src="figures/states_demography.png" width="3968" />

---

## 2. Mister-P: If everyone voted

<img src="figures/votes_bystate.png" width="4056" />

---

## 2. Mister-P: If everyone voted

<img src="figures/everyone_votes.png" width="80%" />

---

## 2. Mister-P: Polling uncertainty

<img src="figures/tse_framework.png" width="60%" />

_Source: Groves et al., 2009_

---

## 2. Mister-P: Polling uncertainty

### The traditional margin of error only covers one source of error (sampling)

### We can use MRP to take non-response and adjustment error into account too

### Via the posterior predictive distribution

---

## 2. Mister-P: Polling uncertainty

### In `brms` syntax:

<img src="figures/mrp_brms.png" width="100%" />

---

## 2. Mister-P: Polling uncertainty

### Model estimates of parameter uncertainty

<img src="figures/mrp_theta.png" width="80%" />

---

## 2. Mister-P: Polling uncertainty

### Posterior draws for every cell account for sampling, non-response, and adjustment error

<img src="figures/mrp_2020.png" width="100%" />

???
- Sampling, via the Bayesian logit updater (and survey weights)
- Non-response, via adjustment back to the survey frame
- And adjustment error, via varying parameter estimates and partial pooling
- Or, better yet, via Bayesian model averaging

---

## 2. Mister-P: Polling uncertainty

### Wider error bars = a truer measure of uncertainty in polling

<img src="figures/moe_polling.png" width="75%" />

---
class: center, middle, inverse

# Case study three: fully Bayesian election forecasting

--

## Method: Dynamic linear model (latent traits + measurement model)

---

## 3. Elections DLMs

## _Economist_ presidential model

### 1. National economic + political fundamentals

### 2. Decompose into state-level priors

### 3. Polls

Uncertainty is propagated throughout the models, incorporated via MCMC sampling in step 3.

---

## 3. Elections DLMs

### It's just a trend through points...

<img src="figures/state_space_model.png" width="90%" />

---

## 3. Elections DLMs

#### (...but with some fancy extra stuff)

```
mu_b[:,T] = cholesky_ss_cov_mu_b_T * raw_mu_b_T + mu_b_prior;
for (i in 1:(T-1))
  mu_b[:, T - i] = cholesky_ss_cov_mu_b_walk * raw_mu_b[:, T - i] + mu_b[:, T + 1 - i];
national_mu_b_average = transpose(mu_b) * state_weights;
mu_c = raw_mu_c * sigma_c;
mu_m = raw_mu_m * sigma_m;
mu_pop = raw_mu_pop * sigma_pop;
e_bias[1] = raw_e_bias[1] * sigma_e_bias;
sigma_rho = sqrt(1 - square(rho_e_bias)) * sigma_e_bias;
for (t in 2:T)
  e_bias[t] = mu_e_bias + rho_e_bias * (e_bias[t - 1] - mu_e_bias) + raw_e_bias[t] * sigma_rho;
//*** fill pi_democrat
for (i in 1:N_state_polls){
  logit_pi_democrat_state[i] =
    mu_b[state[i], day_state[i]] +
    mu_c[poll_state[i]] +
    mu_m[poll_mode_state[i]] +
    mu_pop[poll_pop_state[i]] +
    unadjusted_state[i] * e_bias[day_state[i]] +
    raw_measure_noise_state[i] * sigma_measure_noise_state +
    polling_bias[state[i]];
}
```

---

## 3. Elections DLMs

### Poll-level model

--

##### i. Latent state-level vote shares evolve as a random walk over time

* Pooling toward the state-level fundamentals more the further out we are from election day

--

##### ii. Polls are observations with measurement error that are debiased on the basis of:

* Pollster firm (so-called "house effects")
* Poll mode
* Poll population

--

##### iii. Correcting for partisan non-response

* Whether a pollster weights by party registration or past vote
* Incorporated as a residual AR process

---

## 3. Elections DLMs

#### Notable improvements from adjusting for partisan non-response and other weighting issues

<img src="figures/states-vs-results.png" width="80%" />

---

## 3. Elections DLMs

#### Notable improvements from adjusting for partisan non-response and other weighting issues

<img src="figures/state_briers_2016.png" width="60%" />

---

## 3. Elections DLMs

### 2016: good!

<img src="figures/trends_2016.png" width="80%" />

---

## 3. Elections DLMs

### 2020: not as good!

<img src="figures/2020-economist-histogram.png" width="80%" />

---

## 3. Elections DLMs

### Problems with non-response/weighting adjustments

--

## 1. Pollsters change their methods

--

## 2. Not all adjustments work

---

## 3. Elections DLMs

## Solution? Conditional forecasting!

--

### - Present aggregates assuming some amount of polling _bias_

--

### - As a way to explain to readers how bias enters the process of polling

--

### - And what happens to forecasts if bias _now_ does not follow historical distributions

---

## 3. Elections DLMs

### Conditional forecasting:

--

.pull-left[
#### 1. Debias polls

<img src="figures/conditional_forecasting_one.png" width="80%" />
]

--

.pull-right[
#### 2. Rerun simulations

<img src="figures/conditional_forecasting_two.png" width="80%" />
]

---

## 3. Elections DLMs

### 2. Rerun simulations

<img src="figures/conditional_forecasting_two.png" width="80%" />

---

## 3. Elections DLMs

### 2. Rerun simulations

<img src="figures/conditional_forecasting_two.png" width="50%" />

#### Advantage: leaves readers with a much clearer picture of the possible election outcomes _if past patterns of bias aren't predictive of bias now_ (2016, 2020)

---

## 3. Elections DLMs

### But exploring parameter conditionality is not always necessary or helpful:

<img src="figures/538_2022.png" width="65%" />

---
class: center, middle, inverse

# Questions?

---

# Thank you!

<br>
<br>

**Website: [gelliottmorris.com](https://www.gelliottmorris.com)**

**Twitter: [@gelliottmorris](http://www.twitter.com/gelliottmorris)**

### Questions?

---

_These slides were made using the `xaringan` package for R. They are available online at https://www.gelliottmorris.com/slides/_
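---

## Appendix: post-stratification arithmetic, sketched

The "add up the cells" step in case study two is just a population-weighted average of per-cell predictions. Here is a minimal, language-agnostic sketch in Python; the cell labels, support rates, and counts are invented for illustration, not actual CCES/ACS data or model output:

```python
# Each poststratification "cell" pairs a model prediction with a Census count.
# (label, predicted P(vote Democratic), eligible voters in cell) -- made up.
cells = [
    ("Latina, 18-30, college, TX",       0.85, 20_000),
    ("White man, 18-30, no college, TX", 0.35, 50_000),
    ("White woman, 65+, college, TX",    0.48, 30_000),
]

# Weight each cell's prediction by how many people are in the cell, then sum.
dem_votes = sum(p * n for _, p, n in cells)
total = sum(n for _, _, n in cells)
dem_share = dem_votes / total  # population-weighted average of predictions
```

With posterior draws instead of point predictions, the same sum is computed once per draw, which is how the per-cell uncertainty propagates into the state-level estimate.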
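---

## Appendix: the "trend through points," sketched

The latent random walk in case study three can be illustrated with the simplest possible dynamic linear model: a local-level state filtered with the standard Kalman recursions. This is not the _Economist_ model (which is fit in Stan by MCMC with house effects, mode/population offsets, and an AR bias term); it is a toy sketch of the underlying idea, with made-up numbers:

```python
def kalman_local_level(polls, obs_var, walk_var, prior_mean, prior_var):
    """Forward-filter a random-walk latent vote share through noisy polls.
    `None` marks days with no poll; the latent state still drifts then."""
    mean, var = prior_mean, prior_var
    path = []
    for y in polls:
        var += walk_var              # predict: the random walk adds variance
        if y is not None:            # update: shrink toward the observed poll
            gain = var / (var + obs_var)
            mean = mean + gain * (y - mean)
            var = (1 - gain) * var
        path.append(mean)
    return path

# Illustrative daily poll series (shares), with poll-free days as None.
path = kalman_local_level(
    polls=[0.52, None, 0.49, 0.51, None],
    obs_var=0.02**2, walk_var=0.005**2,
    prior_mean=0.50, prior_var=0.03**2,
)
```

On poll-free days the filtered mean is carried forward while its variance grows, which is exactly why the fan of uncertainty widens between polls in the state-space charts.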