Bayesian data analysis and political journalism

(Accurate) storytelling with data

G. Elliott Morris

Mar 13, 2023 | Ithaca/New York, NY

1 / 40

2 / 40

We're going to go through a few case studies of my work, including: {click through}

Goals in data journalism:

1. Analyze a subject that is newsy or noteworthy

Elections, voting, specific policy votes

2. In a way that is novel or visually striking

E.g., use multi-level regression and poststratification to predict support for abortion rights in each state

3. Or in a way that beats the competition

Election forecasts that are fully Bayesian

3 / 40

But before that, let's talk about our goals. There are three:
1. To analyze a subject that is newsy or noteworthy
2. To do that in a way that is novel or visually striking (or, sometimes, just newsy)
3. Or in a way that beats the competition

Goals in social science:

Similar to the goals in data journalism!

1. Identify a phenomenon

Maybe it is a gap in the literature or a new development

2. Measure it

E.g., with a survey

3. Explain it

Often involves some level of modeling or prediction
E.g., a randomized experiment or regression

4 / 40

Goals of data journalism are not so different from the goals of social science
Adapted here from the book Data Analysis for Social Science by Elena Llaudet and Kosuke Imai
So I hope the presentation is helpful for students deciding whether to go into journalism, which has the distinction of being one of the industries that probably offers only marginal returns over becoming an academic.

Case study one: wacky polls

Method: weighting survey data

5 / 40

1. Weighting survey data

6 / 40

1. Weighting survey data

Story: Outlier poll getting a lot of attention. Fishy results, shady firm.
Novelty: Asked pollster for their data, re-weighted it to generate results.
Explanation: Pollsters repeating mistakes of 2016, not weighting data by education
Explanation is visually simple

7 / 40
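The re-weighting step can be sketched in a few lines. This is a minimal illustration, not the actual analysis behind the story: the respondent counts, candidate shares, and population education targets are invented, and `unweighted_share` and `weighted_share` are hypothetical helpers. It shows how an estimate shifts when a sample that over-represents college graduates is weighted back to assumed population shares.

```python
# Hypothetical poll: (education group, respondents, share backing candidate A).
# All numbers are invented for illustration.
sample = [
    ("college", 600, 0.58),
    ("non_college", 400, 0.42),
]

# Assumed population targets for the electorate (also invented).
population_share = {"college": 0.40, "non_college": 0.60}


def unweighted_share(rows):
    """Candidate A's share when every respondent counts equally."""
    total = sum(n for _, n, _ in rows)
    return sum(n * p for _, n, p in rows) / total


def weighted_share(rows, targets):
    """Candidate A's share after weighting each group to its population target."""
    total = sum(n for _, n, _ in rows)
    weighted_sum = 0.0
    weight_total = 0.0
    for group, n, p in rows:
        weight = targets[group] / (n / total)  # target share / sample share
        weighted_sum += weight * n * p
        weight_total += weight * n
    return weighted_sum / weight_total


raw = unweighted_share(sample)
adjusted = weighted_share(sample, population_share)
print(f"unweighted: {raw:.3f}, weighted by education: {adjusted:.3f}")
```

In this toy sample, weighting by education moves candidate A from a narrow lead to a narrow deficit, which is the kind of shift the re-weighted poll in the story turned on.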

Explain process for story
While not a story where we needed Bayesian analysis, it could have taught us more:

1. Weighting survey data

Potential Bayesian iteration?

Train regression model to predict voting ("target-estimation")
Use features with large coefficients to weight the poll

Drawback?

Of marginal utility journalistically

8 / 40

Case study two: hypothetical elections, demographic patterns, uncertainty

Method: Multilevel regression and post-stratification (MRP)

9 / 40

2. Mister-P

Answer questions like:

What would happen if everyone in America voted?
Which groups support Joe Biden more than Hillary Clinton?
How can we better measure uncertainty in polling?

10 / 40

2. Mister-P: If everyone voted

Guiding questions

1. How many Democrats and Republicans are there?

Given data constraints, we're really asking: How many Clinton and Trump voters are there?

2. How are they distributed geographically?

The answer lets us assign Electoral College votes.

11 / 40

2. Mister-P: If everyone voted

Data

1. Cooperative Congressional Election Study (CCES): A survey of 64,000 Americans

Includes demographic data and 2016 vote choice for 40,000+ validated voters

2. American Community Survey (ACS): A Census Bureau survey of 175,000 Americans

Includes the same demographic data as the CCES
380,000 “cells”

12 / 40

2. Mister-P: If everyone voted

Method

1. Train a predictive model on CCES data

Multi-level logistic regression
Predict vote choice with: age, gender, race, education, region and interactions between them

2. Use the model to predict voting habits for every eligible American

Via “post-stratification” on the ACS

13 / 40
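One common way to write down the two steps is shown below. This is a generic multilevel logit and the standard post-stratification average, not necessarily the exact specification used in the piece. The index names, the normal priors on the group effects, and the symbols $N_c$ (ACS population of cell $c$) and $\hat{\pi}_c$ (predicted support in cell $c$) are assumptions, and the interaction terms are abbreviated as dots.

$$
\Pr(y_i = 1) = \operatorname{logit}^{-1}\!\left(\beta_0 + \alpha^{\mathrm{age}}_{a[i]} + \alpha^{\mathrm{gender}}_{g[i]} + \alpha^{\mathrm{race}}_{r[i]} + \alpha^{\mathrm{edu}}_{e[i]} + \alpha^{\mathrm{region}}_{s[i]} + \dots\right), \qquad \alpha^{k}_{j} \sim \mathcal{N}(0, \sigma_k^2)
$$

$$
\hat{\theta}_s = \frac{\sum_{c \in s} N_c \, \hat{\pi}_c}{\sum_{c \in s} N_c}
$$

The first line is the model trained on the CCES; the second turns cell-level predictions into an estimate for state $s$ by weighting each cell by its population.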

2. Mister-P: If everyone voted

ACS Post-stratification

1. Each "type" of person gets their own "cell":

One cell for white men ages 18-30 without college degrees who live in the Northeast
Another for non-white men ages 18-30 without college degrees who live in the Northeast
etc.

2. We know how many voters in that "cell" live in each state

3. So we can say that x% and y% of each "cell" vote for Clinton or Trump, then add up

For example, a Latina age 18-30 with a college degree in Texas is 85% likely to vote for a Democrat for president, and there are 20k of them

14 / 40
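The "add up" step is simple arithmetic, sketched below under stated assumptions. The first row echoes the Texas example from the slide; every other cell, probability, and population count is invented, and `poststratify` is a hypothetical helper rather than anything from the original analysis.

```python
from collections import defaultdict

# Hypothetical post-stratification table: one row per demographic "cell" per state.
# p_dem stands in for a model-predicted probability; n is the cell's population.
# The first row echoes the slide's example; the other rows are invented.
cells = [
    {"state": "TX", "cell": "Latina, 18-30, college", "p_dem": 0.85, "n": 20_000},
    {"state": "TX", "cell": "white man, 18-30, no college", "p_dem": 0.30, "n": 60_000},
    {"state": "NY", "cell": "white man, 18-30, no college", "p_dem": 0.45, "n": 40_000},
    {"state": "NY", "cell": "non-white man, 18-30, no college", "p_dem": 0.75, "n": 10_000},
]


def poststratify(rows):
    """Population-weighted average of cell predictions within each state."""
    dem_votes = defaultdict(float)
    population = defaultdict(float)
    for row in rows:
        dem_votes[row["state"]] += row["p_dem"] * row["n"]
        population[row["state"]] += row["n"]
    return {state: dem_votes[state] / population[state] for state in population}


state_share = poststratify(cells)
for state, share in sorted(state_share.items()):
    print(f"{state}: {share:.1%} Democratic")
```

With state-level shares in hand, assigning Electoral College votes is a matter of checking which side of 50% each state lands on.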

2. Mister-P: If everyone voted

15 / 40

2. Mister-P: If everyone voted

16 / 40

2. Mister-P: If everyone voted

17 / 40

2. Mister-P: Polling uncertainty

Source: Groves et al., 2009

18 / 40

2. Mister-P: Polling uncertainty

Traditional margin of error only covers one source of error (sampling)

We can use MRP to take non-response and adjustment into account too

Via the posterior predictive distribution

19 / 40
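For reference, the traditional margin of error is the textbook sampling formula for a proportion. The sketch below assumes a simple random sample and a 95% interval; the poll size and support level are hypothetical, and `sampling_margin_of_error` is an illustrative helper name.

```python
import math


def sampling_margin_of_error(p, n, z=1.96):
    """Half-width of the usual 95% interval for a proportion from a simple random sample."""
    return z * math.sqrt(p * (1 - p) / n)


# Hypothetical poll: 1,000 respondents, 50% support.
moe = sampling_margin_of_error(p=0.5, n=1000)
print(f"+/- {moe * 100:.1f} points")
```

Nothing in this formula knows about who declined to answer or how the data were weighted, which is why it understates total polling error.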

2. Mister-P: Polling uncertainty

In `brms` syntax:

20 / 40

2. Mister-P: Polling uncertainty

Model estimates of parameter uncertainty

21 / 40

2. Mister-P: Polling uncertainty

Posterior draws for every cell account for sampling, non-response, and adjustment error

22 / 40

Sampling, via the Bayesian logit updater (and survey weights)
Non-response, via adjustment back to survey frame
And adjustment error, via varying parameter estimates and partial pooling
Or, better yet, via Bayesian model averaging
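How per-cell posterior draws turn into wider error bars can be sketched as follows. This is a toy, not the fitted model: the cell populations, posterior means, and standard deviations are invented, and the draws are simulated as independent normals on the logit scale, whereas real draws would come from the fitted multilevel model and be correlated across cells.

```python
import math
import random

rng = random.Random(2023)  # fixed seed so the sketch is reproducible


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical cells: (population, posterior mean on the logit scale, posterior sd).
# In a real MRP workflow the draws would come from the fitted model, not from
# independent normals as assumed here.
cells = [(20_000, 1.7, 0.40), (60_000, -0.8, 0.15), (35_000, 0.1, 0.25)]
n_draws = 4000
total_pop = sum(n for n, _, _ in cells)

# For each draw, simulate every cell's support and post-stratify.
draws = []
for _ in range(n_draws):
    share = sum(n * inv_logit(rng.gauss(mu, sd)) for n, mu, sd in cells) / total_pop
    draws.append(share)

draws.sort()
lower = draws[int(0.025 * n_draws)]
median = draws[n_draws // 2]
upper = draws[int(0.975 * n_draws) - 1]
print(f"median {median:.3f}, 95% interval [{lower:.3f}, {upper:.3f}]")
```

Because every draw is post-stratified separately, uncertainty in each cell's estimate carries through to the topline interval instead of being dropped.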

2. Mister-P: Polling uncertainty

Wider error bars = truer measure of uncertainty in polling

23 / 40

Case study three: Fully Bayesian election forecasting

Method: Dynamic linear model (latent traits + measurement model)

24 / 40

3. Elections DLMs

Economist presidential model

1. National economic + political fundamentals

2. Decompose into state-level priors

3. Polls

Uncertainty is propagated throughout the models, incorporated via MCMC sampling in step 3.

25 / 40
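The logic of combining a fundamentals-based prior with polls can be shown with a deliberately simplified stand-in: a single normal-normal update for one state. This is not the Economist model, which uses MCMC over many states and days. The function name and all inputs are hypothetical.

```python
def combine_prior_and_polls(prior_mean, prior_sd, poll_mean, poll_sd):
    """Normal-normal update: precision-weighted average of prior and poll average."""
    prior_precision = 1.0 / prior_sd**2
    poll_precision = 1.0 / poll_sd**2
    post_var = 1.0 / (prior_precision + poll_precision)
    post_mean = post_var * (prior_precision * prior_mean + poll_precision * poll_mean)
    return post_mean, post_var**0.5


# Hypothetical inputs, in Democratic two-party vote share.
post_mean, post_sd = combine_prior_and_polls(
    prior_mean=0.52, prior_sd=0.04, poll_mean=0.54, poll_sd=0.02
)
print(f"posterior mean {post_mean:.3f}, sd {post_sd:.4f}")
```

The more precise source pulls the estimate harder, so the posterior sits closer to the polls here, and its standard deviation is smaller than either input's.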

3. Elections DLMs

It's just a trend through points...

26 / 40

3. Elections DLMs

(...but with some fancy extra stuff)

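```stan
// Latent state-level vote intention (logit scale): anchor the final day T at the prior
mu_b[:,T] = cholesky_ss_cov_mu_b_T * raw_mu_b_T + mu_b_prior;
// Reverse random walk: each earlier day is the following day plus correlated innovations
for (i in 1:(T-1)) mu_b[:, T - i] = cholesky_ss_cov_mu_b_walk * raw_mu_b[:, T - i] + mu_b[:, T + 1 - i];
// National series as the state-weighted average of the latent state series
national_mu_b_average = transpose(mu_b) * state_weights;
// Non-centered poll-level adjustments (pollster, mode, population)
mu_c = raw_mu_c * sigma_c;
mu_m = raw_mu_m * sigma_m;
mu_pop = raw_mu_pop * sigma_pop;
// AR(1) bias process; sigma_rho scales innovations so the stationary sd is sigma_e_bias
e_bias[1] = raw_e_bias[1] * sigma_e_bias;
sigma_rho = sqrt(1-square(rho_e_bias)) * sigma_e_bias;
for (t in 2:T) e_bias[t] = mu_e_bias + rho_e_bias * (e_bias[t - 1] - mu_e_bias) + raw_e_bias[t] * sigma_rho;
// Fill the expected Democratic share (logit scale) for each state poll
for (i in 1:N_state_polls){
  logit_pi_democrat_state[i] = 
    mu_b[state[i], day_state[i]] +                           // latent state trend
    mu_c[poll_state[i]] +                                    // pollster adjustment
    mu_m[poll_mode_state[i]] +                               // mode adjustment
    mu_pop[poll_pop_state[i]] +                              // population adjustment
    unadjusted_state[i] * e_bias[day_state[i]] +             // bias term for unadjusted polls
    raw_measure_noise_state[i] * sigma_measure_noise_state + // poll-level noise
    polling_bias[state[i]];                                  // state-level polling bias
}
```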

27 / 40

3. Elections DLMs

Poll-level model

i. Latent state-level vote shares evolve as a random walk over time

Pooling toward the state-level fundamentals more as we are further out from election day

ii. Polls are observations with measurement error that are debiased on the basis of:

Pollster firm (so-called "house effects")
Poll mode
Poll population

iii. Correcting for partisan non-response

Whether a pollster weights by party registration or past vote
Incorporated as a residual AR process

28 / 40
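A one-state toy version of the first two pieces can be sketched as below. It is an illustration under stated assumptions, not the production model, which handles all states jointly with correlated innovations. The number of days, the prior, the standard deviations, and the house and mode effects are all invented, and `expected_poll_share` is a hypothetical helper.

```python
import math
import random

rng = random.Random(7)  # fixed seed so the sketch is reproducible


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


T = 100                     # days in the campaign; day T is election day
prior_logit = 0.10          # hypothetical fundamentals-based prior for election day
prior_sd = 0.05             # hypothetical uncertainty around that prior
walk_sd = 0.02              # hypothetical day-to-day innovation sd

# Anchor the latent series on election day, then walk backward in time.
mu = [0.0] * (T + 1)        # index 0 unused so that mu[t] is day t
mu[T] = prior_logit + prior_sd * rng.gauss(0, 1)
for t in range(T - 1, 0, -1):
    mu[t] = mu[t + 1] + walk_sd * rng.gauss(0, 1)

# Hypothetical additive adjustments on the logit scale.
house_effect = {"Pollster A": 0.03, "Pollster B": -0.02}
mode_effect = {"phone": 0.00, "online": 0.01}


def expected_poll_share(day, pollster, mode):
    """Expected Democratic share in a poll: latent trend plus poll-level adjustments."""
    return inv_logit(mu[day] + house_effect[pollster] + mode_effect[mode])


share = expected_poll_share(day=60, pollster="Pollster A", mode="online")
print(f"expected share on day 60: {share:.3f}")
```

Walking backward from election day means that days far from the election, where polls are sparse, stay tied to the fundamentals-based anchor.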

3. Elections DLMs

Notable improvements from adjusting for partisan non-response and other weighting issues

29 / 40

3. Elections DLMs

Notable improvements from adjusting for partisan non-response and other weighting issues

30 / 40

3. Elections DLMs

2016: good!

31 / 40

3. Elections DLMs

2020: not as good!

32 / 40

3. Elections DLMs

Problem with non-response/weighting adjustments

1. Pollsters change their methods

2. Not all adjustments work

33 / 40

3. Elections DLMs

Solution? Conditional forecasting!

- Present aggregates assuming some amount of polling bias.
- As a way to explain to readers how bias enters the process of polling
- And what happens to forecasts if bias now does not follow historical distributions

34 / 40

3. Elections DLMs

Conditional forecasting:

1. Debias polls

2. Rerun simulations

35 / 40

3. Elections DLMs

2. Rerun simulations

36 / 40

3. Elections DLMs

2. Rerun simulations

Advantage: leaves readers with a much clearer picture of possibilities for election outcomes if past patterns of bias aren't predictive of bias now (2016, 2020)

37 / 40
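The two steps, debias and rerun, can be sketched with a toy simulation. It assumes a single race, a normally distributed election-day error, and made-up numbers throughout; `win_probability` is a hypothetical helper and is far simpler than the full forecast.

```python
import random


def win_probability(poll_margin, assumed_bias, error_sd=3.0, n_sims=20_000, seed=1):
    """Share of simulations the leader wins after removing an assumed polling bias.

    poll_margin: leader's margin in the polling average, in points.
    assumed_bias: points by which polls are assumed to overstate the leader.
    error_sd: sd of the remaining election-day error, in points (hypothetical).
    """
    rng = random.Random(seed)  # same seed for every scenario: common random numbers
    debiased = poll_margin - assumed_bias
    wins = sum(1 for _ in range(n_sims) if debiased + rng.gauss(0, error_sd) > 0)
    return wins / n_sims


scenarios = {bias: win_probability(poll_margin=4.0, assumed_bias=bias) for bias in (0, 2, 4, 6)}
for bias, prob in scenarios.items():
    print(f"polls overstate leader by {bias} pts -> win probability {prob:.0%}")
```

Presenting the whole table of scenarios, rather than one headline number, is what lets readers see how much the forecast depends on the assumption about polling bias.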

3. Elections DLMs

But exploring parameter conditionality is not always necessary or helpful:

38 / 40

Questions?

39 / 40

Thank you!

Website: gelliottmorris.com

Twitter: @gelliottmorris

Questions?

These slides were made using the xaringan package for R. They are available online at https://www.gelliottmorris.com/slides/

40 / 40
