class: left, top, title-slide

# Taking the pulse of ‘the pulse of democracy’

## From Strength in Numbers: How Polls Work + Why We Need Them

### G. Elliott Morris | Nov 17 2022 | Princeton, NJ

---

class: center, middle

<img src="figures/cover.jpg" width="50%" />

???

- Here to talk about my book, Strength in Numbers, about how polls work and why we need them for a healthy democracy
- But before we get there: a quick recent history of polls in 2016, 2018, 2020 and now 2022
- 2016 was in many ways the catalyst for writing this book
- Then, I was still in college at the University of Texas, but I had reverse-engineered the 538 forecasting model in my spare time as a way to learn computer programming

---

class: center, middle

<img src="figures/538_2016.png" width="75%" />

???

- You may remember, however, that polls that year were not very accurate
- They showed Clinton winning in Michigan, Wisconsin, Pennsylvania, Florida and NC
- All of which she lost

---

class: center, middle

<img src="figures/wapo_2016_states.png" width="90%" />

???

- Overall, polls missed by 5 points

---

class: center, middle

<img src="figures/538_2016_states.png" width="90%" />

???

- Looking at polls that were released only in the final 21 days of the campaign and only in competitive states, the average state poll missed Hillary Clinton’s margin over Donald Trump by about 3 points
- But even worse, most of those polls systematically underestimated Trump: they had high bias, not just high error
- In fact, the average poll was more biased than at any point since 1998

---

class: center, middle

<img src="figures/phone.jpeg" width="70%" />

???

- To understand the reason for that miss you must understand the fundamental problem that pollsters have to solve: nonresponse.
- That is: the people who answer a poll may be different (demographically, politically, culturally) from the people who don’t.
- And it turns out that solving that equation is very hard, in part because there is a lot of randomness in polls due to low response rates.
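---

### Sketch: nonresponse and reweighting

A minimal sketch of the nonresponse problem and the standard demographic fix, not any pollster's actual procedure. The group shares use the 70%/35% illustration from this talk; the support numbers and function names are hypothetical and chosen only to make the arithmetic visible.

```python
# Share of each group in the electorate vs. among poll respondents.
population_share = {"white": 0.70, "nonwhite": 0.30}
sample_share = {"white": 0.35, "nonwhite": 0.65}

# Hypothetical (made-up) support for a candidate within each group.
support_in_sample = {"white": 0.40, "nonwhite": 0.70}


def poststratification_weights(population, sample):
    """Weight per group = population share / sample share."""
    return {group: population[group] / sample[group] for group in population}


def weighted_support(support, sample, weights):
    """Weighted average of group-level support."""
    numerator = sum(support[g] * sample[g] * weights[g] for g in support)
    denominator = sum(sample[g] * weights[g] for g in support)
    return numerator / denominator


weights = poststratification_weights(population_share, sample_share)

# Unweighted topline: every respondent counts equally.
raw = sum(support_in_sample[g] * sample_share[g] for g in sample_share)

# Weighted topline: groups count in proportion to the electorate.
adjusted = weighted_support(support_in_sample, sample_share, weights)

print(weights)
print(round(raw, 3), round(adjusted, 3))
```

With these made-up inputs, white respondents get a weight of 2, everyone else gets about 0.46, and the topline moves from 59.5% unweighted to 49% weighted. The correction only works if respondents within each group resemble the non-respondents in that group.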
---

### Pollsters use statistical algorithms to ensure their samples match the population on different demographic targets

- Race, age, gender, and region are most common
- Political variables (sometimes)
- Can use weighting (raking) or modeling (MRP), with various tradeoffs

.pull-left[
<img src="figures/raking.jpg" width="100%" />
]

.pull-right[
<img src="figures/mrp.jpg" width="100%" />
]

???

- Pollsters get around nonresponse by adjusting (or “weighting”) their samples to be demographically representative of the electorate.
- If their poll should have 70% white voters, for example, but only 35% of respondents are white, then all the white voters get a weight of 2 and everyone else is weighted down by roughly half (0.30/0.65 ≈ 0.46). They repeat that process for a bunch of traits: race, age, education, gender, region, etc.

---

class: center, middle

<img src="figures/pew_weights.png" width="80%" />

???

- Pew Research Center had 12 weighting variables in 2020!

---

class: center, middle

## Demographic nonresponse bias

<img src="figures/weighting_education.jpg" width="100%" />

???

- The canonical explanation now is that in 2016, the nearly uniform bias in polls came from higher rates of nonresponse among non-college-educated voters
- And you can see that with this graph.
- The NYT estimates that if all national 2016 polls had weighted their data to be representative by race and education, their error would have dropped from 9 points to 2 points. Even though that’s not perfect, it is a big improvement!

---

class: center, middle

<img src="figures/aapor.png" width="100%" />

???

- Now, this may sound like a severe problem. But actually, polls today are much more accurate than they used to be.
- (Bias can be high at the state level, yes, but that’s not necessarily inherent to the process of polling. 2022 proved that was true, but we’ll get there.)
- The graph here shows error and bias in national election polls going back to 1936.
- You can see that the error bars are much lower now than they used to be.
  So even while state polls are in their worst shape since 1998, 1998 polls were a heck of a lot better than they were in 1946.

---

class: center, middle

<img src="figures/cover.jpg" width="50%" />

???

- And that was the catalyst for writing this book.
- I wanted to write a persuasive case that polls weren’t as bad as the media made them out to be.
- We got headlines saying “polling is useless” and that the industry suffered a “catastrophic error.”
- But what people in the industry know is that that was wrong. Polls suffered a slightly above-average error in a close race that threw off most forecasts. That happens all the time!
- So I worked on the first draft of my book in 2019. In early 2020, I signed a contract for it, with the idea that 2020 would provide a critical test for the industry. Could pollsters improve their methods and prove that they were relevant again? Would they fix all their problems and give perfect forecasts of elections again?

---

class: center, middle

<img src="figures/tnr_2020_polls.png" width="100%" />

???

- Well, it turns out, they wouldn’t.

---

class: center, middle

<img src="figures/538_2020_states.png" width="75%" />

???

- At the state level, polls in 2020 underestimated Donald Trump by 4 points. That was an even larger bias than in 2016!
- That was a shock to many in the industry who had changed their methods since the last time around. Even the pollsters who were adjusting their polls to be demographically representative of the population had large errors.

---

class: center, middle

## Partisan nonresponse bias

<img src="figures/gq_rs_polls.jpg" width="80%" />

???

- That’s because the problem in 2020 was not demographic nonresponse, but political nonresponse. Pollsters got the wrong mix of Trump voters within demographic groups. This time, the white non-college voters who answered their polls were systematically too Democratic.
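---

### Sketch: why demographic weights miss partisan nonresponse

A minimal sketch, under made-up assumptions, of how a poll can be weighted perfectly on demographics and still be biased. The group shares, support levels and the relative response rate below are hypothetical and are not estimates from 2020.

```python
# Hypothetical electorate: each group's population share and its true
# Trump support. These numbers are illustrative only.
groups = {
    "college": {"pop_share": 0.40, "trump": 0.40},
    "non_college": {"pop_share": 0.60, "trump": 0.60},
}

# Assumption: in every group, Trump voters answer polls at 80% of the
# rate of everyone else.
relative_response = {"trump": 0.8, "other": 1.0}


def observed_trump_share(true_share, rates):
    """Trump share among respondents within one demographic group."""
    trump_respondents = true_share * rates["trump"]
    other_respondents = (1 - true_share) * rates["other"]
    return trump_respondents / (trump_respondents + other_respondents)


# True support across the whole electorate.
true_support = sum(g["pop_share"] * g["trump"] for g in groups.values())

# Poll estimate after weighting each group to its exact population share.
weighted_estimate = sum(
    g["pop_share"] * observed_trump_share(g["trump"], relative_response)
    for g in groups.values()
)

print(round(true_support, 3), round(weighted_estimate, 3))
```

Here the demographic mix is exactly right after weighting, yet the estimate still understates Trump's support by roughly 5 points, because the error sits inside each group rather than between groups.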
---

## Partisan nonresponse bias

--

- ### Problem reaching Trump voters overall

--

- ### And _within_ demographic groups

--

- ### Something you cannot fix with weighting

--

- #### Pollsters can adjust for past vote, but the electorate changes, and certain _types_ of voters may not respond to surveys

???

- And that’s something that is very, very hard to adjust for after the fact.
- For one thing, there is no official record from the Census of how many Democrats there are in the country.
- And while you could run models on past polling data and try to weight to the right percentage of white 2016 Trump voters, that doesn’t help you among people who changed their minds last time.

---

class: center, middle

<img src="figures/cover.jpg" width="50%" />

???

- And so I rewrote most of the book on polls.
- Instead of focusing on why they worked, I interrogated why they didn’t. I read the academic and statistical journals about polling methods and wrote 4 chapters about how they work in reality.
- How the science, and art, of surveying works. And that set up a good conversation about the future of polls: the methods and tools pollsters would use in the future to save their industry from the specter of nonresponse.

---

# Make polls work again

--

### 1. More weighting variables (NYT)

--

### 2. More online and off-phone data collection (SMS, mail)

--

### 3. Mixed samples (private pollsters)

--

### In the pursuit of getting representative (and politically balanced) samples _before and after_ the adjustment stage

???

- From the standpoint of late 2021, when I finished edits on the book, that future looked pretty bright. Pollsters were trying a few things to mitigate, as best as possible, the problems of demographic and political nonresponse that messed up their polls last time.
- One, many of them had started weighting their data to reasonable benchmarks for partisanship and/or past voter behavior. But as we said, that is no guarantee of accuracy.
- So another thing they did is invest in the “design stage” of the poll. That’s what pollsters call all the steps they take to collect their data before adjusting (or “post processing” or “modeling”) it.
- The New York Times, for example, changed their sampling procedure so they would have the chance of interviewing respondents in proportion to the makeup of registered Republicans and Democrats in each state house district in the country. This way they would get rural Republicans and urban Republicans in their correct proportions, instead of, say, the more liberal urban Republicans overwhelming conservative rural ones.

---

class: center, middle

<img src="figures/cohn_2022.png" width="95%" />

???

- And that paid off for them. According to Nate Cohn, they just had their best year ever.

---

class: center, middle

<img src="figures/economist_2022.jpeg" width="95%" />

???

- Actually, polls everywhere just had their best year ever.
- In my analysis of Senate polling for The Economist, I found that the average error of our polling averages tied the lowest error for any election since 1998 (matching 2006). Average bias was also very low: meaningfully lower than in 2016 and 2020, and even beating 2018, when polls also had a good year.

---

class: center, middle

<img src="figures/ekins_2022.jpeg" width="95%" />

???

- Moreover, polls from the traditional firms outperformed new startups that do not share their methods and have tended to overestimate support for Republicans. In other words, if you throw out the junk data, this year was a really really good year for the pollsters.

---

class: center, middle, inverse

# Polling: vindicated?

???

- So what does this all mean for pollsters?

---

## Polling: vindicated?

--

### 1. Death of polling is greatly exaggerated

--

### 2. Luck plays a big role in getting elections "right"

--

### 3. High nonresponse is not necessarily directional

???

- There are three big lessons.
  The first is that claims of the demise of the political survey research industry were greatly exaggerated. People like me have been saying that all along, but, fair enough, now we have the data to back it up.
- But the second is that a lot of the industry’s accuracy is just down to luck. Although some firms like the NYT made methodological improvements, most didn’t. And they still did well.
- That’s because high political nonresponse is not a constant in survey research. It does not increase the long-term directional bias of polling, but increases uncertainty about bias from year to year. So you also shouldn’t expect polls in 2024 to be as accurate as they were this year.

---

class: center, middle, inverse

## Give polls a chance

---

class: center, middle

<img src="figures/voting.jpg" width="95%" />

???

- Because if accuracy can lead you to give pollsters another chance, perhaps we can use polls for better things than predicting elections: like helping people get what they want from their leaders. Like: democracy.

---

class: center, middle

<img src="figures/cover.jpg" width="50%" />

???

- So remember when I said I wrote 4 book chapters about polling methods? Well, that’s a short book. It’s actually 7 chapters, and the other 3 are about the history of “public opinion” as a concept in our democracy, and about how polls get used by our leaders to shape politics and public policy.
- There are a few stories worth mentioning just to make my case. In social science these get called “case studies” but really they’re just anecdotes. Anyway:

---

class: center, middle

<img src="figures/butler_nickerson.png" width="95%" />

???

- The first story: Butler and Nickerson

---

class: center, middle

<img src="figures/old_comp.jpg" width="95%" />

???

- The second: John F Kennedy and the 1960 presidential campaign

---

class: center, middle

<img src="figures/iraq_polls.png" width="95%" />

???

- The third: fraudulent data in Iraq and Afghanistan and its use in the war

---

class: center, middle

<img src="figures/democracy.jpg" width="95%" />

???

- And then there’s the democratic theory of it all.
- We live in a country with a government that at least claims to represent its citizens.
- The mode by which our government provides that representation is in holding elections for public officials who pass laws on voters’ behalf.

---

class: center, middle

<img src="figures/elections.jpg" width="95%" />

???

- But that is limiting in two respects. First, representation depends on the fairness of elections and the salience of issues that come up in them.
- If an election does not incorporate the will of all the citizens, then the system is not representing everyone as it claims to.
- But it also won’t provide congruence between the will of the people and actual public policy unless the issue in question is an issue in the election. If a war breaks out between the United States and Mexico tomorrow, legislators will have no idea how to vote on that issue.

---

class: center, middle, inverse

## Unless… someone takes a poll

---

class: center, middle

<img src="figures/electoral_college.png" width="60%" />

???

- Representation also depends on the structure of political institutions and the willingness of our leaders to listen to us.
- In the United States today, the US Senate and Electoral College are both meaningfully biased toward Republicans. Our calculations show Democrats would need to win the national popular vote for president by 3 points to win the White House and win the Senate by 6 points to win a majority of its seats. (And then they’d have to fight the filibuster to get anything done.)
- This has the effect of insulating whichever party is advantaged by those institutions from the will of the people. That is Republicans today, since they win more rural voters than Democrats.
  So when 60 or 70% of Americans support something like, say, universal background checks for gun purchases or the legitimacy of the 2020 US election, they don’t have to listen.
- So you don't know what the people as a whole really want...

---

class: center, middle, inverse

## Unless… someone takes a poll

???

- So the only way we know what people want… is the polls.
- And it’s incumbent on us to listen to them.
- There is a part of the book that talks about the fundamentally democratic nature of taking a poll.
- When you ask a body politic what it thinks or wants, you are asserting its existence as a collective. You are capturing the inherent community of that group. And you are asserting that some decision rule should dictate what it wants. In our country, that is majority, not minority, rule.

---

class: center, middle

<img src="figures/medicare.jpeg" width="100%" />

???

- The case studies in my book show that the people can use polls for good. That lawmakers do react to them and that they can steer public policy in the direction of whatever the people want from the leaders.
- The analysis I provide supports the view that polls are, on average, and if conducted and analyzed correctly, reliable indicators of what the people want.
- And my hope is that people like you will come to see public opinion polls not only as tools for predicting elections, but as tools for improving our democracy.

---

class: center, middle

<img src="figures/cover.jpg" width="50%" />

???

- So that’s my book. That’s just a selection of my read on the state of the industry and my philosophy of public opinion polling.
- I hope you’ll buy the book and read the fuller version. Thank you.

---

.pull-left[
<img src="figures/cover.jpg" width="90%" />
]

.pull-right[

# Thank you!

### _STRENGTH IN NUMBERS_ is now available.

<br>
<br>

**Website: [gelliottmorris.com](https://www.gelliottmorris.com)**

**Twitter: [@gelliottmorris](http://www.twitter.com/gelliottmorris)**

### Questions?

]

---

_These slides were made using the `xaringan` package for R. They are available online at https://www.gelliottmorris.com/slides/_