class: left, top, title-slide

# What’s the matter with polling?

## From
Strength in Numbers: How Polls Work + Why We Need Them
###
G. Elliott Morris
| October 13, 2022 | Berkeley, CA

---

<img src="figures/cover.jpg" width="50%" />

---

<img src="figures/tomato.jpeg" width="80%" />

---

# The "soup principle"

<img src="figures/tomato.jpeg" width="60%" />

---
class: center, inverse, middle

# The first polls

---

# "Straw" polls

<img src="figures/polls_street.jpg" width="80%" />

---

<img src="figures/digest_poll.jpeg" width="70%" />

---

<img src="figures/digest_1936.jpg" width="60%" />

---

# The first ("scientific") polls

### - Conducted face-to-face

--

### - Used demographic quotas for representativeness

- Race, gender, age, geography

--

### - Beat straw polls in accuracy (1936)

- By shrinking bias from demographic nonresponse

--

---

# The first ("scientific") polls

### - But fell short of true survey science (1948)

<img src="figures/dewey_truman.jpeg" width="60%" />

---

# Polls 2.0

### - SSRC says: area sampling

--

<img src="figures/houston.png" width="60%" />

---

# Polls 2.0

### - SSRC says: area sampling

### - Gallup implements some partisan controls

- Strata are groups of precincts by 1948 vote choice

--

### - Use rough quotas within geography

--

### - But, preserve interviewer bias

--

---

# Polls 3.0

<img src="figures/phone.jpeg" width="70%" />

--

### Technological change -> better methods

---

# Polls 3.0

### - 1970s: true random sampling (for people with phones)

### - Response rates above 70-80%

### - Rarer instances of severe nonresponse bias

### - Cheaper to conduct = many news orgs poll (CBS, NYT)

---

<img src="figures/aapor.png" width="80%" />

---

# The soup principle: satisfied?

<img src="figures/pew_soup.png" width="80%" />

_Source: Pew Research Center_

---

# The soup principle: satisfied?

### 1. RDD polls are representative (at high response)

### 2. The availability of many different surveys allows for an extra layer of aggregation to control for choices made by individual researchers

---
class: center, inverse, middle

# = perfect polls forever, <br><br>

--

# ...right?
---

### Technological change -> worse methods?

<img src="figures/pew_response_rate.jpg" width="60%" />

_Source: Pew Research Center_

---

### Polarized voting -> harder sampling

<img src="figures/affpol.png" width="75%" />

_Source: Webster & Abramowitz 2017_

---

.center[
## But what if the people you sample don't represent the population?
]

--

#### - People could be very dissimilar by group, meaning small deviations in sample demographics cause big errors (sampling error)

--

#### - Or the people who respond to the poll could be systematically different from the people who don't (nonresponse error)

--

#### - Or your list of potential respondents could be missing people (coverage error)

--

*Polls can also go wrong if they have bad question wording, a fourth type of survey error called "measurement error"

---

## The soup principle in theory

<img src="figures/pew_soup.png" width="90%" />

_Source: Pew Research Center_

---

## The soup principle in practice

<img src="figures/minestrone.jpg" width="60%" />

---
class: center, middle

# Polls today are not soup

--

#### - Declining response rates + Internet = innovations in polling online, but they don't use random sampling

--

#### - And even traditional RDD polls don't have a true random sample (since response rates are too low)

#### - And because of nonresponse

---

## So, to satisfy the soup principle...

### Pollsters use statistical algorithms to ensure their samples match the population on different demographic targets

- Race, age, gender, and region are most common

- Variety of methods (weighting, modeling) available

.pull-left[
<img src="figures/raking.jpg" width="100%" />
]

.pull-right[
<img src="figures/mrp.jpg" width="100%" />
]

---

# These adjustments make polls pretty good!
<img src="figures/aapor.png" width="75%" />

---
class: center, middle

# But in close races, they aren't enough:

---

# 2016: Education weighting

<img src="figures/weighting_education.jpg" width="100%" />

---

# 2020: Partisan nonresponse

<img src="figures/gq_rs_polls.jpg" width="90%" />

---

# 2020: Partisan nonresponse

<img src="figures/gq_rs_polls.jpg" width="40%" />

--

- ### Problem reaching Trump voters overall

--

- ### And _within_ demographic groups

--

- ### Something you cannot fix with weighting

--

- #### Pollsters can adjust for past vote, but the electorate changes, and certain _types_ of, e.g., Trump voters may not respond to surveys

---
class: center, middle, inverse

# Polls and soup in 2022

--

<br>
<br>

## A few ways forward:

---

# Making polls work again

--

### 1. More weighting variables (NYT)

--

### 2. More online and off-phone data collection (SMS, mail)

--

### 3. Mixed samples (private pollsters)

--

### In the pursuit of getting representative (and politically balanced) samples _before and after_ the adjustment stage

---
class: center, middle

### In the pursuit of getting representative (and politically balanced) samples _before and after_ the adjustment stage

--

### To satisfy the soup principle

---
class: center, middle, inverse

# Further questions:

---

# What if that doesn't work?

### 2022 is a critical test: do surveys get better, stay the same, or get worse?

### What if the data-generating process (DGP) remains biased?

### What if the quality of the average poll continues to fall?

---

### Can we trust polls to be precise in close elections?

### If not, what are they good for?

---
class: center, middle

# How Polls Work <u>and Why We Need Them</u>

---

.pull-left[
<img src="figures/cover.jpg" width="90%" />
]

.pull-right[

# Thank you!

### _STRENGTH IN NUMBERS_ is now available.

<br>
<br>

**Website: [gelliottmorris.com](https://www.gelliottmorris.com)**

**Twitter: [@gelliottmorris](http://www.twitter.com/gelliottmorris)**

### Questions?
]

---

_These slides were made using the `xaringan` package for R. They are available online at https://www.gelliottmorris.com/slides/_
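
---

# Appendix: raking, sketched

The weighting slide says pollsters adjust samples to match the population on demographic targets. Below is a minimal Python sketch of one such method, raking (iterative proportional fitting). The respondents, the candidate labels `A` and `B`, and the target shares are all made up for illustration, and `rake` and `weighted_share` are hypothetical helper names, not any pollster's actual code.

```python
def rake(rows, targets, n_iter=100, tol=1e-9):
    """Iterative proportional fitting over categorical margins.

    rows:    one dict per respondent, e.g. {"educ": "college", "age": "under50"}
    targets: {variable: {category: population share}}, shares sum to 1 per variable
    Returns one weight per respondent, scaled so the mean weight is 1.
    Assumes every target category has at least one respondent.
    """
    weights = [1.0] * len(rows)
    for _ in range(n_iter):
        max_shift = 0.0
        for var, shares in targets.items():
            # Weighted total currently sitting in each category of this variable.
            totals = {cat: 0.0 for cat in shares}
            for row, w in zip(rows, weights):
                totals[row[var]] += w
            if any(t == 0.0 for t in totals.values()):
                raise ValueError(f"no respondents in some category of {var!r}")
            grand = sum(totals.values())
            # Scale each category so its weighted share equals the target share.
            factors = {cat: shares[cat] * grand / totals[cat] for cat in shares}
            weights = [w * factors[row[var]] for row, w in zip(rows, weights)]
            max_shift = max(max_shift, max(abs(f - 1.0) for f in factors.values()))
        if max_shift < tol:  # every margin already matches: stop early
            break
    mean_w = sum(weights) / len(weights)
    return [w / mean_w for w in weights]


def weighted_share(rows, weights, var, cat):
    """Weighted share of respondents whose `var` equals `cat`."""
    hits = sum(w for row, w in zip(rows, weights) if row[var] == cat)
    return hits / sum(weights)


# Hypothetical sample of 10: too many college graduates, too few under-50s.
sample = (
    [{"educ": "college", "age": "under50", "vote": "A"}] * 3
    + [{"educ": "college", "age": "50plus", "vote": "A"}] * 1
    + [{"educ": "college", "age": "50plus", "vote": "B"}] * 2
    + [{"educ": "noncollege", "age": "under50", "vote": "A"}] * 1
    + [{"educ": "noncollege", "age": "50plus", "vote": "B"}] * 3
)

# Hypothetical population targets (illustrative numbers, not census figures).
targets = {
    "educ": {"college": 0.4, "noncollege": 0.6},
    "age": {"under50": 0.5, "50plus": 0.5},
}

weights = rake(sample, targets)
raw_a = weighted_share(sample, [1.0] * len(sample), "vote", "A")
raked_a = weighted_share(sample, weights, "vote", "A")
print(f"Candidate A, unweighted: {raw_a:.3f}")
print(f"Candidate A, raked:      {raked_a:.3f}")
```

Raking only fixes the margins it is given. If respondents differ from nonrespondents _within_ a weighting cell, as with the partisan nonresponse in 2020, the weighted estimate stays biased.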