polling bias

Back in February I did a post called Women, Ovulation and Voting in 2016, about various researchers attempts to prove or disprove a link between menstrual cycles and their voting preferences. As part of that critique, I had brought up a point that Andrew Gelman made about the inherently dubious nature of anyone claiming to find a 20+ point swing in voting preference. People just don’t tend to vary their party preference that much over anything, so they claim on it’s face is suspect.

I was thinking of that this week when I saw a link to this HBR article from back in April that sort of gender-flips the ovulation study. In this research (done in March), they asked men whether they would vote for Trump or Clinton if the election were today. For half of the men they first asked them a question about how much their wives made in comparison to them. For the other half, they got that question after they’d stated their political preference. The question was intended to be a “gender prime” to get men thinking about gender and present a threat to their sense of masculinity. Their results showed that men who had to think about gender roles prior to answering political preference showed a 24 point shift in voting patterns. The “unprimed” men (who were asked about income after they were asked about political preference) had preferred Clinton by 16 points, and the “primed” men preferred Trump by 8 points. If the question was changed to Sanders vs Trump, the priming didn’t change the gap at all. For women, being “gender primed” actually increased support for Clinton and decreased support for Trump.

Now given my stated skepticism of 20+ point swing claims, I decided to check out what happened here. The full results of the poll are here, and when I took a look at the data there was one thing that really jumped out at me: a large percent of the increased support for Trump came from people switching from “undecided/refuse to answer/don’t know” to “Trump”. Check it out, and keep in mind the margin of error is +/-3.9:

So basically men who were primed were more likely to give an answer (and that answer was Trump) and women who were primed were less like to answer at all. For the Sanders vs Trump numbers, that held true for men as well:

In both cases there was about a 10% swing in men who wouldn’t answer the question when they were asked candidate preference first, but would answer the question if they were “primed” first. Given the margin of error was +/-3.9 overall, this swing seems to be the critical factor to focus on…..yet it was not mentioned in the original article. One could argue that hearing about gender roles made men get more opinionated, but isn’t it also plausible the order of the questions caused a subtle selection bias? We don’t know how many men hung up on the pollster after being asked about their income with respect to their wives, or if that question incentivized other men to stay on the line. It’s interesting to note that men who were asked about their income first were more likely to say they outearned their wives, and less likely to say they earned “about the same” as them…..which I think at least suggests a bit of selection bias.

As I’ve discussed previously, selection bias can be a big a big deal…and political polls are particularly susceptible to it. I mentioned Andrew Gelman previously, and he had a great article this week about his research on “systemic non-response” in political polling. He took a look at overall polling swings, and used various methods to see if he could differentiate between changes in candidate perception and changes in who picked up the phone. His data suggests that about 66-85% of polling swings are actually due to a change in the number of Republicans and Democrats who are willing to answer pollsters questions as opposed to a real change in perception. This includes widely reported on phenomena such as “post convention bounce” or “post debate effects”. This doesn’t mean the effects studied in these polls (or the studies I covered above) don’t exist at all, but that they may be an order of magnitude more subtle than suggested.

So whether you’re talking about ovulation or threats to male ego, I think it’s important to remember that who answers is just as important as what they answer. In this case 692 people were being used to represent the 5.27 million New Jersey voters, so any the potential for bias is, well, gonna be yuuuuuuuuuuuuuuuuuuge.

Selection bias and sampling theory are two of the most unappreciated issues in the popular consumption of statistics. While they present challenges for nearly every study ever done, they are often seen as boring….until something goes wrong. I was thinking about this recently because I was in a meeting on Friday and heard an absolutely stellar example of someone using selection bias quite cleverly to combat a tricky problem. I get to that story towards the bottom of the post, but first I wanted to go over some basics.

First, a quick reminder of why we sample: we are almost always unable to ask the entire population of people how they feel about something. We therefore have to find a way of getting a subset to tell us what we want to know, but for that to be valid that subset has to look like the main population we’re interested in. Selection bias happens when that process goes wrong. How can this go wrong? Glad you asked! Here’s 5 ways:

You asked a non-representative group Finding a truly “random sample” of people is hard. Like really hard. It takes time and money, and almost every researcher is short on both. The most common example of this is probably our personal lives. We talk to everyone around us about a particular issue, and discover that everyone we know feels the same way we do. Depending on the scope of the issue, this can give us a very flawed view of what the “general” opinion is. It sounds silly and obvious, but if you remember that many psychological studies rely exclusively on W.E.I.R.D. college students for their results, it becomes a little more alarming. Even if you figure out how to get in touch with a pretty representative sample, it’s worth noting that what works today may not work tomorrow. For example, political polling took a huge hit after the introduction of cell phones. As young people moved away from landlines, polls that relied on them got less and less accurate. The selection method stayed the same, it was the people that changed.
A non-representative group answered Okay, so you figured out how to get in touch with a random sample. Yay! This means good results, right? No, sadly. The next issue we encounter is when your respondents mess with your results by opting in or opting out of answering in ways that are not random. This is non-response bias, and basically it means “the group that answered is different from the group that didn’t answer”. This can happen in public opinion polls (people with strong feelings tend to answer more often than those who feel more neutrally) or even by people dropping out of research studies(our diet worked great for the 5 out of 20 people who actually stuck with it!). For health and nutrition surveys, people also may answer based on how good they feel about their response, or how interested they are in the topic. This study from the Netherlands,for example, found that people who drink excessively or abstain entirely are much less likely to answer surveys about alcohol use than those who drink moderately. There’s some really interesting ways to correct for this, but it’s a chronic problem for people who try to figure out public opinion.
You unintentionally double counted This example comes from the book Teaching Statistics by Gelman and Nolan. Imagine that you wanted to find out the average family size in your school district. You randomly select a whole bunch of kids and ask them how many siblings they have, then average the results. Sounds good, right? Well, maybe not. That strategy will almost certainly end up overestimating the average number of siblings, because large families are by definition going to have a better chance of being picked in any sample. Now this can seem obvious when you’re talking explicitly about family size, but what if it’s just one factor out of many? If you heard “a recent study showed kids with more siblings get better grades than those without” you’d have to go pretty far in to the methodology section before you might realize that some families may have been double (or triple, or quadruple) counted.
The group you are looking at self selected before you got there Okay, so now that you understand sampling bias, try mixing it with correlation and causation confusion. Even if you ask a random group and get responses from everyone, you can still end up with discrepancies between groups because of sorting that happened before you got there. For example, a few years ago there was a Pew Research survey that showed that 4 out of 10 households had female breadwinners, but that those female breadwinners earned less than male breadwinner households. However, it turned out that there were really 2 types of female breadwinner households: single moms and wives who outearned their husbands. Wives who outearned their husbands made about as much as male breadwinners, while single mothers earned substantially less. None of these groups are random, so any differences between them may have already existed.
You can actually use all of the above to your advantage. As promised, here’s the story that spawned this whole post: Bone marrow transplant programs are fairly reliant on altruistic donors. Registries that recruit possible donors often face a “retention” problem….i.e. where people initially sign up, then never respond when they are actually needed. This is a particularly big problem with donors under the age of 25, who for medical reasons are the most desirable donors. Recently a registry we work with at my place of business told us their new recruiting tactic used to mitigate this problem. Instead of signing people up in person for the registry, they get minimal information from them up front, then send them an email with further instructions about how to finish registering. They then only sign up those people who respond to the email. This decreases the number of people who end up registering to be donors, but greatly increases the number of registered donors who later respond when they’re needed. They use selection bias to weed out those who were least likely to be responsive….aka those who didn’t respond to even one initial email. It’s a more positive version of the Nigerian scammer tactic.

Selection bias can seem obvious or simple, but since nearly every study or poll has to grapple with it, it’s always worth reviewing. I’d also be remiss if I didn’t include a link here for those ages 18 to 44 who might be interested in registering to be a potential bone marrow donor.

graph paper diaries

because some of us need a few more lines to keep everything straight

Men, Masculinity Threats and Voting in 2016

Selection Bias: The Bad, The Ugly and the Surprisingly Useful

Share this:

Share this: