5 Interesting Things About IQ Self-Estimates

After my post last week about what goes wrong when students self-report their grades, the Assistant Village Idiot left a comment wondering how this would look if we changed the topic to IQ. He wondered specifically about Quora, a question-and-answer website that has managed to spawn its own meta-genre of questions asking “why is this website so obsessed with IQ?”.

Unsurprisingly, there is no particular research on specific websites and IQ self-reporting, but there is actually some interesting literature on people’s ability to estimate their own IQ and that of those around them. Most of this research comes from a British researcher at University College London, Adrian Furnham.  Studying how well people actually know themselves kinda sounds like a dream job to me, so kudos to you Adrian. Anyway, ready for the highlights?

  1. IQ self-estimates are iffy at best One of the first things that surprised me about IQ self-estimates vs actual IQ was how weak the correlation was. One study found an r=.3, another r=.19.  This data was gathered from people who first took a test, then were asked to estimate their results prior to actually getting them. In both cases, it appears that people are sort of on the right track, but not terrific at pinpointing how smart they are (the toy simulation after this list gives a feel for just how loose an r of .3 is). One wonders if this is part of the reason for the IQ test obsession….we’re rightfully insecure about our ability to figure this out on our own.
  2. There’s a gender difference in predictions Across cultures, men tend to rank their own IQ higher than women do, and both genders consistently rank their male relatives (fathers, grandfathers and sons) as smarter than their female relatives (mothers, grandmothers and daughters). This often gets reported as male hubris vs female humility (indeed, that’s the title of the paper), but I note they didn’t actually compare the estimates to measured results. Given that many of these studies are conducted on psych undergrad volunteers, is it possible that men are more likely to self-select when they know IQ will be measured? Some of these studies had average IQ guesses of 120 (for women) and 127 (for men)….that’s not even remotely an average group, and I’d caution against extrapolation.
  3. Education may be a confounding factor for how we assess others One of the other interesting findings in the “rate your family member” game is that people rank previous generations as half a standard deviation less intelligent than they rank themselves. This could be due to the Flynn effect, but the other suggestion is that it’s hard to rank IQ accurately when educational achievement is discordant. Within a cohort, educational achievement is actually pretty strongly correlated with IQ, so re-calibrating for other generations could be tricky.  In other words, if you got a master’s degree and your grandmother only graduated high school, you may think your IQs are further apart than they really are. Somewhat supporting this theory, as time has progressed, the gap between self rankings and grandparent rankings has closed. Interesting to think how this could also affect some of the gender effects seen in #2, particularly for prior generations.
  4. Being smart may not be the same as avoiding stupidity One of the more interesting studies I read looked at the correlation between IQ self-report and personality traits, and found that some traits made you more likely to think you had a high IQ. One of these traits was stability, which confused me because you don’t normally think of stable people as being overly high on themselves. When I thought about it for a bit though, I wondered if stable people were defining being “smart” as “not doing stupid things”.  Given that many stupid actions are probably more highly correlated with impulsiveness (as opposed to low IQ), this could explain the difference. I don’t have proof, but I suspect a stable person A with an IQ of 115 will mostly do better than an unstable person B with an IQ of 115, but person A may attribute this difference to intelligence rather than impulse control. It’s an academic distinction more than a practical one, but it could be confusing things a bit.
  5. Disagreeableness is associated with higher IQs, and with self-perception of higher IQs  Here’s an interesting chicken and egg question for you: does having a high IQ make you more disagreeable, or does being disagreeable make you think you have a higher IQ? Alternative explanation: is some underlying factor driving both? It turns out that having a high IQ is associated with being disagreeable, and being disagreeable is associated with ranking your IQ as higher than others’. This probably affects some of the IQ discussions to a certain degree….the “here’s my high IQ now let’s talk about it” crowd probably really is not as agreeable as those who want to talk about sports or exchange recipes.
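
As promised in #1, here’s a quick toy simulation of what a correlation of about .3 between self-estimates and actual scores looks like in practice. These are made-up normal distributions, not data from any of the studies above; the point is just to get a feel for the number.

```python
# Toy simulation only: assumes both actual IQ and self-estimates are normal
# (mean 100, SD 15) and correlate at r = .3, roughly the weaker-but-real
# relationship described in #1. Not data from the actual studies.
import random
from math import sqrt

random.seed(0)
r = 0.3                      # assumed correlation between score and self-estimate
n = 100_000
within_10 = above_avg = both_above_avg = 0
for _ in range(n):
    z_score = random.gauss(0, 1)
    z_est = r * z_score + sqrt(1 - r ** 2) * random.gauss(0, 1)
    iq, est = 100 + 15 * z_score, 100 + 15 * z_est
    within_10 += abs(est - iq) <= 10
    if iq > 100:
        above_avg += 1
        both_above_avg += est > 100
print(f"estimates within 10 points of the actual score: {within_10 / n:.0%}")
print(f"above-average scorers who also guessed above average: {both_above_avg / above_avg:.0%}")
```

Under those assumptions, only around 40% of estimates land within 10 points of the real score, and people who actually score above average only guess they’re above average about 60% of the time….on the right track, but just barely.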

So there you have it! My overall impression from reading this is that IQ is one of those things where people don’t appreciate or want to acknowledge small differences. In looking at some of the studies where people ranked their parents against each other, I was surprised how many pointed to a 15 point gap between parents, or a 10 point gap between siblings. Additionally, it’s interesting that we appear to have a pretty uneasy relationship with IQ tests in general. Women in the US, for example, are more likely to take IQ tests than men are but less likely to trust their validity. To confuse things further, they are also more likely to believe the tests are useful in educational settings. Huh? I’d be interested to see a self-estimated IQ compared to an actual understanding of what IQ is/is not, and then compare that to an actual scored IQ test. That might flesh out where some of these conflicting feelings are coming from.

5 Things You Should Know About Orchestras and Blind Auditions

Unless you were going completely off the grid this week, you probably heard about the now-infamous “Google memo”.  Written by a (since fired) 28-year-old software engineer at Google, the memo is a ten-page document in which the author lays out his beliefs about why gender gaps in tech fields continue to exist. While the author did not succeed in getting any policies at Google changed, he did manage to kick off an avalanche of hot takes examining whether the gender/tech gap is due to nature (population level differences in interests/aptitude) or nurture (embedded social structures that make women unwelcome in certain spaces). I have no particular interest in adding another take to the pile, but I did see a few references to the “blind orchestra auditions study” that reminded me I had been wanting to write about that one for a while, to take a deeper dive into a few things it did and did not say.

For those of you who don’t know what I’m talking about, here’s the rundown: back in the 1970s, top orchestras in the US were 5% female. By the year 2000, they were up to almost 30% female. Part of the reason for the change was the introduction of “blind auditions”, where the people holding the tryouts couldn’t see the identity of the person trying out. This finding normally gets presented without a lot of context, but it’s good to note that someone actually did decide to study this phenomenon to see if the two things really were related or not. They got their hands on all of the tryout data for quite a few major orchestras (they declined to name which ones, as that was part of the agreement for getting the data) and tracked what happened to individual musicians as they tried out. This led to a data set that captured overall population trends, but could also be used to track individuals. You can download the study here, but these are my highlights:

  1. Orchestras are a good place to measure changing gender proportions, because orchestra jobs don’t change. Okay, first up is an interesting “control your variables” moment. One of the things I didn’t realize about orchestras (though maybe I should have) is that the top ones have not changed in size or composition in years. So basically, if you suddenly are seeing more women, you know it’s because the proportion of women overall is increasing across many instruments. In the words of the authors: “An increase in the number of women from, say, 1 to 10, cannot arise because the number of harpists (a female-dominated instrument), has greatly expanded. It must be because the proportion female within many groups has increased.”
  2. Blind auditions weren’t necessarily implemented to cut down on sexism. Since this study is so often cited in the context of sexism and bias, I had not actually ever read why blind auditions were implemented in the first place. Interestingly, according to the paper, the initial concern was actually nepotism. Basically, orchestras were filled with their conductors’ students, and other potentially better players were shut out. When they opened the auditions up further, they discovered that when people could see who was auditioning, they still showed preferential treatment based on resume. This is when they decided to blind the audition, to make sure that all preconceived notions were controlled for. The study authors chose to focus on the impact this had on women (in their words): “Because we are able to identify sex, but no other characteristics for a large sample, we focus on the impact of the screen on the employment of women.”
  3. Blinding can help women out Okay, so first up, the most often reported findings: blind auditions appear to account for about 25% of the increase in women in major orchestras. When they studied individual musicians, they found that women who tried out in blind and non-blind auditions were more successful in the blinded auditions. They also found that having a blind final round increased the chances a woman was picked by about 33%. This is what normally gets reported, and it is a correct reporting of the findings.
  4. Blinding doesn’t always help women out One of the more interesting findings of the study that I have not often seen reported: overall, women did worse in the blinded auditions. As I mentioned up front, the study authors had the data for groups and for individuals, and the findings from #3 were pulled from the individual data. When you look at the group data, you actually see the opposite effect. The study authors suggest one possible explanation for this: adopting a “blind” process dropped the average quality of the female candidate pool. This makes a certain amount of sense. If you sense you are a borderline candidate, but also think there may be some bias against you, you would be more likely to put your time into an audition where you knew the bias factor would be taken out. Still, that result interested me (the made-up numbers after this list show how both findings can be true at once).
  5. The effects of blinding can depend on the point in the process Even after controlling for all sorts of factors, the study authors did find that bias was not equally present at every stage. For example, they found that blind auditions seemed to help women most in preliminary and final rounds, but actually hurt them in the semi-final rounds. This would make a certain amount of sense….presumably the people doing the judging may be using different criteria in each round, and some of those may be biased in different ways than others. Assuming that all parts of the process work the same way is probably a bad assumption to make.
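
To make the #4 point concrete, here’s a quick bit of made-up arithmetic (mine, not the authors’) showing how blinding can improve every individual woman’s odds while the overall female success rate still drops, simply because the composition of who auditions changes:

```python
# Hypothetical numbers only: each group of women does better under blind
# auditions, but blinding also draws in many more borderline candidates,
# so the overall female success rate falls anyway.

# (number auditioning, success rate) under open auditions
strong_open, borderline_open = (40, 0.10), (10, 0.02)
# under blind auditions, both groups improve, but far more borderline
# candidates now decide it's worth trying out
strong_blind, borderline_blind = (40, 0.12), (60, 0.03)

def overall_rate(*groups):
    total = sum(n for n, _ in groups)
    hired = sum(n * rate for n, rate in groups)
    return hired / total

print(f"open auditions:  {overall_rate(strong_open, borderline_open):.1%}")   # 8.4%
print(f"blind auditions: {overall_rate(strong_blind, borderline_blind):.1%}") # 6.6%
```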

Overall, while the study is potentially outdated (from 2001…using data from the 1950s-1990s), I do think it’s an interesting frame of reference for some of our current debates. One article I read about it talked about the benefit of industries figuring out how to blind parts of their interview process because it gets them to consider all sorts of different people….including those lacking traditional educational requirements. With many industries dominated by those who went to exclusive schools, hiding identity could have some unexpected benefits for all sorts of people. However, as this study also shows, it’s probably a good idea to keep the limitations of this sort of blinding in mind. Even established bias is not a consistent force that produces identical outcomes at all time points, and any measure you institute can quickly become a target that changes behavior. Regardless, I think blinding is a good thing. All of us have our own pitfalls, and we all might be a little better off if we see our expectations toppled occasionally.

4 Examples of Confusing Cross-Cultural Statistics

In light of my last post about variability in eating patterns across religious traditions, I thought I’d put together a few other examples of times when attempts to compare data across international borders got a little more complicated than you would think.

Note: not all of these issues changed the conclusions people were trying to reach, but they did make things a little confusing.

  1. Who welcomes the refugee About a year or so ago, when Syrian refugees were making headlines, there was a story going around that China was the most welcoming country for people fleeing their homeland. The basis of the story was an Amnesty International survey that showed a whopping 46% of Chinese citizens saying they would be willing to take a refugee into their home…..far more than any other country. The confusion arose when a Quartz article pointed out that there is no direct Chinese translation for the word “refugee”, and the word used in the survey meant “person who has suffered a calamity” without clarifying whether that person is international or lives down the street. It’s not clear how this translation may have influenced the responses, but a different question on the same survey that made the “international” part clearer received much lower support.
  2. The French Paradox (reduced by 20%) In the process of researching my last post, I came across a rather odd tidbit I’d never heard of before regarding the “French Paradox”. A term that originated in the 80s, the French Paradox is the apparent contradiction that French people eat lots of cholesterol/saturated fat and yet don’t get heart disease at the rates you would expect based on data from other countries. Now I had heard of this paradox before, but the part I hadn’t heard was the assertion that French doctors under-counted deaths from coronary heart disease. When researchers compared death certificates to data collected by more standardized methods, they found that this was true:

    They suspect the discrepancy arose because doctors in many countries automatically attribute sudden deaths in older people to coronary heart disease, whereas the French doctors were only doing so if they had clinical evidence of heart disease. This didn’t actually change France’s rank very much; they still have a lower-than-expected rate of heart disease. However, it did nearly double the reported incidence of CHD and cut the paradox down by about 20%.

  3. Crime statistics of all sorts This BBC article is a few years old, but it has some interesting tidbits about cross-country crime rate comparisons. For example, Canada and Australia have the highest kidnapping rates in the world. The reason? They count all parental custody disputes as kidnappings, even if everyone knows where the child is. Other countries keep this data separate and only use “kidnapping” to describe a missing child. Countries that widen their definitions of certain crimes tend to see an uptick in those crimes, like Sweden saw with rape when it widened its definition in 2005.
  4. Infant mortality  This World Health Organization report has some interesting notes about how different countries count infant mortality, and it notes that some countries (such as Belgium, France and Spain) only count infant mortality in infants who survive beyond a certain time period after birth, such as 24 hours. Those countries tend to have lower infant mortality rates but higher stillbirth rates than countries that don’t set such a cutoff. Additionally, as of 2008 approximately 3/4 of countries lack the infrastructure to count infant mortality through hospitals and do so through household surveys instead.

Like I said, not all of these change the conclusions people come to, but they are good things to keep in mind.

5 Things You Should Know About the “Backfire Effect”

I’ve been ruminating a lot on truth and errors this week, so it was perhaps well timed that someone sent me this article on the “backfire effect” a few days ago. The backfire effect is the name given to a psychological phenomenon in which attempting to correct someone’s facts actually increases their belief in their original error. Rather than admit they are wrong when presented with evidence, the narrative goes, people double down. Given the current state of politics in the US, this has become a popular thing to talk about. It’s popped up in my Facebook feed and is commonly cited as the cause of the “post-fact” era.

So what’s up with this? Is it true that no one cares about facts any more? Should I give up on this whole facts thing and find something better to do with my time?

Well, as with most things, it turns out it’s a bit more complicated than that. Here’s a few things you should know about the state of this research:

  1. The most highly cited paper focused heavily on the Iraq War The first paper that made headlines was from Nyhan and Reifler back in 2010, and was performed on college students at a Midwestern Catholic university. They presented some students with stories including political misperceptions, and some with stories that also had corrections. They found that the students who got corrections were more likely to believe the original misperception. The biggest issue this showed up with was whether or not WMDs were found in Iraq. They also tested facts/corrections around the tax code and stem cell research bans, but it was the WMD findings that grabbed all the headlines. What’s notable is that the research was performed in 2005 and 2006, when the Iraq War was heavily in the news.
  2. The sample size was fairly small and composed entirely of college students One of the primary weaknesses of the first papers (as stated by the authors themselves) is that 130 college students are not really a representative sample. The sample was half liberal and 25% conservative. It’s worth noting that the authors believed this was representative of their campus, meaning all of the conservatives were in an environment where they were the minority. Given that one of the conclusions of the paper was that conservatives seemed to be more prone to this effect than liberals, that’s an important point.
  3. A new paper with a broader sample suggests the “backfire effect” is actually fairly rare. Last year, two researchers (Porter and Wood) polled 8,100 people from all walks of life on 36 political topics and found…..WMDs in Iraq were actually the only issue that provoked a backfire effect. A great Q&A with them can be found here. This is fascinating if it holds up, because it means the original research was mostly confirmed, but any attempt at generalization was pretty wrong.
  4. When correcting facts, phrasing mattered One of the more interesting parts of the Porter/Wood study was how the researchers described their approach to corrections. In their own words: “Accordingly, we do not ask respondents to change their policy preferences in response to facts–they are instead asked to adopt an authoritative source’s description of the facts, in the face of contradictory political rhetoric”. They heartily reject “corrections” that are aimed at making people change their mind on a moral stance (like, say, abortion) and focus only on facts. Even with the WMD question, they found that the more straightforward and simple the correction statement, the more people of all political persuasions accepted it.
  5. The 4 study authors are now working together In an exceptionally cool twist, the authors who came to slightly different conclusions are now working together. The Science of Us gives the whole story here, but essentially Nyhan and Reifler praised Porter and Wood’s work, then said they should all work together to figure out what’s going on. They apparently gathered a lot of data during the height of election season and hopefully we will see those results in the near future.

I think this is an important set of points, both because it’s heartwarming (and intellectually awesome!) to see senior researchers accepting that some of their conclusions may be wrong and actually working with others to improve their own work, and because I’ve heard a lot of people in my personal life commenting that “facts don’t work”, so they basically avoid arguing with those who don’t agree with them. If it’s true that facts DO work as long as you’re not focused on getting someone to change their mind on the root issue, then it’s REALLY important that we know that. It’s purely anecdotal, but I can note that this has been my experience with political debates. Even the most hardcore conservatives and liberals I know will make concessions if you clarify you know they won’t change their mind on their moral stance.

5 Things You Should Know About Statistical Process Control Charts

Once again I outdo myself with the clickbait-ish titles, huh? Sorry about that, I promise this is actually a REALLY interesting topic.

I was preparing a talk for a conference this week (today actually, provided I get this post up when I plan to), and I realized that statistical process control charts (or SPC charts for short) are one of the tools I use quite often at work but don’t really talk about here on the blog. Between those and my gif usage, I think you can safely guess why my reputation at work is a bit, uh, idiosyncratic. For those of you who have never heard of an SPC chart, here’s a quick orientation. First, they look like this:

(Image from qimacros.com, an excellent software package for generating these)

The chart is used for plotting something over time….hours, days, weeks, quarters, years, or “order in line”…take your pick.  Then you map some ongoing process or variable you are interested in…..say employee sick calls. You measure employee sick calls in some way (# of calls or % of employees calling in) in each time period. This sets up a baseline average, along with “control limits”, which are basically 1, 2 and 3 standard deviation ranges. If at some point your rate/number/etc starts to go up or down, the SPC chart can tell you if the change is significant or not based on where it falls on the plot.  For example, if you have one point that falls outside the 3 standard deviation line, that’s significant. If two in a row fall outside the 2 standard deviation line, that’s significant as well. The rules for this vary by industry, and Wiki gives a pretty good overview here. At the end of this exercise you have a really nice graph of how you’re doing with a good visual of any unusual happenings, all with some statistical rigor behind it. What’s not to love?
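
If you want to play with the idea, here’s a minimal sketch of those two rules in code. It assumes a simple individuals-style chart where the center line and limits come straight from a baseline period’s mean and standard deviation, with made-up sick-call numbers; real SPC software typically estimates sigma from moving ranges and applies a longer rule set, so treat this as an illustration rather than a recipe.

```python
# A minimal sketch, not production SPC code: limits come from a baseline
# period's mean and standard deviation, and only the two rules mentioned
# above are checked (one point beyond 3 sigma, two in a row beyond 2 sigma).

def control_limits(baseline):
    n = len(baseline)
    mean = sum(baseline) / n
    sd = (sum((v - mean) ** 2 for v in baseline) / (n - 1)) ** 0.5
    return mean, sd

def flag_points(values, mean, sd):
    flags = []
    for i, v in enumerate(values):
        if abs(v - mean) > 3 * sd:                      # single point beyond 3 sigma
            flags.append((i, v, "beyond 3 sigma"))
        elif i > 0 and min(v, values[i - 1]) > mean + 2 * sd:
            flags.append((i, v, "two in a row above 2 sigma"))
        elif i > 0 and max(v, values[i - 1]) < mean - 2 * sd:
            flags.append((i, v, "two in a row below 2 sigma"))
    return flags

# Made-up monthly sick-call percentages: a year of baseline, then four new months
baseline = [10, 9, 11, 10, 8, 12, 10, 9, 11, 10, 12, 9]
new_months = [11, 13, 14, 16]

mean, sd = control_limits(baseline)
print(f"center line = {mean:.1f}%, UCL = {mean + 3 * sd:.1f}%, LCL = {mean - 3 * sd:.1f}%")
for i, v, rule in flag_points(new_months, mean, sd):
    print(f"new month {i + 1}: {v}% flagged ({rule})")
```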

Anyway, I think because they take a little bit of getting used to,  SPC charts do not always get the love they deserve. I would like to rectify this travesty, so here’s 5 things you should know about them to tempt you to go learn more about them:

  1. SPC charts are probably more useful for most businesses than hypothesis testing While most high school level statistics classes at least take a stab at explaining p-values and hypothesis testing to kids, almost none of them even show an example of a control chart. And why not? I think it’s a good case of academia favoring itself. If you want to test a new idea against an old idea or to compare two things at a fixed point in time, p-values and hypothesis testing are pretty good. That’s why they’re used in most academic research. However, if you want to see how things are going over time, you need statistical process control. Since this is more relevant for most businesses, people who are trying to keep track of any key metric should DEFINITELY know about these. Six Sigma and many process improvement classes teach statistical process control, but these charts still don’t seem widely used outside of those settings. Too bad. They are practical, they can be updated easily, and they give you a way of monitoring what’s going on, plus a lot of good information about how your processes are doing. Like what? Well, like #2 on this list:
  2. SPC charts track two types of variation Let’s get back to my sick call example. Let’s say that in any given month, 10% of your employees call in sick. Now most people realize that not every month will be exactly 10%. Some months it’s 8%, some months it’s 12%. What statistical process control charts help calculate is when those fluctuations are most likely just random (known as common cause variation) and the point at which they are probably not so random (special cause variation). They set parameters that tell you when you should pay attention. They are better than p-values for this because you’re not really running an experiment every month….you just want to make sure everything’s progressing as it usually does. The other nice part is that this translates easily into a nice visual, so you can say with confidence “this is how it’s always been” or “something unusual is happening here” and have more than your gut to rely on.
  3. SPC charts help you test new things, or spot concerning trends quickly SPC charts were really invented for manufacturing plants, and were perfected and popularized in post-WWII Japan. One of the reasons for this is that manufacturers really loved having an early warning about when a machine might be breaking down or an employee might not be following the process. If the process goes above or below a certain red line (aka the “upper/lower control limit”) you have a lot of confidence something has gone wrong and can start investigating right away. In addition to this, you can see if a change you made helps anything. For example, if you do a handwashing education initiative, you can see what percentage of your employees call in sick the next month. If it’s below the lower control limit, you can say it was a success, just like with traditional p-values/hypothesis testing. HOWEVER, unlike p-values/hypothesis testing, SPC charts make allowances for time. Let’s say you drop the sick calls to 9% per month, but then they stay down for 7 months. Your SPC chart rules now tell you you’ve made a difference. SPC charts don’t just take into account the magnitude of the change, but also the duration. Very useful for any metric you need to track on an ongoing basis.
  4. They encourage you not to fix what isn’t broken One of the interesting reasons SPC charts caught on so well in the manufacturing world is that the idea of “opportunity cost” was well established. If your assembly line puts out a faulty widget or two, it’s going to cost you a lot of money to shut the whole thing down. You don’t want to do that unless it’s REALLY broken. For our sick call example, it’s possible that what looks like an increase (say to 15% of your workforce) isn’t a big deal and that trying to interfere will cause more harm than good. Always good to remember that there are really two ways of being wrong: missing a problem that does exist, and trying to fix one that doesn’t.
  5. There are quite a few different types One of the extra nice things about SPC charts is that there are actually 6 types to choose from, depending on what kind of data you are working with. There’s a helpful flowchart to pick your type here, but a good computer program (I use QI Macros) can actually pick for you. One of the best parts of this is that some of them can deal with small and varying sample sizes, so you can finally show that going from 20% to 25% isn’t really impressive if you just lowered your volume from 5 to 4 (the sketch after this list shows why).
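
To put some numbers on that last point, here’s a rough sketch of the p-chart limit formula (the chart type used for proportions with varying sample sizes). The numbers are made up; the takeaway is just how wide the limits get when the denominator is tiny.

```python
# Rough illustration of p-chart limits: p-bar +/- 3 * sqrt(p-bar*(1-p-bar)/n).
# With a subgroup of 4 or 5, the limits are so wide that a jump from 20% to
# 25% is indistinguishable from noise; with a big subgroup it would stand out.
from math import sqrt

def p_chart_limits(p_bar, n):
    half_width = 3 * sqrt(p_bar * (1 - p_bar) / n)
    return max(0.0, p_bar - half_width), min(1.0, p_bar + half_width)

p_bar = 0.20                              # assumed long-run average proportion
for n, observed in [(5, 0.25), (4, 0.25), (1000, 0.25)]:
    lcl, ucl = p_chart_limits(p_bar, n)
    verdict = "special cause signal" if not (lcl <= observed <= ucl) else "common cause noise"
    print(f"n={n:>4}: limits ({lcl:.2f}, {ucl:.2f}) -> observing 25% is {verdict}")
```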

So those are some of my reasons you should know about these magical little charts. I do wish they’d get used more often because they are a great way of visualizing how you’re doing on an ongoing basis.

If you want to know more about the math behind them and more uses (especially in healthcare), try this presentation. And wish me luck on my talk! Pitching this stuff right before lunch is going to be a challenge.

Moral Outrage, Cleansing Fires and Reasonable Expectations

Last week, the Assistant Village Idiot forwarded me a new paper called “A cleansing fire: Moral outrage alleviates guilt and buffers threats to one’s moral identity“. It’s behind a ($40) paywall, but Reason magazine has an interesting breakdown of the study here, and the AVI does his take here. I had a few thoughts about how to think about a study like this, especially if you don’t have access to the paper.

So first, what did the researchers look at and what did they find? Using Mechanical Turk, the researchers had subjects read articles that talked about either labor exploitation in other countries or the effects of climate change. They found that personal feelings of guilt about those topics predicted greater outrage at a third-party target, a greater desire to punish that target, and that getting a chance to express that outrage decreased guilt and increased feelings of personal morality. The conclusion being reported is (as the Reason.com headline says) “Moral outrage is self-serving” and “Perpetually raging about the world’s injustices? You’re probably overcompensating.”

So that’s what’s being reported. How do we think through this when we can’t see the paper? Here are 5 things I’d recommend:

  1. Know what you don’t know about sample sizes and effect sizes Neither the abstract nor the write-ups I’ve seen mention how large the reported effects were or how many people participated. Since it was a Mechanical Turk study I am assuming the sample size was reasonable, but the effect size is still unknown. This means we don’t know if it’s one of those unreasonably large effect sizes that should alarm you a bit or one of those small effect sizes that is statistically but not practically significant. Given that the reported effect size heavily influences the false report probability, this is relevant (there’s a rough sketch of that calculation after this list).
  2. Remember the replication possibilities Even if you think a study found something quite plausible, it’s important to remember that fewer than half of psychological studies end up replicating exactly as the first paper reported. A replication can turn out lots of different ways, and even if the paper does replicate, it may end up with caveats that didn’t show up in the first paper.
  3. Tweak a few words and see if your feelings change Particularly when it comes to political beliefs, it’s important to remember that context matters. This particular study calls to mind liberal issues, but do we think it applies to conservative issues too? Everyone has something that gets them upset, and it’s interesting to think through how that would apply to what matters to us. When the Reason.com commenters read the study article, some of them quickly pointed out that of course their own personal moral outrage was self-serving. Free speech advocates have always been forthright that they don’t defend pornographers and offensive people because they like those people, but because they want to preserve free speech rights for themselves and others. Self-serving moral outrage isn’t so bad when you put it that way.
  4. Assume the findings will get more generic In addition to the word tweaks in point #3, it’s likely that subsequent replications will tone down the findings. As I covered in my Women Ovulation and Voting post, 3 studies took findings from “women change their vote and values based on their menstrual cycle” to “women may exhibit some variation in face preference based on menstrual cycle”. This happened because some parts of the initial study failed to replicate, and some caveats got added. Every study that’s done will draw another line around the conclusions and narrow their scope.
  5. Remember the limitations you’re not seeing One of the most important parts of any paper is where the authors discuss the limitations of their own work. When you can’t read the paper, you can’t see what they thought their own limitations were. Additionally, it’s hard to tell if there were any interesting non-findings that didn’t get reported. The limitations that exist from the get-go give a useful indication of what might come up in the future.
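
Since #1 leaned on the phrase “false report probability”, here’s the back-of-the-envelope version of that calculation. These numbers are illustrative, not estimates for this particular paper: the point is just that power (which is driven by effect size and sample size) and the prior plausibility of the hypothesis together determine how much a “significant” result should move you.

```python
# Illustrative only: the chance a statistically significant result is a false
# positive, given the study's power and the prior odds the hypothesis is true.
# Power depends on effect size and sample size, which is why not knowing them
# (point #1 above) matters.

def false_report_probability(power, prior, alpha=0.05):
    """P(no real effect | significant result)."""
    true_positives = power * prior
    false_positives = alpha * (1 - prior)
    return false_positives / (true_positives + false_positives)

for power in (0.8, 0.3):          # well-powered vs underpowered study
    for prior in (0.5, 0.1):      # plausible hypothesis vs long shot
        frp = false_report_probability(power, prior)
        print(f"power={power}, prior={prior}: chance a 'significant' finding is false = {frp:.0%}")
```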

So in other words….practice reasonable skepticism. Saves time, and the fee to read the paper.

Who Votes When? Untangling Non-Citizen Voting

Right after the election, most people in America saw or heard about this Tweet from then President elect Trump:

I had thought this was just random bluster (on Twitter????? Never!), but then someone sent me this article. Apparently that comment was based on an actual study, and the study author is now giving interviews. It turns out he’s pretty unhappy with everyone….not just with Trump, but also with Trump’s opponents who claim that no non-citizens voted. So what did his study actually say? Let’s take a look!

Some background: The paper this is based on is called “Do Non-Citizens Vote in US Elections?” by Richman et al. and was published back in 2014. It took data from a YouGov survey and found that 6.4% of non-citizens voted in 2008 and 2.2% voted in 2010. Non-citizenship status was based on self report, as was voting status, though the demographic data of participants was checked against that of their stated voting district to make sure the numbers at least made sense.

So what stood out here? A few things:

  1. The sample size While the initial survey of voters was pretty large (over 80,000 between the two years), the number of those identifying themselves as non-citizens was rather low: 339 and 489 for the two years. There were a total of 48 people who stated that they were not citizens and that they voted. For reference, it seems there are about 20 million non-citizens currently residing in the US.
  2. People didn’t necessarily know they were voting illegally One of the interesting points made in the study was that some of this voting may be unintentional. If you are not a citizen, you are never allowed to vote in national elections, even if you are a permanent resident/have a green card. The study authors wondered if some people didn’t know this, so they analyzed the education levels of those non-citizens who voted. It turns out non-citizens with less than a high school degree are more likely to vote than those with more education. This is actually the opposite of the trend seen among citizens AND naturalized citizens, suggesting that some of those voters had no idea that what they were doing was illegal.
  3. Voter ID checks are less effective than you might think If your first question on reading #2 was “how could you just illegally vote and not know it?”, you may be presuming your local polling place puts a little more into screening people than it does. According to the participants in this study, not only were non-citizens allowed to register and cast a vote, but a decent number of them actually passed an ID check first. About a quarter of non-citizen voters said they were asked for ID prior to voting, and 2/3rds of those said they were then allowed to vote. I suspect the issue is that most polling places don’t actually have much to check their information against. Researching citizenship status would take time and money that many places just don’t have. Another interesting twist to this is that social desirability bias may kick in for those who don’t know voting is illegal. Voting is one of those things more people say they do than actually do, so someone who didn’t know they couldn’t legally vote would be more likely to say they voted even if they didn’t. Trying to make ourselves look good is a universal quality.
  4. Most of the illegal voters were white Non-citizen voters actually tracked pretty closely with their proportion of the population, and about 44% of them were white. The next most common demographic was Hispanic at 30%, then black, then Asian. In terms of proportion, the same percent of white non-citizens voted as Hispanic non-citizens.
  5. Non-citizens are unlikely to sway a national election, but could sway state level elections When Trump originally referenced this study, he specifically was using it to discuss national popular vote results. In the Wired article, they do the math and find that even if all of the numbers in the study bear out it would not sway the national popular vote. However, the original study actually drilled down to a state level and found that individual states could have their results changed by non-citizen voters. North Carolina and Florida would both have been within the realm of mathematical possibility for the 2008 election, and for state level races the math is also there.

Now, how much confidence you place in this study is up to you. Given the small sample size, things like selection bias and non-response bias definitely come into play. That’s true any time you’re trying to extrapolate the behavior of 20 million people from the behavior of a few hundred. It is important to note that the study authors did a LOT of due diligence attempting to verify and reality-check the numbers they got, but it’s never possible to control for everything.
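
To put rough numbers on that uncertainty, here’s my own back-of-the-envelope calculation (not one from the paper): a simple confidence interval on the raw counts mentioned above, ignoring the survey weights and verification steps the authors actually applied.

```python
# Back-of-the-envelope only: a normal-approximation 95% confidence interval for
# the raw counts cited above (48 self-reported voters out of 339 + 489 = 828
# non-citizen respondents), ignoring survey weights and the authors' checks.
from math import sqrt

voters, respondents = 48, 339 + 489
non_citizen_population = 20_000_000          # rough figure cited above

p = voters / respondents
margin = 1.96 * sqrt(p * (1 - p) / respondents)
low, high = p - margin, p + margin
print(f"share reporting a vote: {p:.1%} (95% CI roughly {low:.1%} to {high:.1%})")
print(f"scaled to ~20 million non-citizens: roughly {low * non_citizen_population:,.0f} "
      f"to {high * non_citizen_population:,.0f} people")
```

That’s a wide range even before you start worrying about the biases above, which is a good reason to hold the more specific claims loosely.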

If you do take this study seriously, it’s interesting to note what the authors actually thought the most effective counter-measure against non-citizen voting would be: education. Since they found that low education levels were correlated with increased voting and that poll workers rarely turned people away, they came away from this study with the suggestion that simply doing a better job of notifying people of voting rules might be just as effective (and cheaper!) as attempting to verify citizenship. Ultimately it appears that letting individual states decide on their own strategies would also be more effective than anything at the federal level, as different states face different challenges. Things to ponder.