10 GIFs for Stats/Data People

Nope, this isn’t a gifts post, it is a GIFs post! It occurred to me this past week that one of the things I’m fairly well known for at work and in my personal life is my absolute dedication to gif usage. I send them as often as I can get away with at work (this shows up as “frequently employs novel communication methods to get her point across” on my review, if you’re curious), and I use them pretty regularly in personal emails, particularly around Fantasy Football/Game of Thrones Fantasy League Season. As such, it is a little weird that I almost never use them on my blog unless Ben’s involved. Well, that’s changing today! Here are 10 gifs that I use (or want to remember to use) in stats and data situations. While it will never have the market share of therapeutic geometry porn, I get a kick out of them:

  1. When you’ve been sitting through a really boring presentation full of opinions and theory, and someone finally gets to some numbers and evidence:                                              
  2. When someone’s trying to walk you through some risk assessments, but you’re pretty sure they’re mucking with definitions, confused about probability and independence, and you just want to do the math yourself: 
  3. When you’ve been working really hard on a pet theory, and your data is on point, your effect sizes look good and…..no dice:  Time to run a subgroup analysis!
  4. When you see some amazing data well used and it just makes you fundamentally happy: 
  5. When someone in a meeting uses a ridiculous statistic they clearly haven’t thought through or don’t understand, and you need to send something to your coworker who you just know understands your angst: 
  6. When you’ve done every analysis possible, in every iteration possible, and you can’t find a significant correlation between two things, but then someone asks if you’re 100% sure there’s actually no relationship between the two variables and you start trying to explain p-values and all they hear is: 
  7. When you import your data into a new file type and suddenly everything just goes haywire: 
  8. When you’ve been working for hours on your SAS/R code and you’re waiting for it to run and goddammit this better work:
  9. When someone says “gee, I wish we had that data….” and you realize that you actually already pulled it together just for fun, and you’re so excited to say you have it:
  10. …..and then when you realize this makes you sound like an absolute crazy person: 

Got one I missed? Let me know!

Calling BS Read-Along Week 2: Spotting BS

Welcome to the Calling Bullshit Read-Along based on the course of the same name from Carl Bergstrom and Jevin West at the University of Washington. Each week we’ll be talking about the readings and topics they laid out in their syllabus. If you missed my intro, click here or if you want to go back to Week 1 click here.

Hey hey! Welcome back! It is week 2 of the read-along, and we’ve got some good stuff going on today. After spending last week learning what bullshit is, this week we’re going to focus on how to spot it in the wild. This is well timed because a few days ago I had a distressing discussion with a high school teacher-friend who had assigned her kids some of my Intro to Internet Science posts as a prelude to a section on fake news. She had asked them to write an essay about the topic of “fake stuff on the internet” before the discussion in class, and apparently more than a few of them said something to the effect of “that’s nice, but I’ve never heard of fake news so this is not a problem in my life”. Groooooooooooooooooooooooooooan 

Of course the problem with bullshit  is that no one warns you you’re going to see it, and no one slaps you afterwards and says “you just read that uncritically”.  With so much of the bullshit these days being spread by social media, inattentional blindness is in high gear. If 50% of study participants can’t see a gorilla when they’re trying to watch a bouncing ball, what makes you think you’re going to correctly spot bullshit while you’re trying to post pictures/think of funny status updates/score a political point against your uncle/see how your ex is doing????

The only hope is to teach yourself some tricks and remain eternally vigilant. In other words (and with apologies to Hunter S Thompson): I hate to advocate pessimism, skeptical toolkits, the GRIM test and constant paranoia, but they’ve always worked for me.

With that intro, let’s get to the readings! First up is Chapter 12 of Carl Sagan’s Demon Haunted World: Science as a Candle in the Dark: The Fine Art of Baloney Detection. I don’t think I’d read this chapter since I first read this book maybe 15 years ago or so, so it was a lot of fun to read again. Sagan starts by making a differentiation that will be familiar to those who read last week’s piece: those who believe common misconceptions vs those who promote them professionally. The example he uses is being able to contact the dead. He admits to his own longing to talk to his deceased parents  and how much appeal the belief that sometimes you can “feel” the dead has to most of us. As an atheist, he firmly believed the idea of life after death was baloney, but he gives a pass to the large number of people who believe in life after death or even those who believe they’ve had contact with the dead in their personal lives. To him, those beliefs are normal and even if you don’t think they are true or rational, they are hard to criticize. Where his wrath kicks in is those who seek to make money off of promoting this stuff and encouraging people to believe in irrational things, like psychics or mediums. He believes that undermining a society’s ability and desire to seek out independent truth and facts is one of the worst things a person can do. This isn’t just psychics doing this of course, but most of the advertising world as well, who will throw any “fact” at you if you just buy their product. In response to this constant barrage of misinformation and misdirection, he offers a “tool kit” for skeptical thinking. The whole thing is on the 4th and 5th page, but the short version is this:

  • Get independent confirmation of facts
  • Encourage debate
  • Don’t trust authority blindly
  • Come up with multiple possible explanations
  • Don’t stick to one explanation just because it is the one you thought of
  • Find something to quantify, which makes everything easier to compare
  • Make sure the whole chain of the argument works. Don’t let people mumble through part of it.
  • Prefer simple explanations (Occam’s razor)
  • Look for something falsifiable. If something can never be proven wrong, it is, well, never going to be proven wrong.
  • Keep a friendly statistician around at all times

Okay, fine, that last one’s mine, not Sagan’s, but he does come out swinging for well designed experiments. He also includes a really helpful list of the most common logical fallacies (if you want a nice online version, try this one). He concludes with a discussion of corporate advertising, sponsored research, and tobacco companies. Confusing science and skewed research helped promote tobacco for much longer than it should have stuck around.

With the stage set by Sagan, the rest of the readings include some specific tips and tricks to spot various issues with numbers and data. Some are basic plausibility checks, and some are more advanced. These are:

The “what does this number even mean” check: Last week we talked about bullshit as “unclarifiable unclarity”, and this case study is a good example of doing that with numbers. Written by West and Bergstrom, this example looks at a packet of hot cocoa that claims to be “99.9% caffeine free”.  It is not so much that the claim is implausible or even inaccurate, but that it is completely meaningless. If you’re measuring by weight, even a highly caffeinated drink will be mostly “caffeine free”. While it is likely the cocoa actually is low caffeine, this statistic doesn’t give you much insight. It is the appearance of information without any actual substance.
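To see why a by-weight percentage tells you so little, here is a minimal Python sketch of the arithmetic. The coffee figures (roughly 95 mg of caffeine in a cup weighing about 237 grams) are my own ballpark assumptions, not numbers from the case study, and the function name is just for illustration.

```python
def caffeine_free_percent(caffeine_mg, serving_grams):
    """Percent of a serving's weight that is NOT caffeine."""
    caffeine_grams = caffeine_mg / 1000
    return 100 * (1 - caffeine_grams / serving_grams)


# Assumed ballpark figures for one cup of ordinary brewed coffee.
coffee = caffeine_free_percent(caffeine_mg=95, serving_grams=237)
print(f"Brewed coffee: {coffee:.2f}% caffeine free by weight")
```

Under those assumptions, plain old coffee clears the “99.9% caffeine free” bar too, which is the whole problem with the claim.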

Fermi estimations: A technique named after Enrico Fermi, the idea is to get people to guess numbers based on the order of magnitude (ie 10 vs 100 vs 1000, etc), not the exact number. When doing rough calculations with large numbers, this can actually yield surprisingly accurate results. To play around with making these estimates, they provide a link to this game here. There’s a good book on this and how to solve problems like “how many piano tuners work in New York City?” called Guesstimation if you’re really into it.

Being able to focus in on the order of magnitude is surprisingly helpful in spotting bullshit, as is shown in the case study of food stamp fraud numbers. A news report from Fox News says that food stamp fraud costs taxpayers $70 million a year, and asked if this level of fraud means it is time to end food stamps. If we take that number at face value, is this a big deal? Using Fermi estimations, you can figure out a ballpark number for total food stamp payouts, and determine that this loss would be around 0.2% of all benefits paid. That is really close to the number you get if you dig up all the real numbers: 0.09% of all benefits paid.
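If you want to try the food stamp estimate yourself, here is a rough Python sketch. Only the $70 million figure comes from the news report; the population, enrollment share and benefit amount are round numbers I picked for illustration, and they may not be the exact inputs the case study uses.

```python
def fraud_share(fraud_dollars, population, share_enrolled, benefit_per_person_per_year):
    """Fraud as a fraction of the estimated total benefits paid out."""
    total_benefits = population * share_enrolled * benefit_per_person_per_year
    return fraud_dollars / total_benefits


# Order-of-magnitude guesses (assumptions, not looked-up figures):
# about 300 million people, about 1 in 10 enrolled, about $100 a month each.
share = fraud_share(
    fraud_dollars=70e6,
    population=300e6,
    share_enrolled=0.10,
    benefit_per_person_per_year=1200,
)
print(f"Estimated fraud share: {share:.2%}")
```

With these guesses the total comes out in the tens of billions of dollars and the fraud share lands at roughly two tenths of a percent. The point is not the exact inputs but that any sane set of guesses puts $70 million well under 1% of the program.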

GRIM testing: Edging into the deeper end of the pool, this is a neat little trick that mostly has applications for those reviewing studies with small sample sizes. GRIM stands for “granularity-related inconsistency of means” test, and it is a way of quickly and easily looking for data problems. The full explanation (plus the fascinating story of its development) is here, but here’s the quick version: if your sample size is small and you are counting whole numbers, your mean has to end in very predictable decimal places. If it doesn’t, something’s wrong. For example, a study that says 10 people reported having an average of 2.24 children is bogus. Why? Because 2.24 = total number of kids/10, and the total number of kids would have to be 22.4, which is not a whole number. There are a lot of possible explanations for this, but most of them get down to the types of sloppiness or confusion that might make you question other parts of the paper.
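Here is a minimal Python sketch of the check described above. It is my own simplified version, not the GRIM authors’ code: it assumes the data are whole-number counts and that the paper reports the mean rounded to a fixed number of decimals, and the function name is hypothetical.

```python
def grim_consistent(reported_mean, n, decimals=2):
    """Return True if some whole-number total could produce the reported mean.

    Tries the whole-number totals nearest to reported_mean * n and checks
    whether any of them, divided by n and rounded, matches the reported mean.
    Exact halfway rounding cases are ambiguous and not handled specially.
    """
    target = round(reported_mean, decimals)
    nearest_total = round(reported_mean * n)
    for total in (nearest_total - 1, nearest_total, nearest_total + 1):
        if total >= 0 and round(total / n, decimals) == target:
            return True
    return False


print(grim_consistent(2.24, 10))  # the example above: 22.4 kids is impossible
print(grim_consistent(2.20, 10))  # 22 kids across 10 people works fine
```

The first call fails the check and the second passes. The test loses its teeth as samples get bigger, since with enough people almost any two-decimal mean becomes reachable.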

By the way, if you want to leave the deep end of the pool and dive right in to the ocean, the author of the GRIM test has a SPRITE test that deals with the implications of standard deviations.

Newcomb-Benford Law: This law is one of my favorites because it was spotted back in 1881 for a reason that simply wouldn’t happen today: uneven wear on books. Back when slide rules were scarce and people had to actually look through a book of numbers to figure out what a logarithm for a certain value was, an astronomer named Simon Newcomb noticed that the books were really worn out in the first sections where the numbers that started with low digits were, and rather clean in the back where the leading digits were higher. He began to wonder if “random” numbers found in nature were more likely to start with small digits than large ones, then he just decided to declare it was so and said that the probability that the leading digit was a certain value d was equal to the base 10 log of (d+1)/d. Basically, a random number like the population of a country will have a 30% chance of starting with 1, and only a 5% chance of starting with a 9.
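The formula is easy to check for yourself. This short Python sketch just evaluates the base 10 log of (d+1)/d for each possible leading digit; the function name is my own.

```python
import math


def benford_probability(d):
    """Predicted chance that a number's leading digit is d (for d from 1 to 9)."""
    return math.log10((d + 1) / d)


for d in range(1, 10):
    print(d, f"{benford_probability(d):.1%}")
```

The nine probabilities add up to 1, and the values for 1 and 9 come out near the 30% and 5% quoted above.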

Despite having very little proof other than a worn out book, it turns out this law is actually pretty true. Machine generated data can gum up the works a bit, but natural phenomena tend to follow this rule. Benford got his name in there by pulling data from hundreds of sources: rivers, populations, physical constants, even random numbers from the pages of Reader’s Digest and categorizing them by leading digit. He got 20,000 numbers together and found that low leading digits simply WERE more common. The proposed mathematical explanations for this are not light reading no matter what they promise, but it is pretty much enough to know that it is a thing. It has been used to detect election fraud and is also used in forensic accounting, but basically all the layperson needs to know is that lists of numbers that start with high digits aren’t as plausible as those that start with low ones.
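If you want to tally leading digits the way Benford did, a sketch like this will do it. I don’t have his 20,000 numbers handy, so as a stand-in (purely my assumption for illustration) it uses the first 1,000 powers of 2, a sequence whose leading digits happen to follow the law closely. Swap in any list of positive whole numbers you want to check.

```python
from collections import Counter


def leading_digit(n):
    """First digit of a positive whole number."""
    return int(str(n)[0])


def leading_digit_shares(numbers):
    """Share of the numbers that start with each digit from 1 to 9."""
    counts = Counter(leading_digit(n) for n in numbers)
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}


# Stand-in data set: 2**0 through 2**999 (not Benford's actual data).
powers_of_two = [2 ** k for k in range(1000)]
shares = leading_digit_shares(powers_of_two)
for d in range(1, 10):
    print(d, f"{shares[d]:.1%}")
```

About 30% of the stand-in numbers start with 1, and the shares fall off steadily from there. A list where 8s and 9s lead as often as 1s is the kind of thing that should raise an eyebrow.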

And one more for the road: It is worth noting that there is actually another Benford Law that would be not-irrelevant in a course like this. Benford’s Law of Controversy states that “passion is inversely proportional to the amount of real information available”.

All of these tricks may seem like a lot to keep in mind, so if you want some practice take the advice I give to the high school students: find a cause you really care about and go read bad arguments or propaganda from the “other side”. As I’ve mentioned before, your ability to do math improves dramatically when said math helps you prove a point you feel emotionally attached to. Using this to your advantage while learning these tricks might help you get them down a little faster. Of course the problem with learning these tricks is that unless you’re entirely hypocritical, eventually you might have to turn them around on your own side, so be forewarned of that. To this day the high point of my blogging career is when my political activist brother left me a voicemail screaming “I JUST LEFT A MEETING WITH PEOPLE I LIKE MAKING A POINT I AGREE WITH BUT THEY USED BAD STATISTICS THAT I FIGURED OUT WERE WRONG AND I COULDN’T STOP STARING AT THEM AND I HAD TO CORRECT THEM IN FRONT OF EVERYONE AND THEN THEY TOLD ME IT DIDN’T MATTER AND NOW I’M MAD AT THEM AND YOU!!!!”.

So what am I taking away from this week? A few things:

  1. Even if you’re not a “numbers person”, a good sense of how numbers work can go a long way towards checking the plausibility of a claim
  2. Paranoia is just good sense if people really are out to get you. People who are trying to sell you something are not the most trustworthy sources
  3. Math tricks are fun
  4. People named Benford come up with an unusual number of bullshit related laws

I’m still checking that last one out, but it seems plausible.

And that wraps up this week! Next week we’ll be wallowing in “the natural ecology of bullshit”, so make sure you meander back next Sunday for that. Bring boots. It’ll be fun.

Week 3 is now up! Read it here.

Moral Outrage, Cleansing Fires and Reasonable Expectations

Last week, the Assistant Village Idiot forwarded me a new paper called “A cleansing fire: Moral outrage alleviates guilt and buffers threats to one’s moral identity“. It’s behind a ($40) paywall, but Reason magazine has an interesting breakdown of the study here, and the AVI does his take here. I had a few thoughts about how to think about a study like this, especially if you don’t have access to the paper.

So first, what did the researchers look at and what did they find? Using Mechanical Turk, the researchers had subjects read articles that talked about either labor exploitation in other countries or the effects of climate change. They found that personal feelings of guilt about those topics predicted greater outrage at a third-party target, a greater desire to punish that target, and that getting a chance to express that outrage decreased guilt and increased feelings of personal morality. The conclusion being reported is (as the Reason.com headline says) “Moral outrage is self-serving” and “Perpetually raging about the world’s injustices? You’re probably overcompensating.”

So that’s what’s being reported. So how do we think through this when we can’t see the paper? Here are 5 things I’d recommend:

  1. Know what you don’t know about sample sizes and effect sizes: Neither the abstract nor the write ups I’ve seen mention how large the effects reported were or how many people participated. Since it was a Mechanical Turk study I am assuming the sample size was reasonable, but the effect size is still unknown. This means we don’t know if it’s one of those unreasonably large effect sizes that should alarm you a bit or one of those small effect sizes that is statistically but not practically significant. Given that reported effect size heavily influences the false report probability, this is relevant.
  2. Remember the replication possibilities: Even if you think a study found something quite plausible, it’s important to remember that fewer than half of psychological studies end up replicating exactly as the first paper reported. There are lots of possibilities for replication, and even if the paper does replicate it may end up with lots of caveats that didn’t show up in the first paper.
  3. Tweak a few words and see if your feelings change: Particularly when it comes to political beliefs, it’s important to remember that context matters. This particular study calls to mind liberal issues, but do we think it applies to conservative issues too? Everyone has something that gets them upset, and it’s interesting to think through how that would apply to what matters to us. When the Reason.com commenters read the study article, some of them quickly pointed out that of course their own personal moral outrage was self serving. Free speech advocates have always been forthright that they don’t defend pornographers and offensive people because they like those people, but because they want to preserve free speech rights for themselves and others. Self serving moral outrage isn’t so bad when you put it that way.
  4. Assume the findings will get more generic: In addition to the word tweaks in point #3, it’s likely that subsequent replications will tone down the findings. As I covered in my Women Ovulation and Voting post, 3 studies took findings from “women change their vote and values based on their menstrual cycle” to “women may exhibit some variation in face preference based on menstrual cycle”. This happened because some parts of the initial study failed to replicate, and some caveats got added. Every study that’s done will draw another line around the conclusions and narrow their scope.
  5. Remember the limitations you’re not seeing: One of the most important parts of any paper is where the authors discuss the limitations of their own work. When you can’t read the paper, you can’t see what they thought their own limitations were. Additionally, it’s hard to tell if there were any interesting non-findings that didn’t get reported. The limitations that exist from the get go give a useful indication of what might come up in the future.

So in other words….practice reasonable skepticism. Saves time, and the fee to read the paper.

Calling BS Read-Along Week 1: Intro to BS

Welcome to the Calling Bullshit Read-Along based on the course of the same name from Carl Bergstrom and Jevin West at the University of Washington. Each week we’ll be talking about the readings and topics they laid out in their syllabus. If you missed my intro, click here.

Well hello hello and welcome to Week 1 of the Read-Along! Before we get started I wanted to give a shout out to the Calling Bullshit Twitter feed, and not just because they informed me yesterday that they are jealous of my name. They post some useful stuff over there, so check them out.

We’re kicking off this thing with an Introduction to Bullshit.   Now you may think you and bullshit are already well acquainted, but it never hurts to set some definitions up front. The first reading is a quick blog post that explains what is commonly known as either “Brandolini’s Law” or “The Bullshit Asymmetry Principle”, which states that “The amount of energy needed to refute bullshit is an order of magnitude bigger than to produce it”. 

Even if you didn’t know there was a name for this, you know the feeling: you’re in a political discussion when someone decides to launch into something absolutely crazy about “the other side”. Feeling defensive, you look up whatever it is they’re talking about to find evidence to refute it. Even with a smartphone this can take a few minutes. You find one source disagreeing with them, they declare it biased. You find another, it’s not well sourced enough. One more, from a credible person/publication who is normally on “their side” aaaaand…they drop it with a shrug and mumble that it wasn’t that important anyway. That’s 5-10 minutes of your life gone over something it took them less than 30 seconds to blurt out. Ugh.

Okay, so we all know it when we see it…..but what is bullshit? The obvious answer is to go with the precedent established in Jacobellis vs Ohio and merely declare that “I know it when I see it“, but somehow I doubt that will get you full credit on the exam. If we’re going to spend a whole semester looking at this, we’re going to have to get more specific. Luckily since bullshit is not a new phenomenon, there’s actually some pre-existing literature on the topic. One of the better known early meditations on the topic is Harry Frankfurt’s essay from 1986, which is simply called “On Bullshit“. For all my readers who are pedantic word nerds (and I know there’s more than one of you!) I recommend this, if only for the multiple paragraphs examining whether “humbug!” and “bullshit!” are interchangeable or not. That discussion led me to the transcript of the 1980 lecture “On the Prevalence of Humbug” by Max Black, which is not in the course but also worth a read.

Now “humbug” isn’t used commonly enough for me to have a real opinion about what it means, but Frankfurt uses it to set an important stage: “humbug” is not just about misrepresenting something, but also about your reasons for doing the misrepresenting. In Black’s essay, he asserts that “humbug” misrepresentations are not actually so much about trying to get someone to believe something untrue as about making yourself look better. This isn’t the “yeah I have a girlfriend, but she’s in Canada” version of looking better either, but a version of looking better where you come across as more passionate, more dedicated and more on board with your cause than anyone else. The intent is not to get someone to believe that what you are saying is literal truth, but to leave them with a certain impression about your feelings on some matter, and about you in general. In other words, there’s an inherently social component to the whole thing.

After the humbug meditations, Frankfurt moves into the actual term bullshit and how it compares to regular old lying. The social aspect still remains, he claims, as we possibly would remain friends with a bullshitter, but not a liar. In Frankfurt’s view, a lie seeks to alter one particular fact, while bullshit seeks to alter the whole landscape. A liar also has some idea about where truth is and is trying to veer away from it, but bullshit is just picking and choosing facts, half facts, and lies as they fit or don’t fit a purpose. In other words, bullshit is not necessarily an intent to subvert truth, but an indifference to truth. He also looks at why bullshit has been proliferating: we have more chances to communicate, and more topics to communicate about. Even if our percentage of bullshit stays steady, today’s communication overload means there will be more of it, and the number of complex topics we’re confronted with encourages us to bullshit even further. The essay ends on a fairly philosophical note, concluding that bullshit proliferates the more we doubt that we can ever know the objective reality of anything. Well then.

I liked the essay overall, as I hadn’t really thought of the social component of bullshit in these terms before. The idea that there’s some sort of philosophical underpinning to the whole endeavor is a little interesting as well. But bullshit in the regular world has been around for forever, and we mostly know how to cope with it. What happens when it moves into academia or other “higher” sources? That’s the subject of the next essay “Deeper into Bullshit” by GA Cohen. Cohen takes issue with Frankfurt’s focus on the intent of the talker, and wants to focus on the idea of things that are pure nonsense. In his world, it is not the lying/bluffing/indifference to truth part that is the essence of bullshit, but rather the lack of sense or “unclarifiable unclarity”. You know, the famous “if you can’t dazzle them with brilliance, baffle them with bullshit” line of thought. Cohen also separates producers of this kind of bullshit into two subcategories: those who aim to do this, and those who just happen to do this a lot. Fantastically, Cohen includes a little chart to clarify his version of bullshit vs Frankfurt’s:

[Image: Cohen’s chart comparing his version of bullshit with Frankfurt’s]

So academia gets its own special brand of bullshit, but we’re not done yet. Going even further into this topic, we get Eubanks and Schaeffer’s “A kind word for bullshit: The problem of academic writing“. Starting with the scholarly work of one Dave Barry, they point out the deep ambivalence about bullshit present in many parts of the academy. On the one hand, academics are acutely aware of the problem of bullshit and the corrosive nature of ignorance, but on the other hand, they are deeply afraid that much of what they produce may actually be bullshit. To quote Barry:

Suppose you are studying Moby-Dick. Anybody with any common sense would say that Moby-Dick is a big white whale, since the characters in the book refer to it as a big white whale roughly eleven thousand times. So in your paper, you say Moby-Dick is actually the Republic of Ireland. . . . If you can regularly come up with lunatic interpretations of simple stories, you should major in English.

This of course is especially common in the humanities and social sciences due to physics envy.

Eubanks and Schaeffer go on to split bullshitters into two categories of their own: “prototypical” bullshitters, like the original type Frankfurt described, and academic bullshitters. Academic bullshit does, of course, share some qualities with prototypical bullshit, namely that it aims to enhance the reputation of the author at the expense of clear communication. They point out that this starts infecting academics while they are still students, when they have every incentive to make themselves look good to the professor, and barely any incentives to make themselves intelligible to the average person.

So with these four essays, what are my major takeaways?

  1. Bullshit must be understood in a social context. To put it on the same level as “lying”  is to miss a major motivation.
  2. Due to point #1, challenging bullshit can take tremendous effort. You not only have to challenge the lack of truth, but also might be undermining someone’s sense of self-importance. That second part tends to make the first part look like a cake walk.
  3. Academia, which should be one of our primary weapons against bullshit, has succeeded in creating its own special breeding ground for bullshit.
  4. Undoing point #3 faces all the challenges previously stated in #2, with two edits to the sentence (swap “truth” for “clarity” and add “and career” at the end): You not only have to challenge the lack of clarity, but also might be undermining someone’s sense of self-importance and career.
  5. I need to start using the word “humbug” more often.

The points about academics are particularly well taken, as there seems to be a common misconception that intelligence inoculates you against bullshit and self deception. When I give my talk about internet science to high school kids, it’s almost always AP classes and I have to REALLY emphasize the whole “don’t get cocky kid” point. That’s why I love showing them the motivated numeracy study I talk about here.  They are always visibly alarmed that high math ability actually makes you more prone to calculation errors if making an error will confirm a pre-existing belief you find important. As we examine bullshit and how to refute it, it’s important to note that preventing yourself from spreading bullshit is a great first step.

That does it for this week. See you next week, when we move on to “Spotting Bullshit”!

Week 2 is up! Go straight to it here.

Surveys, Privacy and the Usefulness of Lies

I’ve been thinking a lot about surveys this week (okay, I’m boring, I think a lot about them every week), but this week I have a particularly good reason. A few years ago, I wrote about a congressman named Daniel Webster and his proposal to eliminate the American Community Survey. I’ve been a little fascinated with the American Community Survey ever since, and last week I opened my mailbox to discover that we’d been selected to take it this year.

For those unfamiliar with the American Community Survey, it’s an ongoing survey by the Census Bureau that asks people lots of information about their houses, income, disability and employment status. Almost every time you see a chart that shows you “income by state” or “which county is the richest” or “places in the US with the least internet access”, the raw data came from the American Community Survey. This obviously provides lots of good and useful information to many people and businesses, but it’s not without its critics. People like Congressman Webster object to the survey for reasons like government overreach, the cost and possible privacy issues with the mandatory* survey.

While I’ve written about this for years, I actually had never taken it so I was fairly excited to see what all the fuss was about. Given the scrutiny that’s been placed on the cost, I was interested to see that the initial mailing strongly encouraged me to take the survey online (using a code on the mailing) and cited all the cost savings associated with me doing so. Filling out surveys online almost certainly reduces cost, but in this day and age it also tends to increase the possible privacy issues. While the survey doesn’t ask for sensitive information like social security numbers, it does ask lots of detailed information about salary, work status, the status of your house, mortgage payments and electricity usage. I wouldn’t particularly want a hacker getting a hold of this, nor would most others I suspect.

I don’t particularly know how the Census Bureau should proceed with this survey or what Congress will decide to do, but it did get me thinking about privacy issues with online surveys and how to balance the need for data with these concerns. I work in an industry (healthcare) that is actually required by regulations to get feedback on how we’re doing and make changes accordingly, yet we also must balance privacy concerns and people who don’t want to give us information. Many people who have no problem calling you up and lecturing you about everything that went wrong while they were in the hospital absolutely freeze when you ask them to fill out a survey: they find it invasive. It’s a struggle. One of my favorite post election moments actually reflected this phenomenon, in the form of a Chicago Tribune letter to the editor from a guy who said he’d never talked to a pollster in the run up to the election. His issue? He hates pollsters because they want to capture your every thought AND they never listen to people like him. While many people like and appreciate services that reflect their perspective, are friendlier, more usable, and more tailored to their needs, many of us don’t want to be the person whose data gets taken to get there. For good reason too: our privacy is disappearing at an alarming rate, and data hacks are pretty much weekly news.

So how do survey purveyors get the trust back? One of the newest frontiers in this whole balancing act is actually coming from Silicon Valley, where tech companies are as desperate for user data as users are concerned about keeping it private. They have been advancing something called “differential privacy”, or the quest to use statistical techniques to render a data set collectively useful while rendering individual data points useless and unidentifiable. So how would this work?

My favorite of the techniques is something called “noise injection” where fake results are inserted into the sample at a known level. For example: a survey asks you if you’ve ever committed a crime. Before you answer, you are told to flip a coin. If the coin says heads, you answer truthfully. If the coin says tails, you flip the coin again. If the coin says heads this time, you say “yes, I’ve committed a crime”. If it says tails, you say you haven’t. When the researchers go back in, they can take out the predicted fake answers and find the real number. For example, let’s say you started with 100 people. At the end of the test, you find that 35 say they committed a crime, and 65 say they haven’t. About 50 of the 100 answered based on the second coin flip, so you know that 25 of those 35 should have answered “yes” due to coin flip, which leaves 10 who really said “yes”. You can also subtract 25 from the 65 to get 40 who really said “no”.

The researchers now know the approximate real percentage of those who have committed a crime (20% in this example), but they can’t know if any individual response is true or not. This technique has possible holes in it (what if people don’t follow instructions?) and you have to cut your sample size in half, but just asking people to admit to a crime directly with a “we promise not to share your data” actually doesn’t work so well either. Additionally, the beauty of this technique is that it works better the larger your sample is.
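
For anyone who wants to play with the coin flip scheme, here is a minimal Python sketch of it. The function names, the 20% true rate, the sample size and the random seed are all my own assumptions for illustration, not anything a real survey uses. It just shows that subtracting the expected fake answers recovers the worked example above, and that the estimate lands close to the real rate when the sample is large.

```python
import random


def randomized_response(truth: bool, rng: random.Random) -> bool:
    """Return the answer a respondent gives under the two-coin-flip scheme."""
    if rng.random() < 0.5:
        # First flip is heads: answer truthfully.
        return truth
    # First flip is tails: the second flip decides the answer (heads = yes).
    return rng.random() < 0.5


def estimate_true_rate(yes_count: int, n: int) -> float:
    """Remove the expected fake answers and return the estimated real rate."""
    forced_yes = n / 4   # expected fake "yes" answers
    truthful = n / 2     # expected number of truthful answers
    return (yes_count - forced_yes) / truthful


# The worked example: 100 people, 35 of them say "yes".
print(estimate_true_rate(35, 100))

# A simulated survey with an assumed true rate of 20%.
rng = random.Random(0)
n = 100_000
answers = [randomized_response(rng.random() < 0.2, rng) for _ in range(n)]
print(estimate_true_rate(sum(answers), n))
```

With only 100 people the coin flips themselves add a fair amount of noise, which is why the estimate gets more trustworthy as the sample grows.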

Going forward we may see more efforts like this, even within the same survey or data set. While 20 years ago people may have been annoyed to fill out a section of a survey with fake data, today’s more privacy conscious consumers may be okay with it if it means their responses can’t be tied to them directly. I don’t know that the Census Bureau would ever use anything like this, but as we head towards the 2020 census, there will definitely be more talk about surveys, privacy and methodology.

*The survey is mandatory, but it appears the Census Bureau is prohibited by Congress from actually enforcing this.

Calling BS Read Along: Series Introduction

Well hello hello! A few weeks ago, someone forwarded me a syllabus for a new class being offered at the University of Washington this semester: Info 198: Calling Bullshit. The synopsis is simple: “Our world is saturated with bullshit. Learn to detect and defuse it.” Obviously I was intrigued. The professors (Carl T. Bergstrom and Jevin West) have decided to put their entire syllabus online along with links to weekly readings, and are planning to add some of the lectures when they conclude the semester. Of course this interested me greatly, and I was excited to see that they pointed to some resources I was really familiar with, and some I wasn’t.

Given that I’m in the middle of a pretty grueling semester of my own, I thought this might be a great time to follow along with their syllabus, week by week, and post my general thoughts and observations as I went along. I’m very interested in how classes like this get thought through and executed, and what topics different people find critical in sharpening their BS detectors. Hopefully I’ll find some new resources for my own classroom talks, and see if there’s anything I’d add or subtract.

I’ll start with their introduction next week, but I’ll be following the schedule of lectures posted in the syllabus for each week:

  1. Introduction to bullshit
  2. Spotting bullshit
  3. The natural ecology of bullshit
  4. Causality
  5. Statistical traps
  6. Visualization
  7. Big data
  8. Publication bias
  9. Predatory publishing and scientific misconduct
  10. The ethics of calling bullshit
  11. Fake news
  12. Refuting bullshit

I’ll be reading through each of the readings associated with each lecture, summarizing, adding whatever random thoughts I have, and making sure the links are posted. I’ll be adding a link for the next week’s reading as well. Anyone who’s interested can of course read along and add their own commentary, or just wait for my synopsis.

Happy debunking! (And go straight to Week 1 here)

Immigration, Poverty and Gumballs Part 2: The Amazing World of Gumball

Welcome to “From the Archives”, where I dig up old posts and see what’s changed in the years since I originally wrote them.

I’ve had a rather interesting couple weeks here in my little corner of the blogosphere. A little over a year ago, a reader asked me to write a post about a video he had seen kicking around that used gumballs to illustrate world poverty. With the renewed attention to immigration issues over the last few weeks, that video apparently went viral and brought my post with it. My little blog got an avalanche of traffic and with it came a new series of questions, comments and concerns about my original post. The comments on the original post closed after 90 days, so I was pondering if I should do another post to address some of the questions and concerns I was being sent directly. A particularly long and thoughtful comment from someone named bluecat57 convinced me that was the way to go, and almost 2500 words later, here we are. As a friendly reminder, this is not a political blog and I am not out to change your mind on immigration to any particular stance. I actually just like talking about how we use numbers to talk about political issues and the fallacies we may encounter there.

Note to bluecat57: A lot of this post will be based on various points you sent me in your comment, but I’m throwing a few other things in there based on things other people sent me, and I’m also heavily summarizing what you said originally. If you want me to post your original comment in the comments section (or if you want to post it yourself) so the context is preserved, I’m happy to do so.

Okay, with that out of the way, let’s take another look at things!

First, a quick summary of my original post: The original post was a review of a video by a man named Roy Beck. The video in question (watch it here) was a demonstration centered around whether or not immigration to the US could reduce world poverty. In it, he pulls out a huge number of gumballs, with each one representing 1 million poor people in the world, defined by the World Bank’s cutoff of “living on less than $2/day” and demonstrates that the number of poor people is growing faster than we could possibly curb through immigration. The video is from 2010. My criticisms of the video fell in to 3 main categories:

  1. The number of poor people was not accurate. I believe it may have been at one point, but since the video is 7 years old and world poverty has been falling rapidly, the numbers are now wildly out of date. I don’t blame Beck for his video aging, but I do get irritated that his group continues to post it with no disclaimer.
  2. That the argument the video starts with “some people say that mass immigration in to the United States can help reduce world poverty” was not a primary argument of pro-immigration groups, and that using it was a strawman.
  3. That people liked, shared and found this video more convincing than they should have because of the colorful/mathematical demonstration.

My primary reason for posting about the video at all was actually point #3, as talking about how mathematical demonstrations can be used to address various issues is a bit of a hobby of mine.  However, it was my commentary on #1 and #2 that seemed to attract most of the attention. So let’s take a look at each of my points, shall we?

Point 1: Poverty measures, and their issues: First things first: when I started writing the original post and realized I couldn’t verify Beck’s numbers, I reached out to him directly through the NumbersUSA website to ask for a source for them. I never received a response. Despite a few people finding old sources that back Beck up, I stand by the assertion that those numbers are not currently correct as he cites them. It is possible to find websites quoting those numbers from the World Bank, but as I mentioned previously, the World Bank itself does not give those numbers. While those numbers may have come from the World Bank at some point, he’s out of date by nearly a decade, and it’s a decade in which things have rapidly changed.

Now this isn’t necessarily his fault. One of the reasons Beck’s numbers were rendered inaccurate so quickly was because reducing extreme world poverty has actually been a bit of a global priority for the last few years. If you were going to make an argument about the number of people living in extreme poverty going up, 2010 was a really bad year to make that argument:

[Chart: world population in extreme poverty, absolute numbers]

Link to source

Basically he made the argument in the middle of an unprecedented fall in world poverty. Again, not his fault, but it does suggest why he’s not updating the video. The argument would seem a lot weaker starting out with “there’s 700 million desperately poor people in the world and that number falls by 137,000 people every day”.

Moving on though…is the $2/day measure of poverty a valid one? Since the World Bank and Beck both agreed to use it, I didn’t question it much up front, but at the prompting of commenters, I went looking. There’s an enormously helpful breakdown of global poverty measures here, but here’s the quick version:

  1. The $2/day metric is a measure of consumption, not income and thus is very sensitive to price inflation. Consumption is used because it (attempts to) account for agrarian societies where people may grow their own food but not earn much money.
  2. Numbers are based on individual countries self-reporting. This puts some serious holes in the data.
  3. The definition is set based on what it takes to be considered poor in the poorest countries in the world. This caused its own problems.

That last point is important enough that the World Bank revised its calculation method in 2015, which explains why I couldn’t find Beck’s older numbers anywhere on the World Bank website. Prior to that, it set the benchmark for extreme poverty based off the average poverty line used by the 15 poorest countries in the world. The trouble with that measure is that someone will always be the poorest, and therefore we would never be rid of poverty. This is what is known as “relative poverty”.

Given that one of the Millennium Development Goals focused on eliminating world poverty, the World Bank decided to update its estimates to simply adjust for inflation. This shifts the focus to absolute poverty, or the number of people living below a single dollar amount. Neither method is perfect, but something had to be picked.

It is worth noting that country self reports can vary wildly, and asking the World Bank to put together a single number is no small task. Whatever numbers are presented, even small revisions to definitions could cause huge changes. Additionally, none of these numbers address country stability, and it is quite likely that unstable countries with violent conflicts won’t report their numbers. It’s also unclear to me where charity or NGO activity is counted (likely it varies by country).

Interestingly, Politifact looked in to a few other ways of measuring global poverty and found that all of them have shown a reduction in the past 2 decades, though not as large as the World Bank’s.  Beck could change his demonstration to use a different metric, but I think the point remains that if his demonstration showed the number of poor people falling rather than rising, it would not be very compelling.

Edit/update: It’s been pointed out to me that at the 2:04 mark he changes from using the $2/day standard to “poorer than Mexico”, so it’s possible the numbers after that timepoint do actually work better than I thought they would. It’s hard to tell without him giving a firm number. For reference, it looks like in 2016 the average income in Mexico was $12,800/year. In terms of a poverty measure, the relative rank of one country against others can be really hard to pin down. If anyone has more information about the state of Mexico’s relative rank in the world, I’d be interested in hearing it.

Point 2: Is it a straw man or not? When I posted my initial piece, I mentioned right up front that I don’t debate immigration that often. Thus, when Beck started his video with “Some people say that mass immigration in to the United States can help reduce world poverty. Is that true? Well, no it’s not. And let me show you why…..” I took him very literally. His demonstration supported that first point, so that’s what I focused on. When I mentioned that I didn’t think that was the primary argument being made by pro-immigration groups, I had to go to their mission pages to see what their arguments actually were. None mentioned “solving world poverty” as a goal. Thus, I called Beck’s argument a straw man, as it seemed to be refuting an argument that wasn’t being made.

Unsurprisingly, I got a decent amount of pushback over this. Many people far more involved in the immigration debates than I am informed me this is exactly what pro-immigration people argue, if not directly then indirectly. One of the reasons I liked bluecat57’s comment so much is that he gave perhaps the best explanation of this. To quote directly from one message:

“The premise is false. What the pro-immigration people are arguing is that the BEST solution to poverty is to allow people to immigrate to “rich” countries. That is false. The BEST way to end poverty is by helping people get “rich” in the place of their birth.

That the “stated goals” or “arguments” of an organization do not promote immigration as a solution to poverty does NOT mean that in practice or in common belief that poverty reduction is A solution to poverty. That is why I try to always clearly define terms even if everyone THINKS they know what a term means. In general, most people use the confusion caused by lack of definition to support their positions.”

Love the last sentence in particular, and I couldn’t agree more. My “clear definitions” tag is one of my most frequently used for a reason.

In that spirit, I wanted to explain further why I saw this as a straw man, and what my actual definition of a straw man is. Merriam Webster defines a straw man as “a weak or imaginary argument or opponent that is set up to be easily defeated”. If I had ever heard someone arguing for immigration say “well we need it to solve world poverty”, I would have thought that was an incredibly weak argument, for all the reasons Beck goes in to, i.e. there are simply more poor people than can ever reasonably be absorbed by one (or even several) developed countries. Given this, I believe (though haven’t confirmed) that every developed/rich country places a cap on immigration at some point. Thus most of the debates I hear and am interested in are around where to place that cap in specific situations and what to do when people circumvent it. The causes of immigration requests seem mostly debated when it’s in a specific context, not a general world poverty one.

For example, here’s the three main reasons I’ve seen immigration issues hit the news in the last year:

  1. Illegal immigration from Mexico (too many mentions to link)
  2. Refugees from violent conflicts such as Syria
  3. Immigration bans from other countries

Now there are a lot of issues at play with all of these, depending on who you talk to: general immigration policy, executive power, national security, religion, international relations, the feasibility of building a border wall, the list goes on and on. Poverty and economic opportunity are heavily at play for the first one, but so is the issue of “what do we do when people circumvent existing procedures”. In all cases if someone had told me that we should provide amnesty/take in more refugees/lift a travel ban for the purpose of solving world poverty, I would have thought that was a pretty broad/weak argument that didn’t address those issues specifically enough. In other words my characterization of this video as a straw man argument was more about its weakness as a pro-immigration argument than a knock against the anti-immigration side. That’s why I went looking for the major pro-immigration organizations’ official stances….I actually couldn’t believe they would use an argument that weak. I was relieved when I didn’t see any of them advocating this point, because it’s really not a great point. (Happy to update with examples of major players using this argument if you have them, btw).

In addition to the weaknesses of this argument as a pro-immigration point, it’s worth noting that from the “cure world poverty” side it’s pretty weak as well.  I mentioned previously that huge progress has been made in reducing world poverty, and the credit for that is primarily given to individual countries boosting their GDP and reducing their internal inequality. Additionally, even given the financial situation in many countries, most people in the world don’t actually want to immigrate.  This makes sense to me. I wouldn’t move out of New England unless there was a compelling reason to. It’s home. Thus I would conclude that helping poor countries get on their feet would be a FAR more effective way of eradicating global poverty than allowing more immigration, if one had to pick between the two. It’s worth noting that there’s some debate over the effect of healthy/motivated people immigrating and sending money back to their home country (it drains the country of human capital vs it brings in 3 times more money than foreign aid), but since that wasn’t demonstrated with gumballs I’m not wading in to it.

So yeah, if someone on the pro-immigration side says mass immigration can cure world poverty, go ahead and use this video….keeping in mind of course the previously stated issue with the numbers he quotes. If they’re using a better or more country or situation specific argument though (and good glory I hope they are), then you may want to skip this one.

Now this being a video, I am mindful that Beck has little control over how it gets used and thus may not be at fault for possible straw-manning, any more than I am responsible for the people posting my post on Twitter with Nicki Minaj gifs (though I do love a good Nicki Minaj gif).

Point 3: The Colorful Demonstration: I stand by this point. Demonstrations with colorful balls of things are just entrancing. That’s why I’ve watched this video like 23 times:

Welp, this went on a little longer than I thought. Despite that I’m sure I missed a few things, so feel free to drop them in the comments!

Does Popularity Influence Reliability? A Discussion

Welcome to the “Papers in Meta Science” where we walk through published papers that use science to scrutinize science. At the moment we’re taking a look at the paper “Large-Scale Assessment of the Effect of Popularity on the Reliability of Research” by Pfeiffer and Hoffman. Read the introduction here, and the methods and results section here.

Well hi! Welcome back to our review of how scientific popularity influences the reliability of results. When last we left off we had established that the popularity of protein interactions did not affect the reliability of results for pairings initially, but did affect the reliability of results involving those popular proteins. In other words, you can identify the popular kids pretty well, but figuring out who they are actually connected to gets a little tricky. People like being friends with the popular kids.

Interestingly, the overall results showed a much stronger effect for the “multiple testing hypothesis” than the “inflated error effect” hypothesis, meaning that many of the false positive results seem to be coming from the extra teams running many different experiments and getting a predictable number of false positives. More overall tests = more overall false positives. This effect was 10 times stronger than the inflated error effect, though that was still present.
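
To make the “more overall tests = more overall false positives” arithmetic concrete, here is a small Python sketch. It is not the paper’s method. The function names are hypothetical, and it assumes a 5% false positive rate per test, independent tests, and no real effects to find, all of which are simplifications I picked for illustration.

```python
def expected_false_positives(num_tests: int, alpha: float = 0.05) -> float:
    """Expected number of false positives when no real effect exists."""
    return num_tests * alpha


def chance_of_any_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of num_tests independent tests is a false positive."""
    return 1 - (1 - alpha) ** num_tests


# Compare a protein almost nobody studies with a very popular one.
for num_tests in (1, 10, 100):
    print(
        num_tests,
        round(expected_false_positives(num_tests), 2),
        round(chance_of_any_false_positive(num_tests), 3),
    )
```

Nobody has to do anything wrong for this to happen. The count of false positives climbs simply because more teams are running more experiments.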

So what should we do here? Well, a few things:

  1. Awareness: Researchers should be extra aware that running lots of tests on a new and interesting protein could result in less accurate results.
  2. Encourage novel testing: Continue to encourage people to branch out in their research as opposed to giving more funding to those researching more popular topics.
  3. Informal research wikis: This was an interesting idea I hadn’t seen before: use the Wikipedia model to let researchers note things they had tested that didn’t pan out. As I mentioned when I reviewed the Ioannidis paper, there’s not an easy way of knowing how many teams are working on a particular question at any given time. Setting up a less formal place for people to check what other teams were doing may give researchers better insight in to how many false positives they can expect to see.

Overall, it’s also important to remember that this is just one study and that findings in other fields may be different. It would be interesting to see a similar thing repeated in a social science type field or something similar to see if public interest makes results better or worse.

Got another paper you’re interested in? Let me know!

The White Collar Paradox

A few weeks back I blogged about what I am now calling “The Perfect Metric Fallacy”. If you missed it, here’s the definition:

The Perfect Metric Fallacy: the belief that if one simply finds the most relevant or accurate set of numbers possible, all bias will be removed, all stress will be negated, and the answer to complicated problems will become simple, clear and completely uncontroversial.

As I was writing that post, I realized that there was an element I wasn’t paying enough attention to. I thought about adding it in, but upon further consideration, I realized that it was big enough that it deserved its own post. I’m calling it “The White Collar Paradox”. Here’s my definition:

The White Collar Paradox: Requiring that numbers and statistics be used to guide all decisions due to their ability to quantify truth and overcome bias, while simultaneously only giving attention to those numbers created to cater to one’s social class, spot in the workplace hierarchy, education level, or general sense of superiority.

Now of course I don’t mean to pick on just white collar folks here, though almost all offenders are white collar somehow. This could just as easily have been called the “executive paradox” or the “PhD paradox” or lots of other things. I want to be clear who this is aimed at because plenty of white collar workers have been on the receiving end of this phenomenon as well, in the form of their boss writing checks to expensive consulting firms just to have those folks tell them the same stuff their employees did, only on prettier paper and using more buzzwords. Essentially, anyone who prioritizes numbers that make sense to them out of their own sense of ego despite having the education to know better is a potential perpetrator of this fallacy.

Now of course wanting to understand the problem is not a bad thing, and quite frequently busy people do not have the time to sort through endless data points. Showing your work gets you lots of credit in class, but in front of the C-suite it loses everyone’s attention in less than 10 seconds (ask me how I know this). There is a value in learning how to get your message to match the interests of your audience. However, if the audience really wants to understand the problem, sometimes they will have to get a little uncomfortable. Sometimes the problem is arising precisely because they overlooked something that’s not very understandable to them, and preferring explanations that cater to what you already know is just using numbers to pad the walls of your echo chamber.

A couple other variations I’ve seen:

  1. The novel metric preference: As in “my predecessor didn’t use this metric, therefore it has value”.
  2. The trendy metric: “Prestigious institution X has promoted this metric, therefore we also need this metric”.
  3. The “tell me what I want to hear” metric: Otherwise known as the drunk with a lamp post…using data for support, not illumination.
  4. The emperor has no clothes metric: The one that is totally unintelligible but stated with confidence, and no one questions it.

That last one is the easiest to compensate for. For every data set I run, I always run it by someone actually involved in the work. The number of data problems that can be spotted by almost any employee if you show them your numbers and say “hey, does this match what you see every day?” is enormous. Even if there are no problems with your data, those employees can almost always tell you where your balance metrics should be, though normally that comes in the form of “you’re missing the point!” (again, ask me how I know this).

For anyone who runs workplace metrics, I think it’s important to note that every person in the organization is going to see the numbers differently and that’s incredibly valuable. Just like high level execs specialize in forming long term visions that day to day workers might not see, those day to day workers specialize in details the higher ups miss. Getting numbers that are reality checked by both groups isn’t easy, but your data integrity will improve dramatically and the decisions you can make will ultimately improve.

Hans Rosling and Some Updates

I’ve been a bit busy with an exam, snow shoveling and a sick kiddo this week, so I’m behind on responding to emails and a few post requests I’ve gotten. Bear with me.

I did want to mention that Hans Rosling died, which is incredibly sad. If you’ve never seen his work with statistics or his amazing presentations, please check them out. His one-hour “Joy of Stats” documentary is particularly recommended. For something a little shorter, try his famous “washing machine” TED talk.

I also wanted to note that due to some recent interest, I have updated my “About” page with a little bit more of the story about how I got in to statistics in the first place. I’ve mentioned a few times that I took the scenic route, so I figured I’d put the story all in one place. Click on over and find out how the accidental polymath problems began.

As an added bonus, there are also some fun illustrations from my awesome cousin Jamison, who was kind enough to make some for me.  This is my favorite pair:

[Images: gpd_true_positive, gpd_false_positive]

See more of his work here.

Finally, someone sent me this syllabus for a new class called “Calling Bullshit” that’s being offered at the University of Washington this semester. I started reading through it, but I’m thinking it might be more fun as a whole series. It covers some familiar ground, but they have a few topics I haven’t talked about much on this blog. I’ll likely start that up by the end of February, so keep an eye out for that.