# Calling BS Read-Along Week 5: Statistical Traps and Trickery

Welcome to the Calling Bullshit Read-Along based on the course of the same name from Carl Bergstrom and Jevin West at the University of Washington. Each week we’ll be talking about the readings and topics they laid out in their syllabus. If you missed my intro and want the full series index, click here or if you want to go back to Week 4 click here.

Well hi there! Welcome to week 5 of the Calling Bullshit Read-Along. An interesting program note before we get started: there is now a “suitable for high school students” version of the Calling Bullshit website here. Same content, less profanity.

This week we dive into a topic that could be its own semester-long class: “Statistical Traps and Trickery”. There are obviously a lot of ways of playing around with numbers to make them say what you want, so there’s not just one topic for the week. The syllabus gives a fairly long list of tricks, and the readings hit some highlights and link to some cool visualizations. One at a time these are:

Simpson’s Paradox: This is a bit counterintuitive, so this visualization of the phenomenon is one of the more helpful ones I’ve seen. Formally, Simpson’s paradox is when “the effect of the observed explanatory variable on the explained variable changes directions when you account for the lurking explanatory variable”. Put more simply, it is when the numbers look like there is bias in one direction, but when you control for another variable the bias goes in the other direction. The most common real life example of this is when UC Berkeley got sued for discriminating against women in grad school admissions, only to have the numbers show they actually slightly favored women. While it was true they admitted a higher proportion of male applicants overall, when you controlled for individual departments a higher proportion of women were getting into those programs. Basically, a few departments with lots of female applicants were doing most of the rejecting, and their numbers were overshadowing the other departments. If you’re still confused, check out the visual; it’s much better than words.
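If you would rather see the reversal in numbers, here is a minimal sketch. The two departments and all of the counts are made up for illustration; they are not the real Berkeley figures. The names `admissions`, `rate`, and `overall_rate` are just my own labels.

```python
# Hypothetical numbers, NOT the real Berkeley data.
# Each entry is (applicants, admitted).
admissions = {
    "Dept A": {"men": (800, 480), "women": (100, 65)},   # easy to get into
    "Dept B": {"men": (200, 40), "women": (900, 225)},   # hard to get into
}


def rate(applicants, admitted):
    """Admission rate for one group in one department."""
    return admitted / applicants


def overall_rate(group):
    """Admission rate for a group with all departments pooled together."""
    applicants = sum(dept[group][0] for dept in admissions.values())
    admitted = sum(dept[group][1] for dept in admissions.values())
    return admitted / applicants


# Within each department, women are admitted at the higher rate...
for name, dept in admissions.items():
    print(name, {group: round(rate(*dept[group]), 2) for group in dept})

# ...but pooled together, men are admitted at the higher rate.
print("Overall", {group: round(overall_rate(group), 2) for group in ("men", "women")})
```

In this toy setup most of the women apply to the department that rejects most applicants, which is all it takes to flip the pooled comparison.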

The Will Rogers Phenomenon: I love a good pop culture reference in my statistics (see here and here), and thus have a soft spot for the Will Rogers Phenomenon. Based on the quote “When the Okies left Oklahoma and moved to California, they raised the average intelligence level in both states”, this classic paper points to an interesting issue raised by improvements in diagnostic technology. In trying to compare outcomes for cohorts of lung cancer patients from different decades, Feinstein realized that new imaging techniques were resulting in more patients being classified as having severe disease. While these patients were actually more severe than their initial classification, they were also less severe than the typical patient in their new classification. In other words, the worst grade 1 patients were now the best grade 3 patients, making it look like survival rates were improving for both the grade 1 group (who lost their highest risk patients) and the grade 3 group (who gained less severe patients). Unfortunately for all of us, none of this represented a real change in treatment; it was just numerical reshuffling.
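The reshuffling is easy to reproduce with a handful of invented patients. This is only a sketch: the survival times below are hypothetical and are not taken from Feinstein's paper, and the variable names are my own.

```python
from statistics import mean

# Hypothetical survival times in months; not data from the paper.
old_grade_1 = [60, 55, 50, 20, 18]
old_grade_3 = [10, 8, 6]

# Assume better imaging moves the two sickest grade 1 patients into grade 3.
reclassified = [20, 18]
new_grade_1 = [months for months in old_grade_1 if months not in reclassified]
new_grade_3 = old_grade_3 + reclassified

print("Grade 1 average:", mean(old_grade_1), "->", mean(new_grade_1))
print("Grade 3 average:", mean(old_grade_3), "->", mean(new_grade_3))

# Nobody actually lived any longer, so the average over everyone is unchanged.
print("Everyone:", mean(old_grade_1 + old_grade_3), "->", mean(new_grade_1 + new_grade_3))
```

Both group averages go up while the overall average stays exactly where it was, which is the whole trick.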

Lead time bias: Also mentioned in the reading above, this is the phenomenon of “improving” survival rates simply by catching diseases earlier. For example, let’s say you were unfortunate enough to get a disease that would absolutely kill you 10 years from the day you got it. If you get diagnosed 8 years in, it looks like you survived for 2 years. If everyone panics about it and starts testing everyone for this disease, they might start catching it earlier. If improved testing now means the disease is caught at the 4 year mark instead of the 8 year mark, it will appear that survival has improved by 4 years, from 2 years to 6. In some cases, though, this doesn’t represent a real increase in the length of survival, just an increase in the length of time you knew about it.
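The arithmetic from that example fits in a few lines. This is just the thought experiment above written out, assuming the disease always kills exactly 10 years after onset; the function name is my own.

```python
# The thought experiment above: death always comes 10 years after onset.
YEARS_FROM_ONSET_TO_DEATH = 10


def observed_survival(diagnosed_at_year):
    """Years between diagnosis and death, which is what gets reported as survival."""
    return YEARS_FROM_ONSET_TO_DEATH - diagnosed_at_year


late = observed_survival(8)    # diagnosed 8 years in: 2 years of "survival"
early = observed_survival(4)   # diagnosed 4 years in: 6 years of "survival"
apparent_gain = early - late   # 4 extra years on paper, none in reality

print(late, early, apparent_gain)
```

The date of death never moves; only the starting line for the clock does.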

Case Study: Musicians and mortality. This case study combines a few interesting issues, and examines a graph of musician “average age at death” which went viral.

As the case study covers, there are a few issues with this graph, most notably that it right-censors the data. Basically, musicians in newer genres die young because they are still young. While you can find Blues artists in their 80s, there are no octogenarian rappers. Without context, though, this graph is fairly hard to interpret correctly. Quite a few people (including the Washington Post) confused “average age at death” with “life expectancy”, which both appear on the graph but are very different things when you’re discussing a cohort that is still mostly alive. While reviewing what went wrong in this graph is interesting, the best part of this case study comes at the end, where the author of the original study steps in to defend herself. She points out that she herself is the victim of a bit of a bullshit two-step. In her paper and the original article, she included all the proper caveats and documented all the shortcomings of her data analysis, only to have the image go viral without any of them. At that point people stopped looking at the source and misreported things, and she rightly objects to being blamed for that. This reminds me of something someone sent me a few years ago:
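Circling back to the right-censoring problem for a second, it is easy to check with a toy example. In this sketch every number is hypothetical: I assume musicians in both genres have exactly the same true ages at death, and the only difference is when they were born. None of this comes from the original study.

```python
from statistics import mean

OBSERVATION_YEAR = 2015
# Hypothetical and identical for both genres: the ages at which
# four musicians will eventually die.
TRUE_AGES_AT_DEATH = [30, 50, 70, 90]


def observed_average_age_at_death(birth_year):
    """Average age at death, counting only deaths that have already happened."""
    already_dead = [
        age for age in TRUE_AGES_AT_DEATH
        if birth_year + age <= OBSERVATION_YEAR
    ]
    return mean(already_dead)


older_genre = observed_average_age_at_death(1900)  # everyone has died: 60
newer_genre = observed_average_age_at_death(1975)  # only the death at 30 is visible: 30

print(older_genre, newer_genre)
```

The newer genre looks twice as deadly even though, by construction, nothing about the musicians differs except their birth year.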

Case Study: On Track Stars, Cohort Effects, and Not Getting Cocky. In this case study, Bergstrom quite politely takes aim at one of his own graphs, and points out a time he missed a caveat for some data. He had created a graph that showed how physical performance for world record holders declines with age:

He was aware of two possible issues in the data: 1) that it represents only the world records, not how individuals vary, and 2) that it only showed elite athletes. What a student pointed out to him is that there was probably a lot of sample size variation in here too. The cohort going for the record in the 95-100 year old age group is not the same size as the cohort going for the record in the 25-30 year old age group. With far fewer people competing, the best mark in the older group would likely be less impressive even if nobody’s ability declined at all. It’s not an overly dramatic oversight, but it does show how data issues can slip in without you even realizing it.
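To see how much cohort size alone can matter, here is a rough simulation. It is a sketch under a deliberately unrealistic assumption: everyone's performance is drawn from the same distribution regardless of age, so any gap in the records comes purely from how many people are competing. The cohort sizes, the scoring scale, and the function name are all invented for illustration.

```python
import random
from statistics import mean


def average_record(cohort_size, trials=200, seed=0):
    """Average best score in a cohort, over many simulated cohorts.

    Scores are drawn from the same standard normal distribution for
    every cohort (higher is better), so ability is identical by design.
    """
    rng = random.Random(seed)
    records = [
        max(rng.gauss(0, 1) for _ in range(cohort_size))
        for _ in range(trials)
    ]
    return mean(records)


small_cohort_record = average_record(10)     # few competitors
large_cohort_record = average_record(1000)   # many competitors

print(round(small_cohort_record, 2), round(large_cohort_record, 2))
```

The record in the big cohort comes out noticeably better, even though no individual in the simulation is any more talented.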

Well, those are all the readings for the week, but there were a few other things mentioned in the list of stats tricks that I’ve written about myself and figured I’d point to:

Base Rate Fallacy: A small percentage of a large number is often larger than a large percentage of a small number. I wrote about this in “All About that Base Rate”.

Means vs Medians: It truly surprises me how often I have to point out to people that the average might be lying to them.
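Both of those are quick to check with toy numbers. Everything below is hypothetical and chosen only to make the point obvious.

```python
from statistics import mean, median

# Base rates: a small percentage of a big number vs a big percentage of a small one.
small_share_of_big = 1_000_000 * 1 // 100   # 1% of a million is 10,000
big_share_of_small = 1_000 * 50 // 100      # 50% of a thousand is 500
print(small_share_of_big, big_share_of_small)

# Mean vs median: made-up incomes with one very large outlier.
incomes = [30_000, 35_000, 40_000, 45_000, 1_000_000]
print(mean(incomes))    # 230,000, pulled far up by the single outlier
print(median(incomes))  # 40,000, much closer to what most of the group earns
```

Four of the five people in that list earn less than a quarter of the "average" income, which is exactly how an average ends up lying to you.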

Of course the godfather of all of this is How to Lie With Statistics, which should be recommended reading for every high school student in the country.

While clearly I could go on and on about this, I will stop here. See you next week when we dive into visualizations!