John Napier’s Cockerel

In my post from Sunday, I talked about base rates and how police investigative techniques can go wrong. I specifically focused on testing methods for drug residue, which are not always as accurate as you might hope.

On an interestingly related note, today I was reading the chapter on logarithms from “In Pursuit of the Unknown: 17 Equations That Changed the World” (one of the math books I’m reading this year). It was discussing John Napier, a Scottish mathematician who invented logarithms in the early 1600s. Napier was an interesting guy….friend of Tycho Brahe, brilliant mathematician, and possible believer in the occult. For reasons possibly having to do with one of those last two, he apparently carried a black cockerel (rooster) around with him a lot.

It’s actually not clear if he really was involved in the occult, but he did tell everyone he had a magic rooster. He used it to catch thieves.  Here’s what his strategy was:

JohnNapier

No idea what the base rate was here, or if it would hold up in court today….but maybe something to consider if the budget gets cut.

(Special thanks to the Shakespeare translator for helping me out a bit on this one).

All About that Base Rate

Of all the statistical tricks or treats I like to think about, the base rate (and its associated fallacy) is probably the most interesting to me. It’s a common fallacy, in large part because it requires two steps of math to work out what’s going on. I’ve referenced it before, but I wanted a definitive post where I walked through what a base rate is and why you should remember it exists. Ready? Let’s go.

First, let’s find an example.
Like most math problems, this one will be a little easier to follow if we use an example.

In my Intro to Internet Science series, I mentioned the troubling case of a couple of former CIA analysts whose house was raided by a SWAT team after they were spotted shopping at the wrong garden store. After spotting the couple purchasing what they thought was marijuana growing equipment, the police had tested their trashcans for the presence of drugs. Twice the police got a positive test result, and thus felt perfectly comfortable raiding the house and holding the parents and kids at gunpoint for two hours while they searched for the major marijuana growing operation they believed they were running. In the end it was determined the couple was actually totally innocent. There’s a lot going on with this story legally, but what was up with those positive drug tests?

Let’s make a contingency table!
In last week’s post, I discussed the fact that there is almost always more than one way to be wrong. A contingency table helps us visualize the various possibilities that can arise from the two different types of test results and the two different realities:

Drugsearch

So here we have four options, two good and two bad:

  1. True positive (yes/yes): we have evidence of actual wrongdoing [1]
  2. False negative (no/yes): someone with drugs appears innocent
  3. False positive (yes/no): someone without drugs appears guilty
  4. True negative (no/no): an innocent person’s innocence is confirmed

In this case, we ended up with a false positive, but how often does that really happen? Is this just an aberration or something we should be concerned about?

Picking between the lesser of two evils.
Before we go on, let’s take a step back for a minute and consider what the police department may have had to weigh when they selected a drug screening test to use. It’s important to recognize that in this situation (as in most of life), you actually do have some discretion over which way you choose to be wrong. In a perfect world we’d have unlimited resources to buy a test that gets the right answer every time, but in the real world we often have to go the cheap route, consider the consequences of either type of error, and make trade-offs.

For example, in medicine false positives are almost always preferable to false negatives. Most doctors (and patients!) would rather have a screening test tell them they might have a disease they do not have (false positive) than have a screening test miss a disease they do have (false negative).

In criminal justice, there is a similar preference. Police would rather have evidence of activity that didn’t happen (false positive) than not get evidence when a crime was committed (false negative).

So what kind of trade-offs are we talking about?
Well, the article I linked to above mentioned that one of the downsides of the drug tests many police departments use is a very high false positive rate…..as high as 70%. This means that if you tested 100 trashcans that were completely free of drugs, you’d get a positive test for 70 of them.

Well that sounds pretty bad….so is that the base rate you were talking about?
No, but it is an important rate to keep in mind because it influences the math in ways that aren’t particularly intuitive for most people. For example, if we test 1000 trash cans, half with drugs and half without, here’s what we get:
Drugsearch2

When the police are out in the field, they get exactly one piece of information: whether or not the trash can tested positive for drugs.  In order to use this information, we actually have to calculate what that means. In the above example, we have 495 true positive trash cans with drugs in them. We also have 350 false positive trash cans with no drugs in them, but with a positive test. So overall, we have 845 trash cans with a positive test. 495/845 is about 59%…..so under these circumstances, a positive test only means drugs are present about 60% of the time.
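The arithmetic above can be checked with a short Python sketch. The function and parameter names are my own labels for illustration, and the 99% detection rate is an assumption inferred from the 495-out-of-500 figure rather than a number quoted from the article.

```python
def positive_predictive_value(n_with, n_without, sensitivity, false_positive_rate):
    """Fraction of positive tests that come from cans that really contain drugs."""
    true_positives = n_with * sensitivity
    false_positives = n_without * false_positive_rate
    return true_positives / (true_positives + false_positives)

# 1000 trash cans, half with drugs and half without.
# sensitivity=0.99 is inferred from 495 true positives out of 500 (an assumption).
ppv_even_split = positive_predictive_value(
    n_with=500, n_without=500, sensitivity=0.99, false_positive_rate=0.70
)
print(f"{ppv_even_split:.1%}")  # 495 / 845, about 58.6%
```

The key point is that the denominator counts every positive test, true or false, because in the field that is all the police actually see.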

Now about that base rate……
Okay, so none of that is great, but this actually can get worse. You see, the rate of those who do drugs and those who don’t do drugs isn’t actually equal. The rate of those who don’t do drugs is actually much much higher, and this is the base rate I was talking about before.

According to many reports, about 10% of the US adult population used illegal drugs in the past month (mostly marijuana, FYI….not controlled for states that have legalized it). Presumably this means that about 10% of trash cans might contain drugs at any given time. That makes our numbers look like this:

drugsearch3

Using the same math as above, we get 99/(630+99) = 14%. Now we realize that for every positive test, there’s actually only about a 14% chance there are drugs in that trash can. I’m somewhat curious how much worse that is than just having a trained police officer take a look.  In fact, because the base rates are so different, you actually would need a test with an 11% false positive rate (as compared to the 70% we currently have) to make the chances 50/50 that your test is telling you what you think it’s telling you. Yikes.
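For anyone who wants to check both numbers in that paragraph, here is a minimal Python sketch. The function names are hypothetical, and the 99% detection rate is again an assumption inferred from 99 true positives out of 100 cans with drugs.

```python
def ppv(base_rate, sensitivity, false_positive_rate):
    """Chance that a positive test really means drugs are present."""
    true_pos = base_rate * sensitivity
    false_pos = (1 - base_rate) * false_positive_rate
    return true_pos / (true_pos + false_pos)

def break_even_false_positive_rate(base_rate, sensitivity):
    """False positive rate at which a positive test is a coin flip (PPV = 50%).

    Found by setting true positives equal to false positives:
    base_rate * sensitivity == (1 - base_rate) * false_positive_rate
    """
    return base_rate * sensitivity / (1 - base_rate)

# 10% base rate, 99% detection rate (inferred), 70% false positive rate
print(f"{ppv(0.10, 0.99, 0.70):.1%}")                        # 99 / 729, about 13.6%
print(f"{break_even_false_positive_rate(0.10, 0.99):.0%}")   # 99 / 900, about 11%
```

Working in proportions instead of counts gives the same answer as the 1000 trash can table, since the total cancels out of the ratio.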

Now of course these numbers only hold if you’re testing trash cans randomly….but if you’re testing the garbage of everyone who goes to a garden store on a Saturday morning, that may be a little closer to the truth than you want to admit.

So what’s the takeaway?
The crux of the base rate fallacy is that a small percentage of a large number can easily be larger than a large percentage of a small number. This is basic math, but it becomes hard to remember when you’re in the moment and the information is not being presented in a straightforward way. If you got a math test that said “Which value is larger….11% of 900 or 99% of 100?” you’d probably get it right pretty quickly. However, when it’s up to you to remember what the base rate is, people get much, much worse at this problem. In fact, the vast majority of medical doctors don’t get this type of problem correct when it’s presented to them and they’re specifically given the base rate….so my guess is the general population success rate is quite low.

No matter how accurate a test is, if the total number of entries in one of the rows (or columns) is much larger than the total of the other, you should watch out for this.
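To see how lopsided totals swamp even a very good test, here is a rough Python sketch. Everything in it is hypothetical: the 99% detection rate, the 1% false positive rate, and the list of base rates are made-up values chosen only to illustrate the point, not figures from any real test.

```python
def ppv(base_rate, sensitivity, false_positive_rate):
    """Chance that a positive result is a true positive."""
    true_pos = base_rate * sensitivity
    false_pos = (1 - base_rate) * false_positive_rate
    return true_pos / (true_pos + false_pos)

# Hypothetical "very accurate" test: catches 99% of real cases,
# and wrongly flags only 1% of clean ones.
SENSITIVITY = 0.99
FALSE_POSITIVE_RATE = 0.01

# Hypothetical base rates, from an even split down to one in a thousand.
base_rates = [0.5, 0.1, 0.01, 0.001]
results = {rate: ppv(rate, SENSITIVITY, FALSE_POSITIVE_RATE) for rate in base_rates}

for rate, value in results.items():
    print(f"base rate {rate:.1%}: a positive test is right {value:.1%} of the time")
```

With these made-up numbers, the same test goes from being right almost every time at an even split to being wrong most of the time once the thing you are looking for is rare.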

Base rate matters.
1. Note for the libertarians: It is beyond the scope of this post to discuss current drug policy and whether or not this should actually constitute wrongdoing. Just roll with it.

Three Ways to Be Wrong in Narnia

After my last post on the two different ways of being wrong, the Assistant Village Idiot brought up the dwarves from the book “The Last Battle” from the Chronicles of Narnia series. I was curious what the contingency matrix for that book would look like. I haven’t read it in a while, but I quickly realized there were actually three pretty distinct ways of being wrong in that book. As far as I can tell, the matrix looks like this:

2by2narnia

You’re welcome.

Two Ways To Be Wrong

One of the most interesting things I’ve gotten to do since I started blogging about data/stats/science is to go to high school classrooms and share some of what I’ve learned. I started with my brother’s Environmental Science class a few years ago, and that has expanded to include other classes at his school and some other classes elsewhere. I often get more out of these talks than the kids do…something about the questions and immediate feedback really pushes me to think about how I present things.

Given that, I was intrigued by a call I got from my brother yesterday. We were talking a bit about science and skepticism, and he mentioned that as the year wound down he was having to walk back on some of what I presented to his class at the beginning of the year. The problem, he said, was not that the kids had failed to grasp the message of skepticism…but rather that they had grasped it too well. He had spent the year attempting to get kids to think critically, and was now hearing his kids essentially claim it was impossible to know anything because everything could be manipulated.

Oops.

I was thinking about this after we hung up, and how important it is not to leave the impression that there’s only one way to be wrong. In most situations that need a judgment call, there are actually two ways to be wrong. Stats and medicine have a really interesting tool for showing this phenomenon: a 2×2 contingency matrix. Basically, you take two different conditions and sort how often they agree or disagree and under what circumstances those happen.

For example, for my brother’s class, this is the contingency matrix:

Skepticalgullible

In terms of outcomes,  we have 4 options:

  1. True Positive:  Believing a true idea (brilliant early adopter).
  2. False Negative (Type II error): Not believing a true idea (in denial/impeding progress).
  3. False Positive (Type I error): Believing a false idea (gullible rube).
  4. True Negative: Not believing a false idea (appropriately skeptical).

Of those four options, #2 and #3 are the two we want to avoid. In those cases the reality (true or not) clashes with the test (in this case our assessment of the truth).  In my talk and my brother’s later lessons, we focused on eliminating #3. One way of doing this is to be more discerning with what we believe or we don’t, but many people can leave with the impression that disbelieving everything is the way to go. While that will absolutely reduce the number of false positive beliefs, it will also increase the number of false negatives. Now, depending on the field this may not be a bad thing, but overall it’s just substituting one lack of thought for another. What’s trickier is to stay open to evidence while also being skeptical.
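The trade-off between the two errors can be sketched in a few lines of Python. The claims, their evidence scores, and the idea of a single belief threshold are all made up for illustration; real judgment calls are not this tidy.

```python
# Hypothetical claims: (strength of evidence from 0 to 1, whether the claim is actually true).
# These values are invented purely to illustrate the trade-off.
claims = [
    (0.95, True), (0.80, True), (0.65, True), (0.40, True),
    (0.70, False), (0.50, False), (0.30, False), (0.10, False),
]

def count_errors(claims, threshold):
    """Believe a claim when its evidence meets the threshold; count both kinds of error."""
    false_positives = sum(1 for ev, is_true in claims if ev >= threshold and not is_true)
    false_negatives = sum(1 for ev, is_true in claims if ev < threshold and is_true)
    return false_positives, false_negatives

# 0.0 = believe everything, 0.6 = be discerning, 1.01 = believe nothing
for threshold in (0.0, 0.6, 1.01):
    fp, fn = count_errors(claims, threshold)
    print(f"threshold {threshold}: {fp} false positives, {fn} false negatives")
```

In this toy setup, believing nothing drives false positives to zero but turns every true claim into a false negative, which is the swap of one error for the other described above.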

It’s probably worth mentioning that not everyone gets into these categories honestly…some people believe a true thing pretty much by accident or fail to believe a false thing for bad reasons. Every field has an example of someone who accidentally ended up on the right side of history. There also aren’t always just two possibilities; many scientific theories have shades of gray.

Caveats aside, it’s important to at least raise the possibility that not all errors are the same. Most of us have a bias towards one error or another, and will exhort others to avoid one at the expense of the other. However, for both our own sense of humility and the full education of others, it’s probably worth keeping an eye on the other way of being wrong.

Lost in Translation: Survey Edition

I ran across an interesting article from Quartz today that serves as a useful warning for those attempting to compare cross-cultural survey results.

People from multiple countries were asked the same question “Would you personally accept a refugee into your own home?”, and the results were compared to find the “most welcoming” country.  China came out ahead by a large margin: 46% of residents said yes, as compared to 15% of US residents.

However, when the question was more closely examined, it was discovered that the English word “refugee” does not have an exact translation in Chinese. While in the US “refugee” almost always refers to someone from another country, in Chinese the word has a more neutral “person who has experienced a calamity” definition. Depending on the situation, it is then modified with either “domestic” or “international”.  The survey question did not contain either modifier, so it was up to the respondent’s personal interpretation.

So basically, people in different countries were answering different questions and then the results were compared. Surveys are already prone to lots of bias, and adding inexact translations into the mix can obviously heighten that effect. Interesting thing to be aware of when reading any research that compares international responses.

5 Things You Should Know About Medical Errors and Mortality

“Medical Errors are No. 3 Cause of US Deaths”. As someone who has spent her entire career working in hospitals, I was interested to see this headline a few weeks ago. I was intrigued by the data, but a little skeptical. Not only have I seen a lot of patient deaths, but it seems relatively rare in my day-to-day life that I see someone reference a death by medical error. However, according to Makary et al in the BMJ this month, it happens over 250,000 times a year.

Since the report came out, two of my favorite websites (Science Based Medicine and Health News Review) have come out with some critiques of the study. The pieces are both excellent and long, so I thought I’d go over some highlights:

  1. This study is actually a review, combined with some mathematical modeling. Though reported as a study in the press, this was actually an extrapolation based off of 4 earlier studies from 1999, 2002, 2004 and 2010. I don’t have access to the full paper, but according to the Skeptical Scalpel, the underlying papers found 35 preventable deaths. It’s that number that got extrapolated out to 250,000.
  2. No one needs to have made an error for something to be called an error. When you hear the word “error” you typically think of someone needing to do “x” but instead doing “y” or doing nothing at all. All 4 studies used in the Makary analysis had a different definition of “error”, and those definitions weren’t always that straightforward: classifying an event required a lot of judgment calls. Errors were essentially defined as “preventable adverse events”, even in cases where no one could say how you would have prevented it. For example, in one study serious post-surgical hemorrhaging was always considered an error, even when there was no error identified. Essentially some conditions were assumed to ALWAYS be caused by an error, even if they were a known risk of the procedure. That definition wasn’t even the most liberal one used by the way….at least one of the studies called ALL “adverse events” during care preventable. That’s pretty broad.
  3. Some of the samples were skewed. The largest paper included actually looked exclusively at Medicare recipients (aka those over 65), and at least according to the Science Based Medicine review, it doesn’t seem they controlled for the age issue when extrapolating for the country as a whole. The numbers ultimately suggest that 1/3 of all deaths occurring in a hospital are due to error…..which seems a bit high.
  4. Prior health status isn’t known or reported. One of the primary complaints of the authors of the study is that “medical error” isn’t counted in official cause of death statistics, only the underlying condition. This means that someone seeking treatment for cancer they weren’t otherwise going to die from who dies of a medical error gets counted as a cancer death. On the other hand, this means that someone who was about to die of cancer but also has a medical error gets counted as a cancer death. Since sick people receive far more treatment, we do know most of these errors are happening to already sick people. Really the ideal metric here would be “years of life lost” to help control for people who were severely ill prior to the error.
  5. Over-reporting of medical errors isn’t entirely benign. A significant amount of my job is focused on improving the quality of what we do. I am always grateful when people point out that errors happen in medicine, and draw attention to the problem. On the other hand, there is some concern that stories like this could leave your average person with the impression that avoiding hospitals is safer than actually seeking care. This isn’t true. One of the reasons we have so many medical errors in this country is because medicine can actually do a lot for you. It’s not perfect by any means, but the more options we have and the longer we keep people alive using medicine, the more likely it is that someone administering that care is going to screw up. In many cases, delaying or avoiding care will kill you a heck of a lot faster than even the most egregiously sloppy health care provider.

Again, none of this is to say that errors aren’t a big deal. No matter how you define them, we should always be working to reduce them. However, as with all data, it’s good to know exactly what we’re looking at here.

Statistical Tricks and Treats

Well hi!

After 8 fantastic weeks of working with Ben on the good, the bad, and the ugly of Pop Science, it’s time to move on to a new Sunday series.

When I give my talk to high school students, one of my biggest struggles is really not having time to cover any math. That’s  what I really love, but it’s pretty much impossible in a short time frame and when I’ve tried I always feel like I do kids a disservice. I mentioned some of this struggle in Part 7 of my Internet Science series, and I realized I probably have enough to say about this that I can make a whole series about it.

A few likely posts you’ll be seeing:

  1. That sneaky average
  2. Base rates and other shenanigans
  3. Independence and Probability
  4. To replicate or not to replicate
  5. Correlation and causation

If you’d like to see anything else, let me know!

What I’m Reading: May 2016

My brother sent me this article about a guy who is using data anomalies to track down Medicare fraud. Interesting use of patterns, data, and humans to go where the government can’t.

Things are getting meta: a new study looks at how much people trust scientists who do science blogging.

I’ve seen a few interesting comments recently on various metrics being influenced by shifting demographics. This one from the Economist covers household income stats, and how they may not always be as straightforward as they appear.

As a math person, I’m supposed to be outraged by this story about a flight that got delayed because a professor was scribbling equations and it freaked his seatmate out. I don’t know though….our TSA tagline is “if you see something, say something”. That’s just asking for false positives people, why are we surprised?

For those in the USA wondering what the heck happened with our primary system this year, I liked this explanation about how hard it is to get a system to reflect the will of the people.

My book of the month is What’s a p-value anyway? 34 Stories to Help You Actually Understand Statistics. This one is definitely going on my list of books to recommend for high school or college students trying to pass a Stats 101 class.

 

Ten Science Songs So Confusing They’re Not Even Wrong (Part 2)

Well hi there! Welcome to Part 2 of Ten Science Songs So Confusing They’re Not Even Wrong, where we cover songs with science references so perplexing they can’t quite be classified. If you missed part 1, you can find it here.

“Cosmic Thing” by The B-52s
Nominated Line: whole song

Bethany: In the long and grand tradition of songs that just yell random words that are vaguely scienc-ey, comes a cosmic song with a wonderful chorus: COSMIC! COSMIC! WOOO COSMIC! IONOSPHERE! SHAKE YOUR HONEY BUNS!

I’m beginning to think science education may not be the purpose of this song.

Ben: The B-52s have always had a lyrical style that can best be described as “Kubla Khan, but written by a UCLA freshman taking an improv class while high on cheap ecstasy.” It’s simultaneously both unremarkable and unforgettable, and if this band hadn’t existed, I wouldn’t find myself singing “Everybody HAD! Matching TOWELS!” aloud at random and inappropriate intervals in my life.

I’m reluctant to spend much time on this song, because I’m worried that doing so will cause it to move permanently into a section of my brain, probably evicting something more important on its way in. By the time this post is finished, I’ll have no recollection of the Webster-Ashburton treaty, but I will spend the rest of the month hum-shouting “don’t let it rest on the President’s desk!” Away with you, Fred Schneider! Haunt this cranium no more!

“Friction” by Echo & The Bunnymen
Nominated Line: whole song

Bethany: Friction! Hey friction! This song cites friction so much I was really excited to see what kind of physics problem they were going to throw me. There was a reference to telescopes, and I kind of thought I knew where things were going, and then we got to this line “If I ever catch that ventriloquist/I’ll squeeze his head right into my fist.”

Well then.

So the references to friction pick back up again with “stop this head motion”….then die again with “Set the sails/You know all us boys gonna wind up in jail.” This test just got dark.

Ben: I have no beef with Echo and the Bunnymen, who I have always considered sort of the sonic equivalent of The Cure trying to create their own version of R.E.M.’s Monster (which, frankly, is sort of my jam). But there’s usually an unfascinating ambivalence to their lyrics, and it leaves the listener shrugging and going, “well, I guess it’s about something.”

I don’t know if the ventriloquist line is a metaphor, but I very much hope not. If this song was about the emotional pain that Ian McCulloch went through as a result of a dickish puppeteer upon whom he has vowed revenge, then I’m a million percent* back in on this song.

*since this is a science site, I should note that it is not actually possible to be a million percent into anything. I think.

Bethany: Wait, was this song on the Being John Malkovich sound track? I may have to rethink my review.

“What’s My Name” by Rihanna feat. Drake
Nominated Line: The square root of 69 is 8 something, cuz I’ve been trying to work it out.

Bethany: You know you’re a math geek when you hear a line like this and actually wonder why he stopped at 8 something, when there was so much more to say. Like 8.3066…..and that’s not the point here is it? No one was really going for math here were they? Well this is awkward.

Ben: It could be worse, Bethany. I had to go to Yahoo Answers in order to look up the joke. “Oh, eight as in… oh, I get it now.” Just humiliating all the way round.

It’s also a very Drake thing to throw in a bad math joke when appearing on a Rihanna song  – everything about his appearance shouts “I’m out of my league here and I know it.” Why else would he be wearing a UMass sweatshirt?

Bethany: Yeah, let’s just pretend this whole thing never happened.

“The Bad Touch” by Bloodhound Gang
Nominated Line: “Let’s do it like they do on the Discovery Channel.”

Bethany: Hey! It’s another song that’s not really wrong but makes me a little unsettled. I thought it would get better if I watched the video, but it actually got worse. The monkey costumes were actually used pretty well, which is part of the problem. Their impression of wild animals is just a little too good.

Ben: This is another song that brings up embarrassment, as I once had a pastor who emailed me to ask if I could put together “The Bloodhound Gang song” for a sermon she was doing. We had a good 36 hours of confusing, argumentative emails until I discovered that what she was actually referring to was this. We had very different cultural touchstones growing up.

This song arrived right about at the ideal point in my adolescence, as it was released during the summer I was 15 years old and working my first job at McDonald’s. I don’t think I’ve heard this song in at least a decade, but I bet if you plugged it into a karaoke machine and handed me a microphone, I could fly through the now-exceptionally-dated lyrics with barely a hiccup. “Yes I’m Siskel, yes I’m Ebert, and you’re getting two thumbs up!” The Wikipedia for this song says that it was remixed by both God Lives Underwater and Eiffel 65, which is a very turn-of-the-millennium piece of information.

My memory’s been abruptly jogged by writing this section: I did a post on this several years ago, during my “Hunt For The Most 90’s Song Of All Time!” As I recall, it scored moderately but not exceptionally highly. (I never finished the hunt, but it was clear early on that I wasn’t going to find a song more qualified than The Spice Girls’ “Wannabe.”)

Bethany: Okay, I was wondering if I should confess I know every word to this song, and you talked me into it. I also still know all the words to “The Real Slim Shady”. That is definitely the reason I forget where I put my keys every morning.

Ben: If Smashmouth didn’t exist, I would still know trigonometry.

“Make her Say” by Kid Cudi feat Kanye West
Nominated Line: When You Used Your Medulla Oblongata And Give Me Scoliosis Until I Comatosest And Do While I’m Sleep, Yeah A Lil Osmosis

Ben: I’m ahead of Bethany again! And once again, it seems wise of me to step back and give her the lead. I’m an expert on all things Kanye, but not the central nervous system (the space for that information was taken up by the lyrics to “Love Shack”), and I better stick to my lane here.

Bethany: I’m going to start this in reverse order here. There’s an old joke in science major circles about “falling asleep on your textbook and learning by osmosis”. The proper response to this of course is “that assumes knowledge works like water, and you’re clearly not passing”. Science majors can be cruel.

Back to the beginning though. The medulla oblongata is a really important part of the brain, responsible for all sorts of nice things like breathing, swallowing, sneezing and reflexes. I was going to give Kanye some credit here, because apparently there’s a condition called syringomyelia where the bone near the medulla oblongata has lesions and can eventually result in scoliosis….but then I realized he said that you use your medulla oblongata and give him scoliosis. I think there’s a blowjob joke in here somewhere, but frankly I’m not looking any further into it. Oh, and comatosest isn’t a word. You’re welcome.

Ben: We keep stumbling into accidental cunnilingus references, and that’s really not what we set out to do here (though, to be fair, I can only speak for me). A little research digs up that the medulla oblongata controls a number of involuntary actions, like the um, gag reflex and, uh, swallowing, and um, I guess the point is that this might not just be blather that Ye is spitting here (pun not intended).

Basically, the more you look at this, the less it seems like a fun verse with a TI reference and you start to get focused on the fact that they sampled Lady GaGa’s “Poker Face” so that it became “poke her face” and… you know what, let’s just drop the mic and move on.

Bethany: Yeah, this is getting awkward. Science, you’re drunk. Go home.

Missed the rest of the series? Find it here!