Specificity

Okay, so we started off our discussion of Statistical Tricks and Treats with a general post about contingency matrices and the two ways to be wrong, and then followed up with a further look at how the base rate involved in the matrix can skew your perception of what a positive or negative test means.

In this post I’m going to take another look at the contingency matrix and define a few terms you may hear associated with it. But first, let’s take another look at that matrix from last week:

Accuracy: Accuracy is the overall chance that your test is correct in either direction. So we’d have this:

(Number of correct search warrants + number of innocent people left alone) / (Total number of tests run)

Note: this SHOULD be the definition used. People will often try to wiggle out of this by saying “it’s accurate 99% of the time when drugs are present!”. They are hoping the word “accurate” distracts you from their failure to mention what happens when drugs aren’t present. This is the type of sales pitch that leads to innocent people getting arrested and cops who had no idea they were using a test that was likely to be wrong.
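If you want to see that sleight of hand in numbers, here is a minimal Python sketch. The counts are made up for illustration (they are not taken from last week's matrix), and the variable names are my own.

```python
# Hypothetical counts for 1,000 trash tests (illustrative only).
true_positives = 99    # drugs present, test positive: correct warrants
false_negatives = 1    # drugs present, test negative: bad guys who got away
false_positives = 270  # no drugs, test positive: innocent people harassed
true_negatives = 630   # no drugs, test negative: innocent people left alone

total_tests = true_positives + false_negatives + false_positives + true_negatives

# Accuracy counts correct calls in BOTH directions.
accuracy = (true_positives + true_negatives) / total_tests

# The sales pitch only looks at the cases where drugs were present.
accurate_when_drugs_present = true_positives / (true_positives + false_negatives)

print(f"Accuracy: {accuracy:.1%}")
print(f"'Accurate when drugs are present': {accurate_when_drugs_present:.1%}")
```

With these invented counts the pitch is technically true (99 out of 100), while the overall accuracy is only 729 out of 1,000.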

Sensitivity: When that hypothetical sales person up there said “it’s accurate 99% of the time when drugs are present!”, what they were actually giving you was the sensitivity of the test. It (along with specificity) answers the question “How often does the test do what we want it to do?” The sensitivity is also called the true positive rate, and it’s basically this:

(Correct warrants/arrests) / (Correct warrants/arrests + bad guys who got away with it)

In other words, it’s the number of “correct” positives divided by the total number of cases where drugs were actually present. Another way of looking at it is it’s the number in the top row green box over the total number in the top row. A high percentage here means you’ve minimized false negatives.

Specificity: This is the opposite of sensitivity, and in this example it’s the one the sales person is trying not to mention. This is how accurate the test is when drugs are NOT present, aka the true negative rate. It looks like this:

(Number of times you left an innocent person alone) / (Harassed and unharassed innocent people whose trash was tested)

Basically it’s the number of correctly negative tests divided by the total number of cases where drugs were not present. It’s also the number in the green box in the bottom row over the total number in the bottom row. A high percentage here means you’ve minimized false positives.
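Since sensitivity and specificity are each just one row of the table, they can be sketched as two tiny Python functions. The function names and the counts below are hypothetical, chosen only to illustrate the arithmetic.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """True positive rate: correct positives over all cases where drugs were present."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """True negative rate: correct negatives over all cases where drugs were NOT present."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical counts: top row = drugs present, bottom row = no drugs.
print(f"Sensitivity: {sensitivity(true_positives=99, false_negatives=1):.0%}")
print(f"Specificity: {specificity(true_negatives=630, false_positives=270):.0%}")
```

Note that neither function ever looks at the other row, which is exactly why quoting only one of them hides half the story.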

Positive Predictive Value: Okay, so both sensitivity and specificity dealt with rows, and this one deals with columns. Positive predictive value is a lot of what I talked about in my base rate post: if you get a positive test, what are the chances that it’s correct?

As we covered last week, it’s this:

(Correct search warrants/arrests) / (Correct search warrants/arrests + incorrect warrants/arrests)

In other words, given that we think we’ve found drugs, what are the chances that we actually have? It’s the green box in the first column over the total number in the first column. This is the one the base rate can mess with BIG time. You see, when companies that develop tests put them on the market, they can’t know what type of population you’re going to use them on. This value is unknown until you start to use it. A high value here means you’ve minimized false positives.
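To see how much the base rate can move this number, here is a rough Python sketch using Bayes' rule. The sensitivity (99%) and specificity (95%) are assumed values for a hypothetical test, and the function name is my own.

```python
def positive_predictive_value(sensitivity: float, specificity: float, base_rate: float) -> float:
    """Chance that drugs are really present, given a positive test."""
    # Share of everyone tested who has drugs AND tests positive.
    true_positive_share = sensitivity * base_rate
    # Share of everyone tested who has no drugs but tests positive anyway.
    false_positive_share = (1 - specificity) * (1 - base_rate)
    return true_positive_share / (true_positive_share + false_positive_share)


# Same hypothetical test, three different populations.
for base_rate in (0.50, 0.10, 0.01):
    ppv = positive_predictive_value(0.99, 0.95, base_rate)
    print(f"base rate {base_rate:.0%}: PPV {ppv:.1%}")
```

Under these assumptions the very same test goes from a PPV of roughly 95% when half the population has drugs to roughly 17% when only 1 in 100 does.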

Negative Predictive Value: The flip side of the positive predictive value, this is about the second column. Given that we got a negative test, what are the chances there are no drugs? This is:

(Innocent people who go unbothered) / (Innocent people who go unbothered + bad guys who get away with it)

So for the second column, the number in the green box over the total second column. A high value here means you’ve minimized false negatives.
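As a quick sketch of that second-column calculation in Python (the counts are invented for illustration and the function name is hypothetical):

```python
def negative_predictive_value(true_negatives: int, false_negatives: int) -> float:
    """Chance that there really are no drugs, given a negative test."""
    return true_negatives / (true_negatives + false_negatives)


# Hypothetical second column: 630 innocent people left alone, 1 bad guy who got away.
npv = negative_predictive_value(true_negatives=630, false_negatives=1)
print(f"NPV: {npv:.1%}")
```

With these made-up counts a negative result is right about 99.8% of the time, because only one false negative is sitting in that column.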

So to recap:

Sensitivity and Specificity:

Answer the question “how does the test perform when drugs are/are not present”
Refer to the rows (at least in this table set up)
High sensitivity = low false negatives, low sensitivity = lots of false negatives
High specificity = low false positives, low specificity = lots of false positives
Information about how “accurate” one of the values is does not give the whole picture

Positive and Negative Predictive value (PPV and NPV):

Answer the question “Given a positive/negative test result, what are the chances drugs are/are not actually present?”
Refer to columns (at least in this table set up)
High PPV = low false positives, low PPV = high false positives
High NPV = low false negatives, low NPV = high false negatives
Can be heavily influenced by the rate of the underlying condition (in this case drug use) in the population being tested (base rate)

Would you believe in a love at first sight? Yes I’m certain that it happens all the time.

-John Lennon and Paul McCartney (cowriters)

This week’s question comes from a little known group called “The Beatles”. It’s from their song “With a Little Help From My Friends”¹, and the sentiment was raised by my friend John when we were discussing relationships. Now John’s a little bit of a ~~hopeless romantic~~ pragmatic idealist, so the idea of love at first sight kind of appeals to him. But does it exist? And more importantly, does it really happen all the time? Let’s take a look!

Alright, let’s be honest here…the question of whether or not you can really fall in love at first sight is one typically addressed by philosophers, not statisticians. Literally everyone has an opinion on this, and often a strong one. It’s a question that inspires all sorts of crazy debates, tons of movies, countless songs, and a mildly disturbing yet rather watchable reality show. I’m not a philosopher and I’m not getting into all of that “what is love” junk², but I can tell you in the dating market it’s kind of a guy thing. In a user survey done by Match.com, they found that about 60% of men believed in it, and 40% said it had happened to them. For women, those numbers were about 50% and 30%, respectively. Those numbers would suggest that John and Paul were on to something, as it certainly seems to be a pretty common occurrence. But is that the whole story?

What jumped out at me as I pondered this question was a concept known as the toupee fallacy. This seems to be one of those questions where the facts we’re not seeing might be as important as the ones we are seeing. I’m concerned that there’s some silent evidence at play here, and we may be missing a few things. Namely, we’re not seeing how often people think they’ve fallen in love at first sight, only to be quickly disappointed. Whatever this feeling or moment is, it only gets counted if it works. Here, let me illustrate:

It all looks so easy, doesn’t it?

So pretty much everyone we meet falls in one of those 4 boxes. When we talk about love at first sight though, we often only talk about it in the context of those two red boxes, i.e. people who wind up together. What we can’t forget about is that blue box there…those we meet and feel an instant attraction to that never pans out. Here’s the same information put another way:

Possible Stalker/Type 1 error is my new band name.

Now what we’re generally going for in life is either the box in red or the box in black. In stats terms the red ones are true positives (falling in love with someone who loves you) and the black are true negatives (not falling in love with someone who doesn’t love you). The other two boxes are actually what we’d call Type 1 errors and Type 2 errors….i.e., the chance that we make the wrong call initially. If we presume the null hypothesis is that most people don’t love us³, we can call the box in blue our Type 1 error and the box in green our Type 2 error. In love, we almost always prefer Type 2 errors….in other words, we want to find out we loved someone when we didn’t realize it rather than fall in love with someone who doesn’t like us.

But what influences the number of people who fall in each box? Well, for that we have to take a look at the words that make up both of our conditions.

Let’s start with “end up together”.

During the discussion that prompted this question, John and I were specifically chatting about people who end up married. Now, in 2015, this may not be a great metric to go by. Many people who are in love do not get married, date or cohabitate for much longer than past generations, or otherwise define their loving relationships differently. The point is not to cover every possible scenario, but rather to remind people that the more narrowly you define “end up together” the less likely it is to happen from a strictly statistical point of view. For example, in the numbers I gave in the beginning of the post, 40% of men said they had fallen in love at first sight…but these were men participating in a survey for singles on a dating website. Of course some of those men could have been widowers, but the rest of them either ended the relationship that started with love at first sight, or had it ended for them. Does this count? Some will say yes, others will say no. Your standards will influence how many people are covered by that first row.

Now, “didn’t end up together” seems more straightforward, but it actually can also cover a range of scenarios. I made a joke in my table about someone who falls in love with someone who doesn’t love them being a stalker, but that’s not the whole story. Most of us have met someone who we thought was awesome….for 5 seconds until they opened their mouths. Or until part way through the first date. Or two weeks later when you saw their massive teddy bear collection. You get the picture. The point is, not ending up with someone can mean a whole lot of things from “they were taken” to “we decided we were better as friends”. How broadly you define this will also determine how often people fall in this category.

Love At First Sight (LAFS)

Alright, let’s move on to love at first sight. How are we defining this and how often is it happening? Well, this one can get interesting. LAFS is one of those things people tend to define by saying things like “if you have to ask, it didn’t happen”. You know it when you see it. This makes it ripe for hijinks and chicanery, which I’ll get into in a minute. In its most basic sense though, everyone seems to agree it’s some sort of overwhelming feeling of attraction bordering on feeling magnetically pulled towards a person. How broadly you define this, and how often you think this has happened to you already, are going to affect the number of people in that box.

So now that we’ve got some definitions, let’s put some generally fictitious numbers in those boxes. Let’s say you’ve met about 1000 people in the generally correct age/gender/orientation that you’re attracted to. You’ve dated about 20 of them, and twice you think you felt something that could have been LAFS. One of those worked out, one didn’t. Turning that into percentages, we get these numbers:

There’s a Taylor Swift song somewhere in here.

So unfortunately, the chances are kind of small. You can run the numbers for your own life, but my guess is it will be pretty small there too.
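If you do want to run the numbers for your own life, here is a minimal Python sketch of the four boxes. It uses the fictitious figures from this post, and it makes one loud assumption of mine: the roughly 20 people you dated are the ones counted as “ended up together”. Swap in your own definition and counts.

```python
# Fictitious numbers from the post; adjust for your own life.
people_met = 1000
ended_up_together = 20  # ASSUMPTION: everyone you dated counts as "together"
lafs_worked_out = 1     # felt LAFS, ended up together
lafs_fizzled = 1        # felt LAFS, did not end up together

counts = {
    ("LAFS", "together"): lafs_worked_out,
    ("LAFS", "not together"): lafs_fizzled,
    ("no LAFS", "together"): ended_up_together - lafs_worked_out,
    ("no LAFS", "not together"): people_met - ended_up_together - lafs_fizzled,
}

# Print each box as a count and as a share of everyone you met.
for (feeling, outcome), n in counts.items():
    print(f"{feeling:8} / {outcome:13}: {n:4d} ({n / people_met:.1%})")
```

Under that assumption, the LAFS-and-together box holds 1 person out of 1000, and nearly everyone lands in the no LAFS/not together box.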

But John and Paul promised me! You said they were on to something!

Okay, you got me. So what’s going on here?

Well, the answer is really that we don’t often think of this in the terms I put above. We’re not evaluating our own lives and our own chances, we’re trying to go off of other people’s experiences. No one asks people what happened when they didn’t find love, we ask them what happened when they did. In stats this is a huge difference: we just went from a regular probability, like the one I calculated above, to a conditional probability. Basically, it’s the difference between these two equations:

P here is “probability” and the rest is about how I’m totally not bitter.

That first equation gives us a 0.1% chance, and the second one gives us a 5% chance, using the numbers above. That’s 50 times higher!
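Here is a small Python sketch of the two probabilities, using the fictitious numbers from earlier in the post. It assumes (my reading, not something the post spells out) that the roughly 20 people you dated are the “ended up together” group that the conditional probability is restricted to.

```python
from fractions import Fraction

people_met = 1000
ended_up_together = 20  # ASSUMPTION: everyone you dated counts as "together"
lafs_and_together = 1   # felt LAFS AND ended up together

# Regular (joint) probability: out of everyone you met.
p_lafs_and_together = Fraction(lafs_and_together, people_met)

# Conditional probability: only among the people you ended up with.
p_lafs_given_together = Fraction(lafs_and_together, ended_up_together)

ratio = p_lafs_given_together / p_lafs_and_together

print(f"P(LAFS and together) = {float(p_lafs_and_together):.1%}")
print(f"P(LAFS | together)   = {float(p_lafs_given_together):.0%}")
print(f"The conditional version is {ratio} times higher")
```

The only thing that changed between the two lines is the denominator: 1 in 1000 people met versus 1 in 20 people you ended up with.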

And this is assuming everyone’s being honest about who they’re putting in what box. Spoiler alert: they’re not.

Most of this isn’t intentional though. It’s just that as humans, we don’t tend to remember all the details of good events. In fact, our memories of bad events are much stronger and typically more detailed. So when two people fall in love and things work out, they will likely not really remember the moments of doubt or insecurity that may have actually been present in the beginning of their relationship. They will retell the story more amusingly and more positively than the actual events may have warranted. This is so prevalent in fact that it is actually considered a hallmark of a healthy relationship. We can infer then that by only talking to people in happy relationships, we may actually be overestimating how many people met and “just knew”⁴. That’s why research on this is so sparse….the data confounds itself.

Ugh, well that’s not great news.

No, and it gets worse. When John and Paul claimed that this happened all the time, they were likely right….but that won’t help you. For example, let’s say that 1 out of 1000 people every year are likely to experience un-exaggerated, for real, LAFS with someone they stay with. That’s about 25,000 people a year in the USA. That’s 67 a day. You will almost certainly know some of these people….but they may not ever be you. Bummer.

Got any more good news?

Well yeah, actually, I do! See, the thing is, LAFS may not even be the ideal here. There’s actually some interesting evidence that people who date for longer stay married longer⁵. Apparently it’s long engagements that threaten marital stability, not long dating periods. So while those in the LAFS/stay together box may get a lot of attention, the ones in the no LAFS/stay together box may be quietly outdoing them. Also, when finding true love, most people really are more interested in the exponential distribution than the Poisson distribution⁶. In other words, we’re not so concerned about the number of events, but rather how long we have to wait for the first one! Once you find the one, you probably won’t care so much how it happens, and evidence suggests that you and your beloved will keep altering your story bit by bit until it’s worthy of its own movie with the attractive Hollywood folks of your choice. You’ll get there. May your W = time to first event be short, and your moment generating function be beautiful.

1. Weird fact I learned about this song while researching this post: the first line was originally “what would you do if I sang out of tune, would you throw ripe tomatoes at me?” but Ringo made them change it when he realized their rabid fans might take it seriously.
2. Baby don’t hurt me.
3. Okay, emo kid.
4. If you ever want to see this in action, find a friend who you knew pre and post divorce. If you know the story of how they met their ex, it’s really interesting to ask them again after their divorce. It is almost guaranteed the story will have changed, gotten briefer, or otherwise been a bit altered. Do NOT point this out to them. Don’t ask me how I know this.
5. Some of this data is kinda old…marriage and dating practices have changed rapidly over the last few decades. Caveat emptor.
6. Our love is anything but a normal distribution!

graph paper diaries

because some of us need a few more lines to keep everything straight

Predictions and Accuracy: Some Technical Terms

No One Asked Me: Love at First Sight
