Bitterness and Psychopathy

I’m having some insomnia problems at the moment, so it was about 4am today when I turned on my coffee maker and sat down to do some internet perusing. I was just taking my first sip when I stumbled upon this article titled “People Who Take Their Coffee Black Have Psychopathic Tendencies”.

Oh. Huh.

As a fairly dedicated black coffee drinker, I had to take a look.  The article references a study here that tested the hypothesis that an affinity for bitter flavors might be associated with the “Dark Tetrad” traits: Machiavellianism, psychopathy, everyday sadism, and narcissism.

I read through the study1, and I thought it was a good time to talk about effect sizes. First, let’s cover a few basics:

  1. This was a study done through Mechanical Turk
  2. People took personality tests and rated how much they liked different foods; the researchers then ran some regressions and reported the correlations for these results
  3. They did some other interesting stuff to make sure people really liked the bitter versions of the foods they were rating and that their results were valid

Alright, so what did they find? Well, there was a correlation between preference for bitter tastes and some of the “Dark Tetrad” scores, especially everyday sadism2. The researchers pretty much did what they set out to do, and they found statistically significant correlations.

So what’s my issue?

My issue is we need to talk about effect sizes, especially as this research gets repeated. The correlation between mean bitter taste preference and the “Dark Tetrad” scores over the two studies ranged from .14 to .20.  Now that’s a significant finding in terms of the hypothesis, but if you’re trying to figure out if a black coffee drinker you love might be a psychopath3? Not so useful.

See, an r of .14 translates into an R² of about .02. Put in stats terms, that means that 2% of the variation in psychopathy score can be explained by4 variation in the preference for bitter foods or beverages. The other 98% is based on things outside the scope of this study. For r = .2, that goes up to 4% explained, 96% unexplained.
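
To make that arithmetic concrete, here’s a quick Python sketch. It’s just the r-to-R² conversion described above, nothing study-specific:

```python
# Convert a correlation coefficient r to R^2, the share of variance explained.
for r in (0.14, 0.20):
    r_squared = r ** 2
    print(f"r = {r:.2f} -> R^2 = {r_squared:.4f} "
          f"({r_squared:.0%} explained, {1 - r_squared:.0%} unexplained)")
# r = 0.14 -> R^2 = 0.0196 (2% explained, 98% unexplained)
# r = 0.20 -> R^2 = 0.0400 (4% explained, 96% unexplained)
```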

Additionally, it should be made clear that no single bitter taste was associated with these traits, only the overall score across ALL bitter foods was.  So if you like your coffee black, but have an issue with tonic water or celery, you’re fine.

The researchers didn’t include the full list of foods, but I was surprised to note that they included beer as one of the bitter options. Especially when looking at antisocial tendencies, it seems potentially confounding to include a highly mood-altering beverage alongside foods like grapefruit. I’d be interested in seeing the numbers rerun with beer excluded.


1. And no, I didn’t add cream to my coffee. Fear me.
2. It’s worth noting, however, that the mean score for this trait was lower than for any other trait…1.77 out of 5. It’s plausible that only the bottom of the range was tested.
3. Hi honey!
4. In the mathematical sense, that is; this does not prove causation by itself

Popular Opinion

A few years ago, there was a brief moment in the NFL where all anyone could talk about was Tim Tebow.  Tebow was a controversial figure who I didn’t have much of an opinion on, but he sparked a comment from Chuck Klosterman (I think) that changed the way I think about political discussions.  I’ve never been able to track the exact quote down, but it was something like “half the country loves him, half the country hates him, but both sides think they’re an oppressed minority.” Now I don’t know if that was really true with Tebow, but I think about it every time someone says “what no one is talking about…” or “the conventional wisdom is…” or even just a basic “most people think….” I always get curious about how we know this.  It’s not unusual for me to hear someone I know and love assert that the media never talks about something I think the media constantly talks about. It’s a perplexing issue.

Anyway, that interest is why I was super excited by this Washington Post puzzle that showed how easily our opinions about what others think can be skewed, even if we’re not engaging in selection bias.  It also illustrates two things well: 1) why the opinion of well-known people can be important and 2) why a well-known person advocating for something does not automatically mean that issue is “settled”.

Good things to consider the next time I find myself claiming that “no one realizes” or “everyone thinks that”.

Guns and Graphs

One of my favorite stats-esque topics is graphs. Specifically, how we misrepresent with graphs, or how we can present data better.  This week’s gun control debate provided a lot of good examples of how we present these things….starting with this article at Slate, “States With Tighter Gun Control Laws Have Fewer Gun Deaths”.  It came with this graph:

Gun graph 1

Now my first thoughts when looking at this graph were two-fold:

  1. FANTASTIC use of color
  2. That’s one heck of a correlation

Now because of point #2, I looked closer. I was sort of surprised to see that the correlation was almost a perfect -1….the line went almost straight from (0,50) to (50,0).  But that didn’t make much sense….why were both axes using the same set of numbers? That’s when I looked at the labels and realized they were both ranks, not absolute numbers. Now for gun laws, this makes sense. You can’t count the number of laws due to variability in their scope, so you have to use some sort of ranking system. The gun control grade (the color) also gives a nice overview of which states are equivalent to each other. Not bad.

For gun deaths on the other hand, this is a little annoying. We actually do have a good metric for that: deaths per 100,000.  This would help us maintain the sense of proportion as well.  I decided to grab the original data here to see if the curve changed when using the absolute numbers.  I found those here.   This is what I came up with:

Gun graph 2

Now we see a more gradual slope, and a correlation of around -.8 or so (Edited to add: I should be clear that because we are dealing with ordinal data for the ranking, a correlation is not really valid…I was just describing what would visually jump out at you.). We also get a better sense of the range and proportion.  I didn’t include the state labels, in large part because I’m not sure if I’m using the same year of data the original group was.1
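
If you want to see the rank effect for yourself, here’s a minimal Python sketch. To be clear, the numbers below are invented for illustration (the real data is linked above); the point is just that correlating rank against rank tends to look cleaner than correlating rank against the underlying rate:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical death rates per 100,000 for 50 "states" (skewed, like real rates)
death_rate = rng.gamma(shape=2.0, scale=5.0, size=50)
# Hypothetical law-strength rank, loosely (and inversely) tied to the death rate
law_rank = stats.rankdata(-death_rate + rng.normal(0, 5, size=50))
death_rank = stats.rankdata(death_rate)

print("Pearson, law rank vs death rate:", round(stats.pearsonr(law_rank, death_rate)[0], 2))
print("Pearson, law rank vs death rank:", round(stats.pearsonr(law_rank, death_rank)[0], 2))
# The second number (which is just Spearman's rho) will generally sit closer
# to -1: ranking both axes throws away the spread that the raw rates preserve.
print("Spearman (rank-based):          ", round(stats.spearmanr(law_rank, death_rate)[0], 2))
```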

The really big issue here, though, is that this graph with its wonderful correlation reflects gun deaths, not gun homicides….and of course the whole reason we are currently having this debate is because of gun homicides. I’m not the only one who noticed this; Eugene Volokh wrote about it at the Washington Post as well. I almost canned this post, but then I realized I didn’t particularly like his graph either. No disrespect to Prof. Volokh; it’s really mostly that I don’t understand what the Brady Campaign means when it gives states a negative rating.  So I decided to plot both sets of data on the same graph and see what happened.  I got the data on just gun homicides here.

Gun graph 3

That’s a pretty big difference.  Now I think there’s some good discussion to have around what accounts for this difference – suicides and accidents – and whether that’s something to take into account when reviewing gun legislation, but Volokh most certainly handles that discussion better than I do.  I’m just a numbers guy.


1. I also noticed that Slate flipped the order the Law Center to Prevent Gun Violence had originally used, so if you look at the source data you will see a difference. The original rankings had 1 as the strongest gun laws and 50 as the weakest. However, Slate flipped every state rank consistently, so no meaning was lost. I think it made the graph easier to read.

Buzzfeed or Research Study?

The Telegraph has a report on a new study that attempts to divide people into 4 different types of drinkers, based on how alcohol affects them.  The four types are:

  1. Hemingway
  2. The Nutty Professor
  3. Mary Poppins
  4. Mr Hyde

My first thought was “this sounds like a Buzzfeed quiz”.  So I went looking, and found that yes, Buzzfeed has actually done this quiz.  Oddly, the Buzzfeed version has way more boring names for their classifications.  OTOH, they probably used more interesting gifs…though to be fair I haven’t seen the study questionnaire to verify.

When I went to actually read the study, I realized that the authors kicked it off by citing Buzzfeed-esque clickbait headlines.  So basically, a study inspired by Buzzfeed headlines ends up sounding like a Buzzfeed headline, and the research version was more creative than the Buzzfeed version. Whoa.

Rock, Paper, Crayon

Recently in a conversation with a friend of mine, I mentioned a paper I had read that asserted that we are more attracted to potential partners who look like us1.  I couldn’t remember all of the details, so a googling I went.  Before I found the actual paper, I found an article about it that contained the following sentence:

For the study, photographs of real-life couples were also studied and analyzed by the researchers with at least one child, to determine if these actually influenced partner choices.

There’s a sentence that leaves you with an amusing mental image.  Did the researchers have to have at least one child?  Did a child help with the analysis? Who is this child and why are they involved so specifically?

When I found the paper, I confirmed that in fact it was the couples being analyzed who had at least one child….but I still like my mental image of a kid with a crayon, marking up all the photos with his observations on face structure.

1. Because it would annoy the crap out of me to read a reference like that with no link to the original paper, here you go.

Cool kids and linguistic pragmatism

Yesterday a facebook friend of mine put up an angry post regarding misuse of the word “decimate”.  His chief complaint was that people used it as a synonym for destroy, when really it meant a reduction of 10% or so.  That cleared up the “deci” part of the word for me, but I was surprised that the proper definition was so narrow….so of course I went to dictionary.com to check his facts.

Turns out the “one in ten” definition is specifically marked as obsolete.  The current accepted definition is merely “to destroy a great number of”.  So basically it can’t be used to sub in for obliterate, but the 10% definition was only valid through the year 1600 or so.  Sigh.

I’m not a big fan of people who try to get too cute when picking on the language of others.  While I certainly am irritated by some of the more obvious errors in language (irregardless makes me cringe, and please don’t mix up “less” and “fewer” in my presence), I dislike when people go back several hundred verbal years and then attempt to claim that’s the “proper” way of doing things.  This annoys me enough that my brother bought me this book a few years ago, just to help me out.  I believe language will always be morphing to a certain extent, and while rules are good, we just need to accept that all language is pretty much arbitrary.  Thus, I refer to myself as a linguistic pragmatist.  Adhere to the rules, but accept that sometimes society just moves on.

Why am I bringing this up?  Well, after going through that internal rant, I found it very interesting that this study is being reported with the headline “Popular kids who tortured you in high school are now rich”.

Basically, researchers assessed how popular kids were in high school, based on how many people gave them “friendship nominations”, and found that those in the top 20% made 10% more money 40 years later than those in the bottom 20%.

Now I think this makes a certain amount of sense.  While the “outcast nerd makes good” story is appealing, it stands to reason that many of the least popular kids in high school might be unpopular because of real issues with social skills that hurt them later in life (to note, social skill impairment is comorbid with all sorts of things that could make this worse….ADHD, depression, etc). Conversely of course, those with more friends probably have skills that help them maintain networks later.  Basically, I think this study tells us that the number of friends you have in high school isn’t totally random.

My issue with the reporting/reading of this study is in the semantics.  I think there’s a disconnect between our common interpretation of “popular in high school” and the actual definition of “popular in high school”.  The researchers in this study weren’t assessing the kids other kids aspired to be; they were assessing the kids who actually had lots of friends and were well liked.  While the classic football player who beats up kids in the locker room may get referred to as a popular kid, it’s likely he would not have had many people naming him as a friend on a survey.  So basically, the study had a built-in control for those kids who were temporarily at the top of the social ladder but lacked actual getting-along-with-people skills.  I had an incredibly small high school class (under 30) and I could name several kids who fell in the “perceived popular” category but not the “actually popular” category.

All this to come back to my original point.  Words mean different things depending on context, and this should always be taken into account when assessing research and reading subsequent reports.  It’s not bad data, just a different set of definitions.

Elections and small sample sizes

XKCD hits the nail on the head yet again with a great commentary on election year “no one has ever _____ and won the White House” musings.

These drive me nuts because obviously we have an incredibly small sample size.  Our country may have been around for quite some time now, but we’ve only had 44 presidents.  Think about how few people that really is.

Additionally, states change, demographics change, and the electoral college system is ridiculous.  This gives rise to all sorts of statistical “anomalies” that really are quite probable when you think of how few events we’re looking at.
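
Here’s some toy arithmetic on how cheap those “firsts” are. The 10% trait prevalence and the 500 candidate traits below are numbers I made up for illustration; the point is just that a small sample plus a big pile of arbitrary patterns produces “no one has ever” streaks for free:

```python
# Chance that "no president has ever had trait X" holds purely by accident
p_trait = 0.10        # hypothetical prevalence of some arbitrary trait
n_presidents = 44
p_never = (1 - p_trait) ** n_presidents
print(f"P(no president has the trait): {p_never:.3f}")   # ~0.010

# Pundits effectively scan many arbitrary traits looking for streaks
n_traits_examined = 500   # hypothetical
print(f"Expected 'no one has ever' findings: {n_traits_examined * p_never:.1f}")  # ~4.8
```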

The sports world does this too, baseball probably more than the rest of them.  While watching the post season this year with my long-suffering Orioles fan husband, we got quite a kick out of pointing out how specific some of the stats they brought up were.  “He’s 1 for 3 when facing Sabathia during the post season over the last 3 years.”  A handful of at bats and we’re supposed to draw some sort of conclusion from this?  Sigh.
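
For fun, here’s what a 1-for-3 line is actually worth: an exact (Clopper-Pearson) 95% confidence interval for a “true” batting average based on one hit in three at bats. This is standard binomial math, nothing specific to any real player:

```python
from scipy.stats import beta

hits, at_bats = 1, 3
alpha = 0.05  # 95% confidence

# Clopper-Pearson exact interval for a binomial proportion
lower = beta.ppf(alpha / 2, hits, at_bats - hits + 1)
upper = beta.ppf(1 - alpha / 2, hits + 1, at_bats - hits)
print(f"Observed average: {hits / at_bats:.3f}")   # 0.333
print(f"95% CI: ({lower:.3f}, {upper:.3f})")       # roughly (0.008, 0.906)
# An interval spanning "historically awful" to "better than anyone ever"
# is another way of saying three at bats tell you almost nothing.
```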

Anyway, here’s the comic.  Happy Thursday.

Lance Armstrong and False Positives

Well the talk went well.

I’m waiting for the official rating (people fill out anonymous evals), but there seemed to be a lot of interest….and more importantly I got quite a few compliments on the unique approach.  Giving people something new in the “how to get along” genre was my goal, so I was pleased.

Between that and having 48 hours to pull together another abstract for submission to a transplant conference, posting got slow.

It was interesting though….the project I was writing the abstract for was about a new test we introduced that saved patients over an hour of waiting time IF it came out above a certain level.  We had hours of discussion about where that cutoff should be, ultimately deciding that we had to minimize false positives (times when the test said they passed but a better test said they failed) at the cost of driving up false negatives (times when the test said they failed, but they really hadn’t).  We have to perform the more accurate test regardless, so it was a choice between having a patient wait unnecessarily, or having them start an expensive, uncomfortable procedure unnecessarily.  Ethically and reasonably, we decided most patients would rather find out they’d waited when they didn’t have to than that they’d gotten an entirely unnecessary procedure.
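
For anyone who wants to see that tradeoff in action, here’s a minimal sketch with made-up numbers. The distributions below are invented, not from our actual test; the point is just how moving the cutoff trades false positives for false negatives:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical quick-test values: patients who would truly fail vs truly pass
true_fail = rng.normal(40, 10, 10_000)
true_pass = rng.normal(70, 10, 10_000)

for cutoff in (50, 60, 70):
    # Above the cutoff, the quick test says "pass" and the patient skips the wait
    false_pos = np.mean(true_fail > cutoff)   # said pass, better test says fail
    false_neg = np.mean(true_pass <= cutoff)  # said fail, better test says pass
    print(f"cutoff {cutoff}: FP rate {false_pos:.1%}, FN rate {false_neg:.1%}")
# Raising the cutoff drives the FP rate toward zero at the cost of more FNs:
# fewer unnecessary procedures, more unnecessary waiting.
```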

I bring all this up both to excuse my absence and to say I was fascinated by Kaiser Fung’s take on Lance Armstrong.  He goes in depth about anti-doping tests, hammering on the point that testing agencies will accept high false negatives to minimize false positives.  It would ruin their credibility to falsely accuse someone, so we have to presume many, many dopers test clean at various points in time.  It follows, then, that clean tests mean fairly little, while other evidence means quite a lot.
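
To put rough numbers on that point, here’s a quick Bayes’ rule sketch. The sensitivity, specificity, and base rate below are all assumptions I invented for illustration, not real anti-doping figures:

```python
# How much should a clean test move your belief that an athlete dopes?
p_doper = 0.30       # hypothetical prior: fraction of the field doping
sensitivity = 0.40   # hypothetical P(test positive | doper), tuned low...
specificity = 0.999  # ...so that false accusations stay near zero

p_clean_if_doper = 1 - sensitivity   # the high false negative rate
p_clean = p_doper * p_clean_if_doper + (1 - p_doper) * specificity
p_doper_if_clean = p_doper * p_clean_if_doper / p_clean
print(f"P(doper | clean test) = {p_doper_if_clean:.2f}")   # ~0.20
# Under these assumptions a clean test only drops 30% to about 20%:
# "never failed a test" is far weaker evidence than it sounds.
```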

I thought that was an interesting point, one I had certainly not heard covered.

Also, as any Orioles fan (or someone who lives with one) would know, I have good reason to want Raul Ibanez tested right now.

More posts this week than last, I promise.

Pacifiers and baby boys

I’m a bit behind on this one, but this study was too interesting to pass up.

Apparently, research suggests that pacifier use by boys limits their social development.

So we’ll start with the bias alert.  I have a baby boy, and he does use a pacifier to help him go to sleep.  I didn’t have any particular feelings about this; I just gave it a whirl and liked the way it helped him calm down when he was tired.  Give it 5 minutes, and he tends to spit it out and go to sleep.  That seemed rational to me. I actually was unaware there was much controversy about this until I got to reading this article (reiterating Dubbahdee’s point that I should never read parenting advice on the internet….oops).

Obviously, I don’t yet know what his social development is going to turn out like (though at the moment he’s astoundingly unsympathetic to my lack of sleep), but I generally hope it’s okay.   End bias alert.

It took me a while to find the actual paper (why oh why do so many news sources not link to the actual paper????), but after scanning the whole thing I had a couple thoughts.

The headlines about this paper were stupid, of course.  The author actually had a pretty good theory based on actual science (babies learn emotions in part through mimicry, and she wondered if a pacifier would make this harder for babies because their facial muscles were occupied), and of course it got overreported. Most headlines just mentioned “pacifier use” in general, but she clarifies pretty quickly that they only studied pacifier use during baby wake time….specifically excluding the type of pacifier use I described above (as a sleep aid).  This makes sense (the woman does have 3 boys herself, after all) because you don’t have to spend very long around babies before you realize they’re probably not learning much when they’re trying to fall asleep.  They’re mostly just crying.

Anyway, the setup for the study was pretty good.  They assessed 6- and 7-year-olds and their emotional reactions vs pacifier use, and then later college students who were questioned about their history of pacifier usage to tie it to adult development.

For that second group, I was curious about the length of pacifier use we were talking about, as this was based on the recollection of college students and their parents, and I was wondering how accurate that would be.  This graph sums it up nicely:

I’m not familiar with the emotional intelligence scale they’re using, so I’ll take their word for it that 4.7 vs 4.4 is statistically significant….but wow, daytime use of a pacifier until 5 years of age?  That does seem like it should cause some concern.  Also, it seems as though the recollection bias here would be clustered at either end.  Parents would remember more accurately either remarkably short or remarkably long pacifier use…but that’s just a guess.

Overall, I thought it was annoying that “daytime use of pacifiers until kindergarten” got labeled as just “pacifier use”, but I thought the research was certainly intriguing.  I especially liked that they tested both younger children and adults to help prove their theory, as emotional development is most definitely a complex process that takes decades to work through.

What I actually liked about this study the most was Ann Althouse’s take on it.  She wondered if this meant you could stop overly emotional women from being overly emotional by giving them Botox so they couldn’t mimic those around them.  I’d say it’s worth a shot.