Stand Back! I’m going to try SCIENCE!

Today I discovered that my favorite webcomic (xkcd.com) actually has a special comic up if you check it from my employer’s server.  Turns out the artist’s wife is a patient, doing well, and he wanted to show some love.  This post is thus titled for this shirt, which would make an awesome Christmas present for me, even in April.

Anyway, this weekend I saw this story with the headline “Study: Conservatives’ Trust In Science At Record Low”.

My first thought on seeing this was that “science” is a loaded word.  I mean, I’m as much a science geek as anyone.  Math’s my favorite, but science will always be a close second.  But do I trust science? I’m not sure.  Something really bothered me about that question, but I couldn’t quite put my finger on it until I read this post on the study from First Things today.

My love of science makes me a skeptic.  It makes me question relentlessly and then continuously revisit to figure out what got left out.  I don’t trust science because not trusting your assumptions is science done right.  If we could all trust our assumptions, what would we need science for?  This is the problem with vague questions and loaded words.  Much like the discussion in the comments section of this post where several commenters weighed in on the word “delegate” in relation to household tasks, it’s clear that people will interpret the phrase “trust science” in many different ways.

Some might say it means the scientific method, scientists, science as a career, science’s role in the world, or something else not springing to mind.  Given the vagueness of the question though, I would have a hard time actually calling anyone’s interpretation wrong.  Mine is based on my own bias, but I would wager everyone’s is.  So isn’t this survey more about how we’re defining a phrase than about anything else?

I thought my annoyance was going to end there, I really did.

Then I looked at the graph with the story, and had no choice but to get annoyed all over again.

That’s what I get for just reading headlines.

So over the course of this survey, moderates have consistently trusted science less than conservatives for all but four data points?  Why didn’t this get mentioned?  I found the original study and took a quick look for the breakdown: 34% self-identified as conservative, 39% as moderate, and 27% as liberal.  So 73% of the population (the conservatives plus the moderates) has shown a significant drop-off in “trust of science” and yet they’re somehow portrayed as the outliers?  Science and technology have changed almost unimaginably since 1974, and yet liberals’ opinions about all that haven’t changed*?  Does that strike anyone else as the more salient feature here?

*Technically this may not be true.  I don’t know what the self-identified proportions were in 1974, so it could be a self-identification shift.  Still.  This might be that media bias everyone’s always talking about.

Arguments and Discussions…learning the rules

I was struck by something that commenter Erin mentioned in response to my post about data that I hate.   She ended her comments with this:

I teach this stuff to my AP students…I love trying to get them to understand how to break apart political rhetoric and other arguments around them. I figure even if we disagree wildly in politics or social issues, at least I’ll have an intelligent opponent to argue with someday. 

I like that, because I fully endorse that approach to life.  That’s part of why I wanted to do a blog like this.  Quite some time ago, the Assistant Village Idiot put up a post I liked very much (and can’t find now…circa 2007?) about how far too many people approached their political opinions like defense lawyers….never giving an inch, never admitting that anything they had said or cited could be wrong or skewed.  This makes lots of people defend really stupid things.

In my office, this flowchart hangs just to the right of my computer:

I often have fantasies of taking it down during debates and serenely handing it to the other person whilst telling them to try again.  Sadly, I have never done this.  The fantasy keeps me going some days though, doubly so in political debates.

Though I’m probably preaching to the choir here, I feel the need to state for the record:  Just because something you cited is wrong does not mean you are wrong.  You can keep your belief while also admitting that something that agrees with you is a load of crap.  That actually makes you a better person, not a worse one.  This is not an April Fools’ joke, people actually can operate like this.

Say What?

I can’t figure out if this is the worst statistic I’ve read this week, or just the most poorly phrased good statistic.  I’m leaning towards the first one:

“….more than half of students earning bachelor’s degrees at public colleges – 56 percent – are graduating with $22,000 of debt, on average.”  –Nancy Zimpher on CNN.com

If I’m reading this correctly, they tossed out everyone at private colleges, then anyone who didn’t graduate, then (most disturbingly) 44% of those who were left?  Why did they get the boot?  I mean, if we’re just tossing out arbitrary numbers of students, can’t we get any average we want?
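
To see why that matters, here’s a minimal sketch with made-up numbers.  If the 44% who got dropped were the debt-free graduates, the “average debt” nearly doubles just by excluding them:

```python
# Hypothetical class of 100 public-college graduates (numbers invented):
# 56 carry the reported $22,000 average, 44 graduate debt-free.
debts = [22_000] * 56 + [0] * 44

avg_all = sum(debts) / len(debts)                # averaged over every graduate
borrowers = [d for d in debts if d > 0]
avg_borrowers = sum(borrowers) / len(borrowers)  # averaged over the 56% only

print(f"All graduates:  ${avg_all:,.0f}")        # $12,320
print(f"Borrowers only: ${avg_borrowers:,.0f}")  # $22,000
```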

Nancy Zimpher, what are you up to??? 

Why most nutritional research is useless

Nutrition research is big money these days.  Our national obsession with weight loss is at a fever pitch, and any new or interesting research is sure to make headlines.

Here are some basic guidelines on what to look for in nutritional research (any study, not just this one):
  1. Was the data self-reported? Even CNN brought this up in their article.  People, especially those embarrassed about their weight, don’t accurately assess what they eat.  My mother, skinny little thing that she is, could eat one peppermint patty and tell you she’d had a serving of chocolate.  I don’t think I’d even count it until I had 3 or so.
  2. How much was “more”? I actually can’t find this for this study.  Is it the difference between 1 and 2 servings per week?  Or the difference between 1 and 5?  Both would produce statistically significant correlations, but the practical outcome would be different.  In 2005, researchers made news by saying that eating more fruits and veggies did not, in fact, prevent cancer.  The cancer-treating establishment (which I work in, btw) promptly responded by pointing out that they compared people who ate half a fruit per day to those who ate 1-2 fruits per day.  It was all reported in grams too, so the data look extra impressive: “Those eating less than 114 grams showed no difference from those eating 367 grams”.  The link gives more examples, but 250 grams is one medium apple, so the whole gap between those groups works out to about one apple a day.  Watch out for this.
  3. Who classified people as “normal weight” or “overweight”?  If this was also self-reported (and in this study, it looks like there were clinic visits), then look out.  There’s a great study I can’t find right now that shows that women tend to lie about weight, and men tend to lie about height.  Both lies will screw up the BMI calculation (the most common metric for assessing “normal”); there’s a small worked example of this after the list.
  4. Were the overweight people actively (or even somewhat) trying to modify their diets to lose weight?  A few years ago, I heard about the study that suggested diet soda was linked to obesity.  I remember my first reaction was “are we sure they’re not all just on diets?”.  This seemed like a classic correlation/causation issue.  All the analysis seemed to presume they were overweight because they drank diet soda.  I wondered why they never seemed to look at the idea that they could be drinking diet soda because they were overweight.  That’s one of the first swaps most people I know make when they try to lose weight.
  5. Don’t even get me started if it’s a population study.  That’s a big topic for another time, but let’s just say they’re really, really tricky.
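
To put a number on point 3: BMI is just weight in kilograms divided by height in meters squared, with 25 as the usual cutoff for “overweight”, so even small misreports can move someone across the line.  A minimal sketch, with invented people and numbers:

```python
# BMI = weight (kg) / height (m) squared; 25 and up reads as "overweight".
def bmi(weight_kg: float, height_m: float) -> float:
    return weight_kg / height_m ** 2

# A woman shaving 4 kg (~9 lbs) off her weight:
print(f"{bmi(72, 1.65):.1f}")  # actual:   26.4 -> "overweight"
print(f"{bmi(68, 1.65):.1f}")  # reported: 25.0 -> right at the line

# A man adding 5 cm (~2 in) to his height:
print(f"{bmi(82, 1.78):.1f}")  # actual:   25.9 -> "overweight"
print(f"{bmi(82, 1.83):.1f}")  # reported: 24.5 -> "normal"
```
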
If you ever want a fabulous crash course in how nutrition research can be skewed, pick up two diet books that contradict each other, and read through their parts on research.  Take something like Atkins (high protein, low carb) and Joel Fuhrman (nearly vegan), and watch them rip to shreds the research the other one builds their whole case on.  
He may have his own controversy, but this is why I like Michael Pollan.  The book I linked to has a great crash course in why most nutritional research just sees what it wants to.  He refused to take a strict nutritional stance and instead condensed it down to a few “rules” that he gleaned from quizzing nutritionists on “what they could say for sure”.  The answer? Eat real food, not too much, mostly plants.  

Blog Rules

Thanks to some links from the kind people at Assistant Village Idiot  and Maxed Out Mama, I have gotten a bit more traffic than I expected in the past two days.  As such, I realized it might be a good moment to spell out some of the rules for this blog I’ve had bouncing around in my head.  These are rules for me really, not for commenters, as no one can hope to tame the internet:

  1. I will try my best to provide a link for every study I cite, and this link will get as close to source data as possible.  Nothing drives me crazier than reading about “new research” with absolutely no clue as to where to find it.  I spent almost 20 minutes trying to find where the heck Jack Cafferty got his numbers for this article, and it made me mad.  I won’t do that to you.  And here are the numbers he reported on, as a sign of good faith.
  2. I will attempt to remain non-partisan. I have political opinions.  Lots of them.  But really, I’m not here to try to go after one party or another.  They both fall victim to bad data, and lots of people do it outside of politics too.  Lots of smart people have political blogs, and I like reading them…I just don’t feel I’d be a good person to run one.  My point is not to change people’s minds about issues, but to at least trip the warning light that they may be supporting themselves with crap.  That being said, if I start to lean too far to one side, smack me back to center.
  3. I will admit that I will probably fail at #2, and have lots of other biases as well.  What, you thought I was going to claim to be neutral?  No special snowflake here, we humans can’t help ourselves.
  4. I will, when I can, declare those biases up front.  When I review a study on changing last names, I think it’s relevant that I didn’t change mine.  When mentioning healthcare reform, I think it’s relevant that I live in the one state in the nation that won’t be affected by it either way.  It makes it easier to judge where I’m coming from.
  5. I will attempt to explain all stats words that are used.  I am not a stats teacher, I am just someone who uses a lot of data to get a job done.  I would love to do more than just preach to the choir, and thus I will try not to have any prereqs for this class.  For the very smart commenters I have here, this may get tedious, but bear with me.
  6. I will try to improve my use of apostrophe’s.  I’m really not good at those.
  7. Suggestions always welcome.  The internet is awesome because I get to learn from smart people I normally wouldn’t meet.  

THE KIND OF DATA I ABSOLUTELY HATE

Hate’s a strong word.  I get that.  I also get that data and survey types are not always the sort of thing that inspires people to strong hatred, but here we are.

In this post I mentioned my annoyance at perception/prediction polls.  The one I referenced was based on women who didn’t change their last names and their level of marital commitment.  Commenter Assistant Village Idiot mentioned another example, which I also liked: “‘Do you think earthquakes are more likely now because of climate change?’ What we think has nothing to do with anything. The earthquakes will happen according to their own rules.”

In writing that post, however, I forgot to mention that same study included an even worse piece of data.  As a rebuttal to the “Midwestern college kids don’t think non-name-changing women are committed” finding, they included a remark that women who didn’t plan on changing their names didn’t feel less committed.


I HATE STATEMENTS LIKE THAT.

I would really love it if someone could tell me if there’s a proper name for this sort of thing, but I always think of it as “the embarrassing question debacle”.  Basically, researchers ask people questions with a potentially embarrassing answer, and then report it as meaningful when people do not answer embarrassingly.

There are only two types of people I have ever heard admit they went into their marriages less than completely committed:

  1. Those who have been married successfully for quite some time who are now comfortable in admitting they were totally naive when they walked down the aisle.
  2. Those who are already divorced and reflecting on what went wrong.
Level of commitment is best assessed in retrospect, and I look with great skepticism at anyone who says they can gauge it before the fact.  
Getting at the reasons people do things can be brutal.  Your only source for your data also has the biggest motivation to conceal it from you.  Some people are actually doing things for good reasons, some just want to look like they are, and some are lying to themselves.  Unless a study at least attempts to account for all 3 scenarios, I would hold all answers suspect.

It’s not the question, it’s how you ask it

Data gathering is a lot harder than most people imagine.  It’s an interesting exercise to take a study and, prior to reading it, ask yourself “how would I, if pressed, get the data they claim to have gotten?”.  It’s amazing how many fall apart quickly when you realize how bad the source data is.

I face this all the time at work.  The simplest questions (what is our demand for transplants?) can be a never-ending labyrinth of opinion, observation, anecdote, and data….all completely enmeshed.  I spend much of my day trying to untangle these strings, and I never underestimate how difficult getting a simple answer can be.

Factcheck.org ran a great piece today illustrating this challenge.  In a post titled “How Many Would Repeal Obamacare?”, they review 4 different surveys that all try to get to the same number: how many people think healthcare reform should be repealed?

It’s a great article that covers sampling practices, question phrasing, date of the poll, and history of the polling organization.

If you look at the numbers, it shows up pretty quickly that when given dichotomous choices (repeal/keep), people often look like they hold a strong opinion.  In the polls that offer more moderate answers (“it may need small modifications, but we should see how it works”), people trend towards that answer.

The phrasing was extremely intriguing though:
“Turning to the health care law passed last year, what is your opinion of the law?”
“If a Republican is elected president in this November’s election, would you strongly favor, favor, oppose, or strongly oppose him repealing the healthcare law when he takes office?”
“Do you think Congress should try to repeal the health care law, or should they let it stand?”

In one, the question focuses on personal opinion, in the next the focus is the presidency, in the third it’s Congress.  All of this for a law that most Americans have yet to feel the effects of in any practical way.

Of course this is not to say that a public opinion poll (or 4) makes one side right or wrong. If constitutionality or effectiveness are your concern, nothing here addresses either.  I am enjoying it immensely for the educational value though, and kind of wishing I was teaching a class so I could use this as an example.  Those of us in Massachusetts do have the luxury of sitting back and just sort of pondering all of this….as this has been our world for 7 years now.

That reminds me….were these samples controlled for that????

Correlation and Causation: the Housework Edition

After yesterday’s comic, I was hoping to find a good example of a news story where they equated correlation and causation.  In case you’re curious, it took me under 5 minutes.

Headline: Why Being Less of a Control Freak May Make You Happier

To start, let me just mention that correlation implies that two things are moving together….as one goes up, so does the other.  Alternatively, as one goes up, the other goes down, or vice versa.  Either way, their outcomes appear to be tied.

Causation, on the other hand, says that one thing is causing another.  What yesterday’s post was referring to is the often-made mistake of inferring that because two things are correlated, one must be causing the other.  This is not always true, and believing so may get you drawn as a stick figure.
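
If you want to see correlation as an actual number rather than a vibe, here’s a minimal sketch with toy data (statistics.correlation needs Python 3.10+):

```python
import statistics

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.0, 9.8]  # roughly 2*x: y rises as x rises

# Pearson's r runs from -1 (move in opposite directions) to 1 (move together).
# Here it comes out close to 1.0, but it says nothing about what causes what.
print(statistics.correlation(x, y))
```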

Anyway, the article above illustrates that point nicely.  The author set out to find out if being a control-freak mom made people unhappy….and lo and behold, it appears to.  55% of women who said they delegate to a partner or spouse at least once a week reported themselves as “very satisfied” with their life.  For those who did not delegate that often, the number was 43%.

Now, I’ll mostly skip the use of the word “delegate” in this article, though it does bother me.  My husband does plenty around the house, but we mostly just consider that “teamwork” not “delegating”.  I don’t start the week handing out tasks to him, and he doesn’t consider the work he does around the house a favor to me.  It’s just what needs to get done.

More important, however, is the article’s conclusion that delegating will make people happier.  While delegating and happiness are perhaps correlated, they are not necessarily causal.  It’s possible that the women who don’t delegate do so because their spouse is lazy, hostile, or generally not involved….all things which would also make them less happy overall.  It’s also possible that women who don’t delegate are controlling, martyrs, passive aggressive, etc., and that makes them unhappy too.

I had a great stats professor once who opened every class with this:

“If you get one thing out of this class, let it be this:

When X and Y are correlated, you have 3 possibilities:

  1. X is causing Y
  2. Y is causing X
  3. Something else is causing both X and Y “
Lack of delegating could cause unhappiness.
Unhappiness could cause people to stop delegating.
Something else entirely could cause people to not delegate and to be unhappy.
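
Possibility 3 is the sneaky one, so here’s a minimal simulation of it (invented data, nothing from the article’s survey, and the same Python 3.10+ caveat as above).  A hidden factor Z, think something like the spouse’s overall involvement, drives both X and Y, and X and Y end up strongly correlated even though neither one touches the other:

```python
import random
import statistics

random.seed(0)
z = [random.gauss(0, 1) for _ in range(1000)]  # the hidden cause, never measured
x = [zi + random.gauss(0, 0.5) for zi in z]    # X depends only on Z
y = [zi + random.gauss(0, 0.5) for zi in z]    # Y depends only on Z

# X never influences Y (or vice versa), yet r comes out around 0.8.
print(statistics.correlation(x, y))
```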

When in Doubt, Blame the Journalist

Within minutes of hitting “publish post” on my mission statement, I found an article that reminded me of one of my worst pet peeves when it comes to data/science/studies of all types.  The headline read  “Keeping Your Name? Midwesterners Are Judging You”.  My ears (eyes?) perked up at this headline, as I am among those women who declined to change her name post-nuptial.  Despite knowing that Jezebel is not often the best place for unbiased reporting, I gave it a read.  

The article linked to a much more nuanced article here, but the basics are as follows: students at a small midwestern college feel that women who don’t change their last names when they get married are less committed to their relationships than those who do.  This was interesting in part because the number of people who felt negatively about this quadrupled between 1990 and 2006.
For the personal reasons listed above, I find this interesting.  However, when you look at the numbers (2.7% of 256 and 10.1% of 246, which Jezebel did include) and do a little math, you realize that this “jump” is a difference of 18 people.
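
For anyone who wants to check that arithmetic:

```python
# Figures from the article: 2.7% of 256 students in 1990,
# 10.1% of 246 students in 2006.
before = round(0.027 * 256)  # about 7 people
after = round(0.101 * 246)   # about 25 people
print(after - before)        # the whole "quadrupling": 18 people
```
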
A few things to consider about this:
  1. I couldn’t find that this was published anywhere.  It seemed to be a sort of “FYI for the headlines”.
  2. Apparently there’s no data on whether or not this perception is true.  My bias would be that it’s not, but I couldn’t find data actually saying if the perception was correct.  This happens in many “perception” studies….they quote percentages who believe something with the implication that a certain belief is wrong without ever proving it.
  3. There wasn’t a gender breakdown of who those 18 people were.  If most were female, then isn’t their perception likely to be based on experience?  As in “well, if I didn’t do it, it would be because I wasn’t committed”?  That’s not judgement of others, that’s judgement of self.
  4. Have any of their professors (or TV shows, or other media sources) recently made disparaging remarks about this?  18 people who all very well might know each other (the university surveyed was under 1000 students) could easily be influenced in their answer  by even one strong source.
  5. As college students, presumably very few of those polled were actually married.  From my experience in college, I would conjecture that this is a phase of life during which people are very idealistic regarding their future mates without having many real experiences to back it up.  I put much more stock in how people who are actually married gauge commitment than in what someone who’s never walked down that aisle thinks.
All that being said, it looked like the study authors were careful to address several of these points (especially the “this is not a representative sample” point).  It was only in the translation that the more dubious conclusions were drawn.
Scientists have very little incentive to exaggerate the meaning of their findings.  They are in a profession where that could be very damaging.  Reporters for both old and new media have EVERY incentive to spin things into good headlines.  Remember that.

Mission Statement

Numbers never lie.  

Unlike people, who are constantly confused by their own biases and perspectives, numbers behave….if you know how to use them.  
This is what I do for work every day:
First, I get what management thinks is the problem.  Second, I talk to the people involved and find out what they think is the problem.  Third, I get to retreat into the numbers.  I spend time looking at what we’re doing, where we are, and where we’d need to be for everyone to be happy.  It’s the third part that’s my favorite.  No one argues, no political pressure, just puzzles, problems, and unexpected truths.
I use data every day to help improve health care, and I’ve been pretty successful at it so far.  As I look around though, I realize how few people really understand the importance of good data in our lives.  One need look no further than election year politics to see bad data, poor interpretations of good data, and blatant misuses that make me cringe.  In the healthcare realm, we don’t have this luxury.  I come from a world where you can’t take chances, where misrepresenting your stats can result in very real human suffering.
This is why improper uses of data drive me nuts.  Once you know what to look for, it’s hard to stop seeing it. It’s everywhere.  Thus, I am giving myself an ambitious goal.  It’s no longer enough for me to use good data science for my own purposes.  I want to educate others, and hone my own skills along the way.  I want people to know what research is, how to read it, and how to question it.  I want others to be as passionate as I am, and I want a place to vent about the reporting that annoys me.  
Stay tuned.