What I Wish I Was Reading: December 2017

With guests at the house, a sick kiddo and snow in the forecast, I have had no time to read this new paper on how regional temperature affects population agreeableness. I will be doing so soon, however, because as someone who’s heard a lot about how unfriendly Boston is, I’d like some validation for my go-to “we’re rude because we’re cold” excuse.

Funny story: when my out-of-town guests picked up their (4-wheel-drive) rental car, the lady behind the counter mocked them and said “Expecting some snow or something?” When they got to my house and we confirmed that there was actually snow in the forecast, they wondered why she was so condescending about it. We explained that for Bostonians, a forecast of 4-6 inches over 20 hours isn’t really “snow”. They informed me that in Seattle, they’d be calling out the National Guard.

Also, my sister-in-law (married to my teacher/farmer brother) has informed me her new parenting slogan is “There’s no such thing as bad weather, only bad clothes”, which she apparently got from this book of the same name. I like this theory. It goes nicely with my adulthood slogan of “There’s no such thing as strong coffee, only weak people.”

I hope to have a review of the paper up on Wednesday this week, so stay tuned.

The Assistant Village Idiot linked to this article (via Lelia) about those with no visual memory. I’ve been pondering this, as I’m pretty sure my visual memory has some gaps. I can’t read facial expressions well at baseline, and one of my recurring stress nightmares is being handed documents/books that I recognize but can’t decipher the text of. I feel like something’s related here, but I’ll have to reread the article before I comment further.

Also, I know I always chide people to read behind the headline, but this headline’s so good I’m pretty sure I’ll love it when I finally get to read it: 5 Sport Science Studies that “Failed”. The author specifically took note of studies he saw that asked interesting questions and got negative results, and he wanted to write about them to push back on the impression that the only interesting findings are positive findings.

Work Hours and Denominators

I was talking to a few folks about work recently, and an interesting issue came up that I’d never particularly thought about before. I’ve mentioned in the past the power of denominators to radically change what we’re looking at, how averages might be lying to you, and how often people misreport “hours worked”….but I don’t know that I’d ever put all 3 together.

When answering the question “how many hours do you work per day”, most full-time workers name a number between 8 and 10 hours. Taken literally though, the answer is really somewhere between 6 and 7 hours on average, since most people aren’t working weekends: 9 hours a day times 5 days, spread over 7 days, is about 6.4 hours per day. So basically, when asked “average hours per day” we assume “average hours per working day” and answer accordingly.

This is a minor thing, but it’s actually part of why the official “hours worked” numbers and the self-reported “hours worked” numbers don’t always add up. When researchers try to figure out how much the average American is working, they take things like vacation and holiday weeks into account. The rest of us don’t tend to do that, and instead report how much we worked during our full weeks. A slight shift in denominator, but it can end up shifting the results by a decent amount.
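To make the denominator shift concrete, here’s a quick back-of-the-envelope sketch (the 9-hour day and 3 weeks of vacation are just assumptions for illustration):

```python
# Back-of-the-envelope: the same work, three different denominators.
hours_per_workday = 9      # assumed typical self-report
workdays_per_week = 5
vacation_weeks = 3         # assumed vacation + holidays

# Denominator 1: working days only (what we instinctively report)
per_working_day = hours_per_workday                                   # 9.0

# Denominator 2: all 7 days of the week
per_calendar_day = hours_per_workday * workdays_per_week / 7          # ~6.4

# Denominator 3: every day of the year, vacation included
hours_per_year = hours_per_workday * workdays_per_week * (52 - vacation_weeks)
per_day_all_year = hours_per_year / 365                               # ~6.0

print(round(per_working_day, 1), round(per_calendar_day, 1), round(per_day_all_year, 1))
```

Same worker, three honest answers, depending entirely on the denominator.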

Not the most groundbreaking insight ever, but I always get a little interested when I realize we’ve all been assuming a denominator that’s not explicitly clarified.

Cornell Food and Brand Lab: An Update

After I mentioned the embattled Brian Wansink and the Cornell Food and Brand Lab in my post last week, a friend passed along the most recent update on the story. Interestingly, it appears Buzzfeed is the news outlet doing the most reporting on this story as it continues to develop.

A quick review:
Brian Wansink is the head of the Cornell Food and Brand Lab, which publishes all sorts of interesting research about the psychology behind eating and how we process information on health. Even if you’ve never heard his name, you may have heard about his work….studies like “people eat more soup if their bowl refills so they don’t know how much they’ve eaten”  or “kids eat more vegetables when they have fun names” tend to be from his lab.

About a year ago, he published a blog post in which he praised one of his grad students for taking a data set that didn’t really show much and turning it into 5 publishable papers. This turned into an enormous scandal, as many people quickly pointed out that a feat like that almost certainly involved lots of data tricks that would make the papers’ results very likely to be false. As the scrutiny went up, things got worse, as people were now poring over his previous work.

Not only did this throw Wansink’s work into question, but a lot of people (myself included) who had cited his work in their own now had to figure out whether or not to retract or update what they had written. Ugh.

So where are we now?
Well, as I mentioned, Buzzfeed has been making sure this doesn’t drop. In September, they reported that the aforementioned “veggie with fun names” study had a lot of problems. Worse yet, Wansink couldn’t produce the data when asked. What was incredibly concerning was that this particular paper is part of a program Wansink was piloting for school lunches. With his work under scrutiny, over $20 million in research and training grants may have gone toward strategies that aren’t actually effective. To be clear, the “fun veggie name” study wasn’t the only part of this program, but it’s definitely not encouraging to find out that parts of it are so shaky.

To make things even worse, they are now reporting that several of his papers, allegedly covering three different topics in three different years with surveys sent to three different sample populations, show the exact same number of survey respondents: 770. Those papers are being reviewed.

Finally, they report he has a 4th paper being retracted, this one on WWII veterans and cooking habits. An interview with the researcher who helped highlight the issues with the paper is up here at Retraction Watch, and some of the problems with the paper are pretty amazing. When asked where he first noted problems, he said: “First there is the claim that only 80% of people who saw heavy, repeated combat during WW2 were male.” Yeah, that seems a little off. Wansink has responded to the Buzzfeed report to say that this was due to a spreadsheet error.

Overall, the implications of this are going to be felt for a while. While only 4 papers have been retracted so far, Buzzfeed reports that 8 more have planned corrections, and over 50 are being looked at. With such a prolific lab and results that are used in so many places, this story could go on for years. I appreciate the journalists keeping up on this story as it’s an incredibly important cautionary tale for members of the scientific community and the public alike.

Food Insecurity: A Semester in Review

I mentioned a few months back that I was working on the capstone project for my degree this semester. I’ve mostly finished it up (just adjusting some formatting), so I thought it would be a good moment to post about my project and some of my findings. Since I have to present this all in a week or two, it’s also a good chance to gather my thoughts.

Background:
The American Time Use Survey is a national survey carried out by the Bureau of Labor Statistics that asks Americans how they spend their time. From 2014 to 2016 it included a module that asked specifically about health status and behaviors. They make the questionnaire and data files publicly available here.

What interested me about this data set is that it asked specifically about food insecurity….i.e. “Which of the following statements best describes the amount of food eaten in your household in the last 30 days – enough food to eat, sometimes not enough to eat, or often not enough to eat?” Based on that question, I was able to compare those who were food secure (those who said they had “enough food to eat”) vs the food insecure (those who said they “sometimes” or “often” did not have enough to eat).
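(If you’re curious how that split works mechanically, here’s a minimal pandas sketch. The file name and column names — `food_status`, `bmi` — are placeholders I made up, not the actual ATUS variable names:)

```python
import pandas as pd

# Placeholder file and column names, not the real ATUS module variables.
df = pd.read_csv("atus_eh_module.csv")

# Collapse the three survey answers into food secure vs food insecure
df["food_insecure"] = df["food_status"].isin(
    ["sometimes not enough to eat", "often not enough to eat"]
)

# Then any outcome can be compared across the two groups, e.g. BMI
print(df.groupby("food_insecure")["bmi"].describe())
```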

This is an interesting comparison to make, because there’s some evidence that in the US these two groups don’t always look like what you’d expect. Previous work has found that people who report being food insecure actually tend to weigh more than those who are food secure. I broke my research down into three categories:

  1. Confirmation of BMI differences
  2. Comparison of health habits between food secure and food insecure people
  3. Correlation of specific behaviors with BMI within the food insecure group

Here’s what I found:

Confirmation of BMI differences:
Yes, the paradox holds in this data set. Those who were “sometimes” or “often” food insecure were almost 2 BMI points heavier than those who were food secure…around 10-15 pounds for most height ranges. Level of food insecurity didn’t seem to matter, and the effect persisted even after controlling for public assistance and income.

Interestingly, my professor asked me if the BMI difference was due more to food insecure people being shorter (indicating a possible nutritional deficiency) or from being heavier, and it turns out it’s both. The food insecure group was about an inch shorter and 8 lbs heavier than the food secure group.
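If you want to sanity-check that pounds conversion yourself: BMI is weight in kilograms divided by height in meters squared, so a fixed 2-point BMI gap translates into more pounds the taller you are. A quick sketch:

```python
# Convert a 2-point BMI gap into pounds at a few heights.
# BMI = kg / m^2, so delta_kg = delta_BMI * height_m^2
KG_TO_LB = 2.20462

for height_m in (1.60, 1.70, 1.80):
    delta_lb = 2 * height_m**2 * KG_TO_LB
    print(f"{height_m:.2f} m: {delta_lb:.0f} lb")
# -> roughly 11, 13, and 14 lb, i.e. the "10-15 pounds" range above
```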

Differences in health behaviors or status:
Given my sample size (over 20,000), most of the questions they asked ended up having statistically significant differences. The ones that seemed to be both practically and statistically significant were:

  1. Health status: People who were food insecure were WAY more likely to say they were in poor health. This isn’t terribly surprising, since disability would affect both people’s assessment of their health status and their ability to work/earn a living.
  2. Shopping habits: While most people from both groups did their grocery shopping at grocery stores, food insecure people were more likely to use other options like “supercenters” (i.e. Walmart or Target), convenience stores, or “other” types of stores. Food secure people were more likely to use places like Costco or Sam’s Club. Unsurprisingly, people who were food insecure were much more likely to say they selected their stores based on prices. My brother had asked specifically up front if “food deserts” were an issue, so I did note that the two groups cited “location” as a factor in their shopping at equal rates.
  3. Soda consumption: Food insecure people were much more likely to have drunk soda in the last 7 days (50% vs 38%) and much less likely to say it was a diet soda (21.5% vs 40%) than the food secure group.
  4. Exercise: Food insecure people were much less likely to have exercised in the last 7 days (50.5%) than food secure people were (63.9%). Given the health status rankings, this doesn’t seem surprising.
  5. Food shopping/preparation: Food insecure people were much more likely to be the primary food shopper and preparer. This makes sense when you consider that food insecurity is a self-reported metric: if you’re the one looking at the bills, you’re probably more likely to feel insecure than if you’re not. Other researchers have noted that many food stamp recipients will also cut their own intake to make sure their children have enough food.

Yes, I have confidence intervals for all of these, but I’m sparing you.
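(For a flavor of what those intervals look like, here’s the standard two-proportion calculation applied to the soda gap. The group sizes below are invented placeholders, not my actual sample split:)

```python
from math import sqrt

# 95% CI for the difference in "drank soda in the last 7 days" rates.
# Group sizes are invented for illustration only.
p1, n1 = 0.50, 2000    # food insecure
p2, n2 = 0.38, 18000   # food secure

diff = p1 - p2
se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
low, high = diff - 1.96 * se, diff + 1.96 * se
print(f"difference = {diff:.2f}, 95% CI ({low:.3f}, {high:.3f})")
```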

BMI correlation within the food insecure group:
Taking just the group that said they were food insecure, I then took a look at which factors were most associated with higher BMIs. These were:

  1. Time spent eating: Interestingly, increased time spent eating was actually associated with lower BMIs. This may indicate that people who can plan regular meal times are healthier than those eating while doing other things (the survey asked about both).
  2. Drinking beverages other than water: Those who regularly drank beverages other than water were heavier than those who didn’t.
  3. Lack of exercise: No shock here.
  4. Poor health: The worse the self-assessed health, the higher the BMI. It’s hard to tease out the correlation/causation here. Are people in bad health due to an obesity-related illness (like diabetes), or are they obese because they have an issue that makes it hard for them to move (like a back injury)? Regardless, this correlation was QUITE strong: people in “excellent” health had BMIs almost 5 points lower than those in “poor” health.
  5. Being the primary shopper: I’m not clear on why this association exists, but primary shoppers were 2 BMI points heavier than those who shared shopping duties.
  6. Public assistance: Those who were food insecure AND received public assistance were heavier than those who were just food insecure.

It should be noted that I did nothing to establish causality here; everything reported is just an association. Additionally, it’s interesting to note a few things that didn’t show up: fast food consumption, shopping location, and snacking all made little difference.

While none of this is definitive, I thought it was an interesting exploration into the topic. I have something like 30 pages of this stuff, so I can definitely clarify anything I didn’t go into. Now to put my presentation together and be done with this!


Eating Season

Happy almost Thanksgiving! Please enjoy this bit of trivia I recently stumbled on about American food consumption patterns during this time of year! It’s from the book “Devoured: From Chicken Wings to Kale Smoothies – How What We Eat Defines Who We Are” by Sophie Egan.

From page 173:

A few paragraphs later, she goes a bit more in depth about what happens to shopping habits (note: she quotes the embattled Cornell Food and Brand Lab, but since their data matches another group’s data on this, I’m guessing it’s pretty solid):

I had no idea that “eating season” had gone so far outside the bounds of what I think of as the holiday season. Kinda makes you wonder if this is all just being driven by winter and the holidays are just an excuse.

On a related note, my capstone project is done and accepted with no edits, and I will probably be putting up some highlights from my research into food insecurity and health habits on Sunday.

Happy Thanksgiving!

Grade Prediction by Subject

I saw an interesting study this week that plays into two different topics I’ve talked about here: self-reporting bias and the Dunning-Kruger effect.

The study was “Examining the accuracy of students’ self-reported academic grades from a correlational and a discrepancy perspective: Evidence from a longitudinal study“, and it took a look at how accurate students’ self-reported grades were. This is not the first time someone has looked at this, but it did add two key things to the mix: non-US test scoring and different academic subjects over different years of school. The students surveyed were Swiss, and they were asked to report their most recent grade in 4 different subjects. This was then compared to their actual most recent grade. The results were pretty interesting (UR = under-report, OR = over-report, T1-T3 are years of school):

Unsurprisingly, kids were much more likely to over-report than under-report. Since most of the differences amounted to adding a half point or so (out of 6), one wonders if this is just a tendency to round up in our own favor. Interestingly, a huge majority of kids were actually quite honest about their grades….about 70% for most years. The authors also noted that the younger kids were more likely to be honest than the older kids.

I think this is a really interesting example of how self-reporting biases can play out. It’s easy to think of bias as something big and overwhelming, but studies like this suggest that most bias is small for any given individual. A rounding error here, an accidental report of your grade from last semester there….those are tiny for each person but can add up over a group. I suspect if we looked at the older students who reported their grades as inaccurately high, we would discover that they had gotten high grades in previous years. There does seem to be a bias towards reporting your high water mark rather than your current status….kinda like the jock who continues to claim they can run a 5-minute mile long after they cease to be able to do so.
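A tiny simulation shows how that plays out in aggregate — here each simulated student never inflates by more than half a point, and most don’t inflate at all, yet the group average still drifts upward (all parameters invented for illustration):

```python
import math
import random

# Small, individually harmless round-up bias, aggregated over a group.
# Swiss grades run 1-6 in half-point steps; all parameters are invented.
random.seed(0)
N = 10_000
true_grades = [random.uniform(3.5, 5.5) for _ in range(N)]

def report(grade, p_round_up=0.3):
    """Report the nearest half point, except sometimes round UP instead."""
    if random.random() < p_round_up:
        return min(6.0, math.ceil(grade * 2) / 2)  # round up to next half point
    return round(grade * 2) / 2                    # honest nearest half point

reported = [report(g) for g in true_grades]
bias = sum(reported) / N - sum(true_grades) / N
print(f"average inflation: {bias:+.2f} grade points")
```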

The phenomenon is pretty well known, but it’s always interesting to see the hard numbers.

Millennials and Communism

I was perusing Twitter this past weekend when I started to see some concerning headlines float by.

Survey: 1 in 2 millennials would rather live in a socialist or communist country than capitalist one

Millenials think socialism would make a great safe space

Nearly 1 In 5 Millennials Consider Joseph Stalin And Kim Jong Un ‘Heroes’

While I could see a survey of young people turning up the socialism result, that last headline really concerned me. At first I thought it was just a case of “don’t just read the headline“, but all the articles seemed to confirm the initial statistic. AOL said “a lot of them see Joseph Stalin and Kim Jong Un as ‘heroes.’” Fox News hit on my discomfort when they said “The report also found that one in five Americans in their 20s consider former Soviet dictator Joseph Stalin a hero, despite his genocide of Ukrainians and Orthodox priests. Over a quarter of millennials polled also thought the same for Vladimir Lenin and Kim Jong Un.”

Seriously?

While I know polls frequently grab headlines by playing on people’s political ignorance, this seemed to go a step beyond that. I had trouble wrapping my head around the idea that anyone in the US could list Stalin, Lenin, or Kim Jong Un as a hero, let alone 20-25%. I had to go see what question prompted such an odd set of results.

The overview of the poll results is here, and sure enough, the question that led to those results is worded a little differently than the articles suggest. Here’s the screenshot from the report; blue underlines/boxes are mine:

I think the “hero for their country” part is key. That asks people to assess not just their own feelings, but what they know about the feelings of a whole other country.

Interestingly, I decided to look up Kim Jong Un’s in-country approval rating, and some defectors put it as high as 50%. According to one poll, 38% of Russians consider Josef Stalin to be the “most outstanding person” in world history. You could certainly debate whether those polls had problems in wording, sample, or other methodology, but the idea that a 25-year-old in the US might see a headline like that and conclude that Russians really did like Stalin doesn’t seem outside the realm of possibility. Indeed, further down the report we find out that only 6% of millennials in the US state that they personally have a favorable view of Stalin. That’s lizard people territory, folks.

In this case, it appears the polling company was actually pretty responsible about how they reported things, so it’s disappointing that further reports dropped the “in their country” piece. In my ongoing quest to name different biases and weird ways of skewing data, I’m now wondering what to name this one. What do you call it when someone asks a poll question in a way that encompasses a variety of scenarios, then the later reports shorten the question to make it sound like a different question was answered? I’m gonna work on this.

Daylight Saving (is not the worst of evils)

Well hi there! At this point on Sunday, I’m going to assume you’ve remembered that your clock should have been set back last night. With the advent of cell phones and auto-updates, I suspect the incidence of “showing up to church an hour early because no one remembered daylight saving time” has dropped precipitously since I was a kid.

Growing up, daylight saving time was always the subject of some debate in my house. My dad is a daylight saving time defender, and takes a lot of joy in pointing out to people that no matter how irritated you are by the time change, not changing the time would be even more annoying.

To support his point, I found this site that someone posted on Facebook rather interesting. It’s by a cartographer, and it lets you see the impact of daylight saving time on the different regions of the country. It also lets you monkey around with different schemes….eliminate daylight saving vs impose it permanently vs keep the status quo….and see what impact each would have on sunrise/sunset times. (Note: he created it in 2015, so some numbers may not reflect the 2017 time changes.)

My Dad’s point was always that daylight saving blunts the extremes, so I tried out a few different schemes to see how often they made the sunrise very early vs very late. For example, here’s how many days the sun would rise before 5am in different regions if we keep things status quo vs eliminate daylight saving vs always use it:

If you go to the actual website and hover, you can get the exact number of days those colors represent. If we did away with daylight saving, my region of the country would have over 120 days of pre-5am sunrises. I’m an early riser, but that seems a little much even for me.

Here’s how it would affect post-8pm sunsets:

So basically my Dad was right. If you want lots of early sunrises, push to abolish daylight saving. I think most people sort of know that’s what the time change thing is all about, but it is interesting to see exactly how many early sunrises we’re talking about. When you consider that the sky starts to lighten half an hour before sunrise, you realize that getting rid of daylight saving is signing yourself up for a LOT of early morning sunshine.
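(If you’d rather count for yourself than squint at a map, here’s a sketch of the same exercise using the astral library. Boston’s coordinates, the year, and the 5am/8pm cutoffs are all my own choices:)

```python
from datetime import date, time, timedelta
from zoneinfo import ZoneInfo

from astral import LocationInfo
from astral.sun import sun

# Count pre-5am sunrises and post-8pm sunsets under three clock schemes.
boston = LocationInfo("Boston", "USA", "America/New_York", 42.36, -71.06)

schemes = {
    "status quo": ZoneInfo("America/New_York"),  # clock changes twice a year
    "abolish DST": ZoneInfo("Etc/GMT+5"),        # standard time all year
    "permanent DST": ZoneInfo("Etc/GMT+4"),      # daylight time all year
}

for name, tz in schemes.items():
    early = late = 0
    day = date(2017, 1, 1)
    while day.year == 2017:
        s = sun(boston.observer, date=day, tzinfo=tz)
        early += s["sunrise"].time() < time(5, 0)
        late += s["sunset"].time() > time(20, 0)
        day += timedelta(days=1)
    print(f"{name}: {early} pre-5am sunrises, {late} post-8pm sunsets")
```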

I think the real PR problem here is that the time changes happen so far away from the extremes that people forget that it’s really designed to help mitigate situations that would occur several months later. I think there’s a new bias name in here somewhere.

Probability, Don’t You Mess With Me

In honor of Halloween, please enjoy the only stats-based/horror 80s music video I know of:


From 3-2-1 Contact, one of the more formative shows of my childhood.

The Weight of Evidence

I’ve been thinking a lot about the law and evidence this week, for 3 reasons:

First, this article my lawyer father sent me about the Supreme Court’s aversion to math. It reviews a gerrymandering case I’ve mentioned before, and the attempts of statisticians/computer guys to convince the court that their mathematical models are worth using. While the case hasn’t been decided yet, some researchers were fairly annoyed at how reflexively some of the justices dismissed the models presented, and at their invocation of the “gobbledygook” doctrine.

Second was this article I stumbled on about an effort to fact-check Supreme Court decisions, which found that a rather alarming number of them contain factual errors. This one was concerning for two reasons: some of the errors actually appeared to be related to the ultimate decision, and some appeared to have come from the Justices doing their own research.

Finally, this article about yet another evidence scandal in my state. Apparently our state lab has been systematically withholding evidence of failed breathalyzer calibrations, calling into question hundreds of DUI convictions. This is not an aberration…for those of you not from around here, Massachusetts has been on a bad run with our state crime/forensics lab. This is our 3rd major scandal in the past few years, and we now have the dubious distinction of being cited in every report about the problems with forensics.

This got me thinking about a few things:

  1. The line between gobbledygook and “good idea, needs work” is often familiarity. In reading some of the Supreme Court’s skepticism of mathematical models and contrasting it with the general acceptance of forensics despite serious concerns, it’s hard not to think that this has something to do with familiarity. Forensics is a science that was quite literally built to support the criminal justice system, whereas computer modeling was built to support….well, all sorts of things. I suspect that’s why one gets more scrutiny than the other.
  2. Mathematical models have to simplify, and/or those who build them have to prioritize explaining them to people who are not on their side. The new wave of mathematical models is intriguing, exciting, and a little bit frightening all at once. Complexity is necessary at times, but it can ultimately be used to hide assumptions and get your way. The justices on the Supreme Court know this, and their first suspicion is going to be that all that math is just there to hide something. Anyone hoping to build a model that affects policy should probably keep in mind that for everyone they impress, they will make someone else suspicious. As with any argument, trying it out on someone not inclined to agree with you will teach you a lot about where the holes might be.
  3. Lawyers need to learn more about statistics. This one has been the subject of many long talks with my Dad. Unless they were required to take it for their undergrad degree, many lawyers can get through their entire higher-ed career without touching a stats class. I’ve mentioned before that doctors struggle with the concepts of false positives, false negatives, and base rates, and it seems clear many people in law enforcement do as well. With so much of the evidence they’re seeing now requiring some knowledge of probability, this seems like a real gap.
  4. The Supreme Court needs a fact checker. Seriously. Are you really telling me there’s not one clerk out there who would be willing to just read through the decisions and find citations for each stat? Or better yet, someone who’d read through each briefing filed with the court and error-check it before it got to the Justices? In the case the article cited, the stat in question wasn’t a particularly controversial one (the % of workplaces that drug test employees), but the answer provided (88%) apparently had no source at all. I feel like of all groups, the Supreme Court should have figured out how to get this stuff screened before it biases them.

I am thinking there’s a presentation in here somewhere. If you have any more good articles, send them my way!