Work Hours and Denominators

I was talking to a few folks about work recently, and an interesting issue came up that I’d never particularly thought about before. I’ve mentioned in the past the power of denominators to radically change what we’re looking at, how averages might be lying to you, and how often people misreport “hours worked”…..but I don’t know that I’d ever put all 3 together.

When answering the question “how many hours do you work per day”, most full-time workers generally name a number between 8 and 10 hours a day. Taken literally though, it occurred to me that the answer is really somewhere between 6 and 7 hours on average, since most people aren’t working on the weekends. So basically when asked “average hours per day” we assume “average hours per working day” and answer accordingly.

This is a minor thing, but it’s actually part of why the actual “hours worked” numbers and the reported “hours worked” numbers don’t always add up. When people try to figure out how much the average American is working, they take things like vacation and holiday weeks into account. The rest of us don’t tend to do that, and instead report how much we worked during full weeks. A slight shift in denominator, but it can end up shifting the results by a decent amount.
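To make that concrete, here’s a quick back-of-the-envelope sketch (the numbers are illustrative, not from any survey):

```python
hours_per_working_day = 9   # the kind of number people usually report
working_days_per_week = 5

# Denominator 1: working days only -> the familiar 8-10 range
print(hours_per_working_day)  # 9

# Denominator 2: all calendar days -> 9 * 5 / 7
print(round(hours_per_working_day * working_days_per_week / 7, 1))  # 6.4

# Denominator 3: all calendar days, minus 3 weeks of vacation/holidays
weeks_worked = 52 - 3
print(round(hours_per_working_day * working_days_per_week * weeks_worked / 365, 1))  # 6.0
```

Same person, same job, three different “hours per day”, depending entirely on which days you count.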

Not the most groundbreaking insight ever, but I always get a little interested when I realize we’ve all been assuming a denominator that’s not explicitly clarified.

I’m Thinking of a Word That Starts With a…..

I’ve mentioned before that I like to try to find unusual ways of teaching my 5-year-old son statistical concepts by relating them to things he likes. This pretty much doesn’t work, but this week I tried it again and attempted to use a discussion about letters to segue into a discussion about perception vs data. He’s getting into some reading fundamentals now, and is incredibly curious about which words start with which letters. This leads to our new favorite game, “Let’s talk about ____ words!”, where we name a letter and then just think of as many words as we can that start with that letter.

This game is fun, but he’s a little annoyed at letters that make more than one sound. This week he got particularly irritated at the letter “c”, which he felt was hogging all the words while leaving “k” and “s” with none. I started trying to explain to him that “s” in particular was doing pretty alright for itself, but after discussing “cereal” and “circus” he was pretty convinced that “s” was in trouble.

As I was defending the English language’s treatment of the letter “s”, I started to wonder what the most common first letter of words actually was. I also wondered if it was different for “kids words” vs “all words”. After some poking around on the internet, I discovered that there’s a decent amount of variation depending on what word list you go with. I decided to take a look at three lists:

  1. All unique words appearing more than 100,000 times in all books on Google ngrams (Note: I had to go to the original file here. The list provided on that site and the Wiki page actually gives the most common first letters across all word occurrences, not unique words. That’s why “t” is the most common….it counts every instance of “the” separately)
  2. The 1,000 most commonly used English language words (of Up-Goer 5 fame)
  3. The Dolch sight words list, used to teach kids to read

Comparing the percent of words starting with each letter on each list got me this graph:

As I suspected, “s” does quite well for itself across the board, though it really shines in the “core words” list. “K” on the other hand is definitely being left out. It’s interesting to see what letters do well in bigger word sets (like c, p and m), and which ones are only in the smaller sets (b, t, o and w). “W” seems very popular for early reading lists because of words like “what”, “where”, “why”. “S” actually is really interesting, as it appears to kick off lots of common-but-not-basic words. My guess is this is because of its participation in letter combinations like “sh” and “sch”.
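If you want to try this yourself, the counting step is pretty simple. A minimal sketch, assuming each word list lives in a plain text file with one word per line (the file names here are made up):

```python
from collections import Counter

def first_letter_percents(path):
    """Tally the first letter of each unique word in a one-word-per-line file."""
    with open(path) as f:
        words = {line.strip().lower() for line in f if line.strip()}
    counts = Counter(w[0] for w in words if w[0].isalpha())
    total = sum(counts.values())
    return {letter: 100 * n / total for letter, n in counts.items()}

# Hypothetical file names standing in for the three lists above
for name in ["ngram_words.txt", "upgoer_1000.txt", "dolch_sight_words.txt"]:
    pcts = first_letter_percents(name)
    top5 = sorted(pcts.items(), key=lambda kv: -kv[1])[:5]
    print(name, top5)
```

The de-duplication into a set is the step that matters: skip it and you end up counting every instance of “the” separately, which is exactly the ngrams trap mentioned above.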

Anyway, my son didn’t really seem to grasp the “the plural of anecdote is not data” lesson, so I pointed out to him that both “Spiderman” and “superhero” started with “S”. At that point he agreed that yes, lots of words started with “s”, and went back to feeling bad for “K”. At least we can agree on that.

Now please enjoy my favorite Sesame Street alphabet song ever: ABCs of the Swamp

Cornell Food and Brand Lab: an Update

After I mentioned the embattled Brian Wansink and the Cornell Food and Brand Lab in my post last week, a friend passed along the most recent update on this story. Interestingly, it appears Buzzfeed is the news outlet doing the most reporting on this story as it continues to develop.

A quick review:
Brian Wansink is the head of the Cornell Food and Brand Lab, which publishes all sorts of interesting research about the psychology behind eating and how we process information on health. Even if you’ve never heard his name, you may have heard about his work….studies like “people eat more soup if their bowl refills so they don’t know how much they’ve eaten”  or “kids eat more vegetables when they have fun names” tend to be from his lab.

About a year ago, he published a blog post where he praised one of his grad students for taking a data set that didn’t really show much and turning it into 5 publishable papers. This turned into an enormous scandal, as many people quickly pointed out that a feat like that almost certainly involved lots of data tricks that would make the papers’ results very likely to be false. As the scrutiny went up, things got worse: people were now poring over his previous work.

Not only did this throw Wansink’s work into question, but a lot of people (myself included) who had used his work in their own now had to figure out whether to retract or update what they had written. Ugh.

So where are we now?
Well, as I mentioned, Buzzfeed has been making sure this doesn’t drop. In September, they reported that the aforementioned “veggies with fun names” study had a lot of problems. Worse yet, Wansink couldn’t produce the data when asked. What was incredibly concerning is that this particular paper was part of a program Wansink was piloting for school lunches. With his work under scrutiny, over $20 million in research and training grants may have gone toward strategies that don’t actually work. To be clear, the “fun veggie name” study wasn’t the only part of this program, but it’s definitely not encouraging to find out that parts of it are so shaky.

To make things even worse, Buzzfeed is now reporting that several of his papers, allegedly covering three different topics in three different years with surveys sent to three different sample populations, show the exact same number of survey respondents: 770. Those papers are being reviewed.

Finally, they report he has a 4th paper being retracted, this one on WWII veterans and cooking habits. An interview with the researcher who helped highlight the issues with the paper is up here at Retraction Watch, and some of the problems with the paper are pretty amazing. When asked where he first noted problems, he said: “First there is the claim that only 80% of people who saw heavy, repeated combat during WW2 were male.” Yeah, that seems a little off. Wansink has responded to the Buzzfeed report to say that this was due to a spreadsheet error.

Overall, the implications of this are going to be felt for a while. While only 4 papers have been retracted so far, Buzzfeed reports that 8 more have planned corrections, and over 50 are being looked at. With such a prolific lab and results that are used in so many places, this story could go on for years. I appreciate the journalists keeping up on this story as it’s an incredibly important cautionary tale for members of the scientific community and the public alike.

Food Insecurity: A Semester in Review

I mentioned a few months back that I was working on my capstone project for my degree this semester. I’ve mostly finished it up (just adjusting some formatting), so I thought it would be a good time to post on my project and some of my findings. Since I have to present this all in a week or two, it’s a good moment to gather my thoughts as well.

Background:
The American Time Use Survey is a national survey, carried out by the Bureau of Labor Statistics, that asks Americans how they spend their time. From 2014-2016 they administered a survey module that asked specifically about health status and behaviors. They make the questionnaire and data files publicly available here.

What interested me about this data set is that they asked specifically about food insecurity….i.e. “Which of the following statements best describes the amount of food eaten in your household in the last 30 days – enough food to eat, sometimes not enough to eat, or often not enough to eat?” Based on that data, I was able to compare those who were food secure (those who said they had enough food to eat) vs the food insecure (those who said they “sometimes” or “frequently” did not have enough to eat).

This is an interesting comparison to make, because there’s some evidence that in the US these two groups don’t always look like what you’d expect. Previous work has found that people who report being food insecure actually tend to weigh more than those who are food secure. I broke my research down into three categories:

  1. Confirmation of BMI differences
  2. Comparison of health habits between food secure and food insecure people
  3. Correlation of specific behaviors with BMI within the food insecure group

Here’s what I found:

Confirmation of BMI differences:
Yes, the paradox is true for this data set. Those who were “sometimes” or “frequently” food insecure were almost 2 BMI points heavier than those who were food secure…around 10-15 pounds for most height ranges. Level of food insecurity didn’t seem to matter, and the effect persisted even after controlling for public assistance and income.

Interestingly, my professor asked me if the BMI difference was due more to food insecure people being shorter (indicating a possible nutritional deficiency) or from being heavier, and it turns out it’s both. The food insecure group was about an inch shorter and 8 lbs heavier than the food secure group.
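For the curious, “controlling for” here just means a regression with covariates. A minimal sketch of that kind of model, with hypothetical column and file names standing in for the actual ATUS variables:

```python
import pandas as pd
import statsmodels.formula.api as smf

# One row per respondent; these column names are hypothetical stand-ins:
#   bmi (numeric), food_insecure (0/1), public_assistance (0/1), income (numeric)
df = pd.read_csv("atus_health_module.csv")  # hypothetical file name

# Unadjusted gap: the coefficient on food_insecure is the raw BMI difference
raw = smf.ols("bmi ~ food_insecure", data=df).fit()
print(raw.params["food_insecure"])

# Adjusted gap: does the difference persist after controlling for
# public assistance and income?
adj = smf.ols("bmi ~ food_insecure + public_assistance + income", data=df).fit()
print(adj.params["food_insecure"], adj.pvalues["food_insecure"])
```

If the coefficient on food_insecure stays around 2 after the covariates go in, that’s the “effect persisted after controlling” result described above.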

Differences in health behaviors or status:
Given my sample size (over 20,000), most of the questions they asked ended up having statistically significant differences. The ones that seemed to be both practically and statistically significant were:

  1. Health status: People who were food insecure were WAY more likely to say they were in poor health. This isn’t terribly surprising, since disability would impact both people’s assessment of their health status and their ability to work and earn a living.
  2. Shopping habits: While most people in both groups did their grocery shopping at grocery stores, food insecure people were more likely to use other stores like “supercenters” (i.e. Walmart or Target), convenience stores, or “other” types of stores. Food secure people were more likely to use places like Costco or Sam’s Club. Unsurprisingly, people who were food insecure were much more likely to say they selected their stores based on price. My brother had asked up front whether “food deserts” were an issue, so I’ll note that the two groups cited “location” as a factor in their shopping at equal rates.
  3. Soda consumption: Food insecure people were much more likely to have drunk soda in the last 7 days (50% vs 38%) and much less likely to say it was a diet soda (21.5% vs 40%) than the food secure group.
  4. Exercise: Food insecure people were much less likely to have exercised in the last 7 days (50.5%) than food secure people were (63.9%). Given the health status rankings, this doesn’t seem surprising.
  5. Food shopping/preparation: Food insecure people were much more likely to be the primary food shopper and preparer. This makes sense when you consider that food insecurity is a self-reported metric: if you’re the one looking at the bills, you’re probably more likely to feel insecure than if you’re not. Other researchers have noted that many food stamp recipients will also cut their own intake to make sure their children have enough food.

Yes, I have confidence intervals for all of these, but I’m sparing you.
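(If you do want a sense of how those intervals are built, here’s a sketch of a standard normal-approximation interval for a difference in proportions, using the soda numbers above; the group sizes are placeholders, not the real ATUS counts:)

```python
import math

def diff_prop_ci(p1, n1, p2, n2, z=1.96):
    """95% Wald confidence interval for a difference in proportions."""
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff - z * se, diff + z * se

# Soda in the last 7 days: 50% of food insecure vs 38% of food secure.
# Group sizes below are made up for illustration.
low, high = diff_prop_ci(0.50, 2000, 0.38, 18000)
print(f"difference: 12 points, 95% CI ({low:.3f}, {high:.3f})")
```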

BMI correlation within the food insecure group:
Taking just the group that said they were food insecure, I then looked at which factors were most associated with higher BMIs. These were:

  1. Time spent eating: Interestingly, increased time spent eating was actually associated with lower BMIs. This may indicate that people who can plan regular meal times are healthier than those eating while doing other things (the survey asked about both).
  2. Drinking beverages other than water: Those who regularly drank beverages other than water were heavier than those who didn’t.
  3. Lack of exercise: No shock here.
  4. Poor health: The worse the self-assessed health, the higher the BMI. It’s hard to tease out correlation vs causation here: are people in bad health due to an obesity-related illness (like diabetes), or are they obese because they have an issue that makes it hard for them to move (like a back injury)? Regardless, this correlation was QUITE strong: people in “excellent” health had BMIs almost 5 points lower than those in “poor” health.
  5. Being the primary shopper: I’m not clear on why this association exists, but primary shoppers were 2 BMI points heavier than those who shared shopping duties.
  6. Public assistance: Those who were food insecure AND received public assistance were heavier than those who were just food insecure.

It should be noted that I did nothing to establish causality here; everything reported is just an association. Additionally, it’s interesting to note a few things that didn’t show up: fast food consumption, shopping locations, and snacking all didn’t make much of a difference.

While none of this is definitive, I thought it was an interesting exploration into the topic. I have like 30 pages of this stuff, so I can definitely clarify anything I didn’t go into. Now to put my presentation together and be done with this!


Eating Season

Happy almost Thanksgiving! Please enjoy this bit of trivia I recently stumbled on about American food consumption patterns during this time of year! It’s from the book “Devoured: From Chicken Wings to Kale Smoothies – How What We Eat Defines Who We Are” by Sophie Egan.

From page 173:

A few paragraphs later, she goes a bit more in depth about what happens to shopping habits (note: she quotes the embattled Cornell Food and Brand Lab, but since their data matches another group’s data on this, I’m guessing it’s pretty solid):

I had no idea that “eating season” had gone so far outside the bounds of what I think of as the holiday season. Kinda makes you wonder if this is all just being driven by winter and the holidays are just an excuse.

On a related note, my capstone project is done/accepted with no edits, and I will probably be putting up some highlights from my research into food insecurity and health habits on Sunday.

Happy Thanksgiving!

5 Interesting Things About IQ Self-Estimates

After my post last week about what goes wrong when students self-report their grades, the Assistant Village Idiot left a comment wondering about how this would look if we changed the topic to IQ. He wondered specifically about Quora, a question asking/answering website that has managed to spawn its own meta-genre of questions asking “why is this website so obsessed with IQ?“.

Unsurprisingly, there is no particular research done on specific websites and IQ self-reporting, but there is actually some interesting literature on people’s ability to estimate their own IQ and that of those around them. Most of this research comes from a British researcher at University College London, Adrian Furnham. Studying how well people actually know themselves kinda sounds like a dream job to me, so kudos to you, Adrian. Anyway, ready for the highlights?

  1. IQ self-estimates are iffy at best: One of the first things that surprised me about IQ self-estimates vs actual IQ was how weak the correlation was. One study found an r=.3, another r=.19 (see the sketch after this list for a feel for what a correlation that weak means). This data was gathered from people who first took a test, then were asked to estimate their results prior to actually getting them. In both cases, it appears that people are sort of on the right track, but not terrific at pinpointing how smart they are. One wonders if this is part of the reason for the IQ test obsession….we’re rightfully insecure about our ability to figure this out on our own.
  2. There’s a gender difference in predictions: Across cultures, men tend to rank their own IQ higher than women do, and both genders consistently rank their male relatives (fathers, grandfathers and sons) as smarter than their female relatives (mothers, grandmothers and daughters). This often gets reported as male hubris vs female humility (indeed, that’s the title of the paper), but I note they didn’t actually compare the estimates to measured results. Given that many of these studies are conducted on psych undergrad volunteers, is it possible that men are more likely to self-select when they know IQ will be measured? Some of these studies had average IQ guesses of 120 (for women) and 127 (for men)….that’s not even remotely an average group, and I’d caution against extrapolation.
  3. Education may be a confounding factor in how we assess others: One of the other interesting findings in the “rate your family member” game is that people rank previous generations as half a standard deviation less intelligent than they rank themselves. This could be due to the Flynn effect, but the other suggestion is that it’s hard to rank IQ accurately when educational achievement is discordant. Within a cohort, educational achievement is actually pretty strongly correlated with IQ, so re-calibrating for other generations could be tricky. In other words, if you got a master’s degree and your grandmother only graduated high school, you may think your IQs are further apart than they really are. Somewhat supporting this theory, as time has progressed, the gap between self rankings and grandparent rankings has closed. Interesting to think how this could also affect some of the gender effects seen in #2, particularly for prior generations.
  4. Being smart may not be the same as avoiding stupidity: One of the more interesting studies I read looked at the correlation between IQ self-reports and personality traits, and found that some traits made you more likely to think you had a high IQ. One of these traits was stability, which confused me because you don’t normally think of stable people as being overly high on themselves. When I thought about it for a bit though, I wondered if stable people were defining being “smart” as “not doing stupid things”. Given that many stupid actions are probably more highly correlated with impulsiveness (as opposed to low IQ), this could explain the difference. I don’t have proof, but I suspect a stable person A with an IQ of 115 will mostly do better than an unstable person B with an IQ of 115, and person A may attribute this difference to intelligence rather than impulse control. It’s an academic distinction more than a practical one, but it could be confusing things a bit.
  5. Disagreeableness is associated with higher IQs, and with self-perception of higher IQs: Here’s an interesting chicken-and-egg question for you: does having a high IQ make you more disagreeable, or does being disagreeable make you think you have a higher IQ? Alternative explanation: is some underlying factor driving both? It turns out that having a high IQ is associated with being disagreeable, and being disagreeable is associated with ranking your IQ as higher than others’. This probably colors some of the IQ discussions to a certain degree….the “here’s my high IQ now let’s talk about it” crowd probably really is not as agreeable as those who want to talk about sports or exchange recipes.
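To get a feel for how weak an r of .3 actually is, here’s a quick simulation (my own illustration, not from any of the papers):

```python
import math
import random

random.seed(1)
r = 0.30  # the self-estimate vs measured-IQ correlation from one study

# Simulate 100k standardized (actual, self-estimate) pairs with correlation r
actual = [random.gauss(0, 1) for _ in range(100_000)]
estimate = [r * a + math.sqrt(1 - r * r) * random.gauss(0, 1) for a in actual]

# Of the people whose self-estimate lands in the top 10%, how many actually
# score above the median?
cutoff = sorted(estimate)[int(0.9 * len(estimate))]
top_self_raters = [a for a, e in zip(actual, estimate) if e >= cutoff]
print(sum(a > 0 for a in top_self_raters) / len(top_self_raters))  # ~0.7
```

In other words, at r=.3, nearly a third of the people who put themselves in the top decile are actually below the median. “Sort of on the right track” is about right.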

So there you have it! My overall impression from reading this is that IQ is one of those things where people don’t appreciate or want to acknowledge small differences. In looking at some of the studies where people ranked their parents against each other, I was surprised how many pointed to a 15 point gap between parents, or a 10 point gap between siblings. Additionally, it’s interesting that we appear to have a pretty uneasy relationship with IQ tests in general. Women in the US, for example, are more likely to take IQ tests than men are, but less likely to trust their validity. To confuse things further, they are also more likely to believe the tests are useful in educational settings. Huh? I’d be interested to see a self-estimated IQ compared to an actual understanding of what IQ is and is not, and then compare both to an actual scored IQ test. That might flesh out where some of these conflicting feelings come from.

Grade Prediction by Subject

I saw an interesting study this week that plays into two different topics I’ve talked about here: self-reporting bias and the Dunning-Kruger effect.

The study was “Examining the accuracy of students’ self-reported academic grades from a correlational and a discrepancy perspective: Evidence from a longitudinal study“, and it took a look at how accurate students’ self-reported grades were. This is not the first time someone has looked at this, but it did add two key things to the mix: non-US test scoring, and different academic subjects over different years of school. The students surveyed were Swiss, and they were asked to report their most recent grade in 4 different subjects. This was then compared to their actual most recent grade. The results were pretty interesting (UR=under-report, OR=over-report, T1-T3 are years of school):

Unsurprisingly, kids were much more likely to over-report than under-report. Since most of the differences were adding a half point or so (out of 6), one wonders if this is just a tendency to round up in our own favor. Interestingly, a huge majority of kids were actually quite honest about their ability….about 70% for most years. The authors also noted that the younger kids were more likely to be honest than the older kids.

I think this is a really interesting example of how self-reporting biases can play out. It’s easy to think of bias as something big and overwhelming, but studies like this suggest that most bias is small for any given individual. A rounding error here, an accidental report of last semester’s grade there….those are tiny for each person but can add up over a group. I suspect if we looked at the older students who reported their grades as inaccurately high, we would discover that they had gotten high grades in previous years. There does seem to be a bias towards reporting your high water mark rather than your current status….kinda like the jock who keeps claiming they can run a 5-minute mile long after they cease to be able to do so.
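That “tiny for each person, visible in the group” effect is easy to simulate. A sketch with made-up parameters, loosely matching the ~70% honest reporting in the study:

```python
import random

random.seed(0)

# Grades on the Swiss 1-6 scale. Suppose each student reports honestly 70%
# of the time and rounds up by half a point the other 30% (ignoring the
# ceiling at 6 for simplicity).
actual = [random.uniform(3.0, 6.0) for _ in range(10_000)]
reported = [g + (0.5 if random.random() < 0.3 else 0.0) for g in actual]

bias = sum(r - a for r, a in zip(reported, actual)) / len(actual)
print(f"average inflation: {bias:.2f} points")  # ~0.15, though most are honest
```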

The phenomenon is pretty well known, but it’s always interesting to see the hard numbers.

Millennials and Communism

I was perusing Twitter this past weekend when I started to see some concerning headlines float by.

Survey: 1 in 2 millennials would rather live in a socialist or communist country than capitalist one

Millenials think socialism would make a great safe space

Nearly 1 In 5 Millennials Consider Joseph Stalin And Kim Jong Un ‘Heroes’

While I could see a survey of young people turning up with the socialism result, that last headline really concerned me. At first I thought it was just a case of “don’t just read the headline“, but all the articles seemed to confirm the initial statistic. AOL said “a lot of them see Joseph Stalin and Kim Jong Un as “heroes.”” Fox News hit on my discomfort when they said “The report also found that one in five Americans in their 20s consider former Soviet dictator Joseph Stalin a hero, despite his genocide of Ukrainians and Orthodox priests. Over a quarter of millennials polled also thought the same for Vladimir Lenin and Kim Jong Un.”

Seriously?

While I know polls frequently grab headlines by playing on people’s political ignorance, this seemed to go a step beyond that. I had trouble wrapping my head around the idea that anyone in the US could list Stalin, Lenin or Jong-Un as a hero, let alone 20-25%. I had to go see what question prompted such an odd set of results.

The overview of the poll results is here, and sure enough, the question that led to the results is worded a little differently than the article. Here’s the screenshot from the report, blue underlines/boxes are mine:

I think the “hero for their country” part is key. That asks people to assess not just their own feelings, but what they know about the feelings of a whole other country.

Out of curiosity, I decided to look up Kim Jong-un’s in-country approval rating, and some defectors put it as high as 50%. According to one poll, 38% of Russians consider Josef Stalin to be the “most outstanding person” in world history. You could certainly debate whether those polls had problems in wording, sample or other methodology, but the idea that a 25 year old in the US might see a headline like that and conclude that Russians really did like Stalin doesn’t seem outside the realm of possibility. Indeed, further down the report we find out that only 6% of millennials in the US state that they personally have a favorable view of Stalin. That’s lizard people territory, folks.

In this case, it appears the polling company was actually pretty responsible about how they reported things, so it’s disappointing that further reports dropped the “in their country” piece. In my ongoing quest to name different biases and weird ways of skewing data, I’m now wondering what to name this one. What do you call it when someone asks a poll question in a way that encompasses a variety of scenarios, then the later reports shorten the question to make it sound like a different question was answered? I’m gonna work on this.

Daylight Saving (is not the worst of evils)

Well hi there! At this point on Sunday, I’m going to assume you’ve remembered that your clock should have been set back last night. With the advent of cell phones and auto-updates, I suspect the incidence of “showing up to church an hour early because no one remembered daylight saving time” has dropped precipitously since I was a kid.

Growing up, daylight saving time was always the subject of some debate in my house. My dad is a daylight saving time defender, and takes a lot of joy in pointing out to people that no matter how irritated you are by the time change, not changing the time would be even more annoying.

To support his point, I found this site that someone posted on Facebook rather interesting. It’s by a cartographer, and it lets you see the impact of Daylight Saving on the different regions of the country. It also lets you monkey around with different schemes….eliminate daylight saving vs impose it permanently vs keep the status quo…and see what impact they’d have on the sunrise/sunset times. (Note: he created it in 2015, so some numbers may not reflect the 2017 time changes)

My Dad’s point was always that daylight saving blunts the extremes, so I tried out a few different schemes to see how often they made the sunrise very early vs very late. For example, here’s how many days the sun would rise before 5am in different regions if we keep things status quo vs eliminate daylight saving vs always use it:

If you go to the actual website and hover, you can get the exact number of days those colors represent. If we did away with daylight saving, my region of the country would have over 120 days of pre-5am sunrises. I’m an early riser, but that seems a little much even for me.

Here’s how it would affect post-8pm sunsets:

So basically my Dad was right. If you want lots of early sunrises, push to abolish daylight saving. I think most people sort of know that’s what the time change thing is all about, but it is interesting to see exactly how many early sunrises we’re talking about. When you consider that the sky starts to lighten half an hour before sunrise, you realize that getting rid of daylight saving is signing yourself up for a LOT of early morning sunshine.

I think the real PR problem here is that the time changes happen so far away from the extremes that people forget that it’s really designed to help mitigate situations that would occur several months later. I think there’s a new bias name in here somewhere.

Probability, Don’t You Mess With Me

In honor of Halloween, please enjoy the only stats-based horror 80s music video I know of:


From 3-2-1 Contact, one of the more formative shows of my childhood.