So Why ARE Most Published Research Findings False? (An Introduction)

Well hello hello! I’m just getting back from a conference in Minneapolis and I’m completely exhausted, but I wanted to take a moment to introduce a new Sunday series I’ll be rolling out starting next week. I’m calling it my “Important Papers” series, and it’s going to be my attempt to cover/summarize/explain the important points and findings in some, well, important papers.

I’m going to start with the 2005 John Ioannidis paper “Why Most Published Research Findings Are False.” Most people who have ever questioned academic findings have heard of this one, but fewer seem familiar with what it actually says or recommends. Given the impact this paper has had, I think it’s a vital one for people to understand. I got this idea when my professor for this semester made us all read it to kick off our class, and I was thinking how helpful it was as a framework for further learning. It will probably take me 6 weeks or so to get through the whole thing, and I figured this week would be a good time to do a bit of background. Ready? Okay!

John Ioannidis is a Greek physician who works at Stanford University. In 2005 he published the paper “Why Most Published Research Findings Are False”. It quickly became the most cited paper from PLOS Medicine, and is apparently one of the most accessed papers of all time, with 1.5 million downloads. The paper is really the godfather of the meta-research movement…i.e. the push to research how research goes wrong. The Atlantic did a pretty cool breakdown of Ioannidis’s career and work here.

The paper has a few different sections, and I’ll be going through each of them. I’ll probably group a few together based on length, but I’m not sure quite yet how that will look. However, up front I’m thinking the series will go like this:

  1. The statistical framework for false positive findings
  2. Bias and failed attempts at corrections
  3. Corollaries (aka uncomfortable truths)
  4. Research and Bias
  5. A Way Forward
  6. Some other voices/complaints

I’ll be updating that list with links as I write them.

We’ll kick off next week with that first one. There will be pictures.

Week one is up! Go straight to it here.

 

5(ish) Posts About Elections, Bias, and Numbers in Politics

It’s election day here in the US, so I thought I’d do a roundup of my favorite posts from the past year about the political process and its various statistical pitfalls. Regular readers will recognize most of these, but I figured they were worth a repost before they stop being relevant for another few years. As always, these are meta/about-the-process type posts, and no candidates or positions are endorsed. The rest of you seem to have that covered quite nicely.

  1. How Do They Call Elections So Early? My most popular post so far this year, where I walk through the statistical methods used to call elections before all the votes are counted. No idea if this will come into play today, but if it does you’ll be TOTALLY prepared to explain this at your next cocktail party or whatever it is the kids do these days.
  2. 5 Studies About Politics and Bias to Get You Through Election Season In this post I do a roundup of my favorite studies on, well, politics and bias. Helpful if you want to figure out what your opponents are doing wrong, but even MORE helpful if you use it to re-examine some of your own beliefs.
  3. Two gendered voting studies. People love to study the secret forces driving individual genders to vote certain ways, but are those studies valid? I examined one study that attempted to link women’s voting patterns and menstrual cycles here, and one that attempted to link threats to men’s masculinity and their voting patterns here. Spoiler alert: I was underwhelmed by both.
  4. Two new logical fallacies (that I just made up) Not specific to politics, but aimed in that direction. I invented the Tim Tebow Fallacy for those situations when someone defends a majority opinion as though they were an oppressed minority. The Forrest Gump Fallacy I made up for those times when someone believes their own personal life reflects a greater trend in America….when it doesn’t.
  5. My grandfather making fun of statistical illiteracy of political pundits 40 years ago. The original stats blogger in my family also got irritated by this stuff. Who would have thought.

As a final thought, if you’re in the US, go vote! No, your single vote won’t make a statistically significant difference at the national level, but I think there’s a benefit to being part of the process.

What Can Your Dentist Tell You About Your Cancer Risk?

Welcome to “From the Archives”, where I dig up old posts and see what’s changed in the years since I originally wrote them.

From time to time something fun reminds me of an old post of mine and I get all excited to go back and research what’s changed since I originally wrote it.

This is not one of those times.

A past post popped into my head last week, but not for a good reason. A childhood friend of mine was diagnosed with ovarian cancer recently, which is a bit of a shock since she’s only 35, and hits close to home since she has a daughter just a bit younger than my son. Working at a cancer hospital I am unfortunately used to seeing early and unfair diagnoses, but it still has an extra sting when it’s someone you know and when they’re in the same phase of life you are. This friend actually has an interesting intersection with this blog, as she’s a science teacher whose class I’ve visited and given a version of my Intro to Internet Science talk to. She does great work with those kids, and I loved meeting her class. If you’re the prayers/good thoughts type, send some her way.

Not the happiest of introductions, but the whole experience did remind me about how important it is for people to know the signs of ovarian cancer, as it can be easily missed. Additionally, it made me think of my 2013 post “What Can Your Dentist Tell You About Your Risk For Ovarian Cancer?” where I blogged about the link between congenitally missing teeth and ovarian cancer. I wondered if there had been any updates since then, and it looks like there are! Both scientifically and with a couple dozen spammers who left comments on my original post. Cosmetic dentistry folks apparently have a lot of bots working for them. Anyway, let’s take a look! At the science, not the spammers that is.

First, some background: For those of you who didn’t read the original post, it covered a study that found that women who have ovarian cancer are 8 times more likely to have congenitally missing teeth than women who don’t have ovarian cancer. Since I have quite a few congenitally (ie born that way not knocked out or pulled) missing teeth (both mandibular second molars and both mandibular second bicuspids), I was pretty interested in this fact. I used it as a good example of a correlation/causation issue, because there is likely a hidden third variable (like a gene mutation) causing both the missing teeth and the cancer as opposed to one of those two things causing the other one.

So why missing teeth? Well, first, because it’s kind of fascinating to think of tooth abnormalities being linked to your cancer risk. Dental medicine tends to be pretty separate from other types of medicine, so exploring possible overlaps feels pretty novel. When someone has teeth that fail to develop (also known as hypodontia or tooth agenesis), it’s thought to be a sign of either an early developmental interruption or a gene mutation. Missing teeth are an intriguing disease marker because they are normally spotted early and conclusively. Knowing up front that you are at a higher risk for certain types of cancer could help guide your screening for years.

So what’s the deal with the ovarian cancer link? Well, it’s been noted for a while that women are more likely to have hypodontia than men. Since hypodontia is likely caused by some sort of genetic mutation or disruption in development, it made a certain amount of sense to see if it was linked with cancers specific to women. The initial study linking missing teeth and ovarian cancer showed women with ovarian cancer were 8 times as likely to have missing teeth, but subsequent studies were less certain. A 2016 meta-analysis showed that overall about 20% of ovarian cancer patients have evidence of hypodontia, as opposed to the general population rate of 2-11%. Unfortunately there’s still not a definitive biological mechanism (ie a gene that clearly drives both), and there’s not enough data to say how predictive missing teeth are (ie what my risk as a healthy person with known hypodontia is). We also don’t know if more missing teeth means greater risk, or if it’s only certain teeth that carry the risk. So while we’re part way there, we’re missing a few steps in the chain needed to prove causality.
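Just to make the “how predictive is it?” question concrete, here’s a quick back-of-the-envelope Bayes calculation. Every input below is my own assumption for illustration, not a number from the studies: I’m plugging in a rough ~1.3% lifetime ovarian cancer risk, the ~20% hypodontia rate among patients, and the 2-11% general population range (since the cancer is rare, the population hypodontia rate is a fine stand-in for the overall rate in the denominator).

```python
def risk_given_hypodontia(prior, p_hypo_given_cancer, p_hypo_overall):
    # Bayes' rule: P(cancer | hypodontia)
    #   = P(hypodontia | cancer) * P(cancer) / P(hypodontia)
    return prior * p_hypo_given_cancer / p_hypo_overall

prior = 0.013  # assumed ~1.3% lifetime ovarian cancer risk (illustrative)
for p_hypo_overall in (0.02, 0.11):
    print(round(risk_given_hypodontia(prior, 0.20, p_hypo_overall), 3))
# prints 0.13, then 0.024
```

Notice the answer swings from roughly 2% to 13% depending on which end of that 2-11% base rate you believe, which is exactly why “we don’t know the general population rate very well” means we can’t tell an individual what their risk is.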

Are there links to other cancers here too? Why yes! This paper from 2013 reviewed the literature and found that all craniofacial abnormalities (congenitally missing teeth, cleft palate, etc) seem to be associated with a higher family cancer risk. That paper actually interviewed people about all their family members’ cancer histories, to cast a wider net for genetic mutations. Interestingly, the sex-linked cancers (prostate, breast, cervical and ovarian) were significantly associated with missing teeth, as was brain cancer. In some families it looks like there is a link to colorectal cancer, but this doesn’t appear to be broadly true.

So where does this leave us? While the evidence isn’t yet completely clear, it does appear that people who are missing teeth should be on a slightly higher alert for signs of ovarian or prostate cancer. Additionally, I’ve sent my dentist and my PCP the literature to review, since neither of them had ever heard of this link. Both found it noteworthy. It’s probably not worth losing sleep over, since we don’t know what the absolute increase is at this point. However, it’s good to keep in the back of your mind. Early detection saves lives.

3 More Examples of Self Reporting Bias

Right after I put up my self reporting bias post last week, I saw a few more examples that were too good not to share. Some came from commenters, some were random stories I came across, but all of them could have made the original list. Here you go:

  1. Luxury good ratings Commenter Uncle Bill brought this one up in the comments section of the last post, and I liked it. The sunk cost fallacy says that we have a hard time abandoning money we’ve already spent, and this kicks in when we have to say how satisfied we are with our luxury goods. No one wants to admit a $90,000 vehicle actually kind of sucks, so it can be hard to figure out whether self-reported reliability ratings reflect reality or a desired reality.
  2. Study time Right after I put my last self reporting bias post up, this study came across my Twitter feed. It was a study looking into “time spent on homework” vs grades, and initially it found no correlation between the two. However, the researchers had given the college students involved pens that actually tracked what they were doing, so they double checked the students’ reports. With the pen-measured data, there actually was a correlation between time on homework and performance in the class. It turned out that many of the low performing kids wildly overestimated how much time they were actually spending on their homework, much more so than the high performing kids. This bias is quite possibly completely unintentional….kids who were having a tough time with the material probably felt like they were spending more time than they were.
  3. Voter preference I mentioned voter preference in my Forrest Gump Fallacy post, and I wanted to specifically call out Independent voters here. Despite the name and the large number of people who self identify as such, when you look at voting patterns many independent voters are actually what they call “closet partisans”. Apparently someone who identifies as Independent but has a history of voting Democrat is actually less likely to ever vote GOP than someone who identifies as a “weak Democrat”. So “Independent” is a tricky mix of Republicans who don’t want to say they’re Republicans, Democrats who don’t want to say they’re Democrats, 3rd party voters, voters who don’t care, and voters who truly have no party affiliation. I’m sure I left someone out, but you can see where it gets messy. This actually also affects how we view Republicans and Democrats, as those groups are normally polled based on self identification. By removing the Independents, it can make one or both parties look like their views are changing, even if the only change is who checked the box on the form.

If you see any more good ones, feel free to send them my way!

The Forrest Gump Fallacy

Back in July, I took my first crack at making up my own logical fallacy. I enjoyed the process, so today I’m going to try it again. With election season hanging over us, I’ve seen a lot of Facebook-status-turned-thinkpieces, and I’ve seen this fallacy pop up more and more frequently. I’m calling it “The Forrest Gump Fallacy”. Yup, like this guy:

For those of you not prone to watching movies or too young to have seen this one, here’s some background: Forrest Gump is a movie from 1994 about a slow-witted but lovable character who manages to get involved in a huge number of political and culturally defining moments over the course of his life from 1944 to 1982. Over the course of the film he meets almost every US president from that time period, inadvertently exposes Watergate, serves in Vietnam and speaks at anti-war rallies, and starts the smiley face craze. It has heaps of nostalgia and an awesome soundtrack.

So how does this relate to Facebook and politics? Well, as I’ve been watching people attempt to explain their own political leanings recently, I’ve been noticing that many of them seem to assume that the trajectory of their own life and beliefs mirrors the trajectory of the country as a whole. To put it more technically:

Forrest Gump Fallacy: the belief that your own personal cultural and political development and experiences are generalizable to the country as a whole.

There are a lot of subsets of this obviously….particularly things like “this debate around this issue didn’t start until I was old enough to understand it” and “my immediate surroundings are nationally representative”. Fundamentally this is sort of a hasty generalization fallacy, where you draw conclusions from a very limited sample size. Want an example? Okay, let me throw myself under the bus.

If you had asked me a few years ago to describe how conservative vs liberal the US was in various decades that I’d lived through, I probably would have told you the following: the 1980s were pretty conservative, the 1990s also had a strong conservative influence, mostly pushing back against Clinton. Things really liberalized more around the year 2000, when people started pushing back against George W Bush. I was pretty sure this was true, and I was also not particularly right. Here is party affiliation data from that time:

Republican affiliation actually dropped during the 90s and rose again after 2000. Now, I could make some arguments about underdogs and the strength of cultural pushback, but here’s what really happened: I went to a conservative private Baptist school up through 1999, then went to a large secular university for college in the early 2000s. The country didn’t liberalize in the year 2000, my surroundings did. This change wasn’t horribly profound, after all engineering profs are not particularly known for their liberalism, but it still shifted the needle. I could come up with all the justifications in the world for my biased knee-jerk reaction, but I’d just be self-justifying. In superimposing the change in my surroundings and personal development over the US as a whole, I committed the Forrest Gump Fallacy.

So why did I do this? Why do others do this? I think there are a few reasons:

  1. We really are affected by the events that surround us Most fallacies start with a grain of truth, and this one does too. In many ways, we are affected by watching the events that surround us, and we do really observe the country change around us. For example, most people can quite accurately describe how their own feelings and the feelings of the country changed after September 11th, 2001. I don’t think this fallacy arises around big events, but rather when we’re discussing subtle shifts on more divisive issues.
  2. Good cultural metrics are hard to come by A few paragraphs ago, I used party affiliation as a proxy for “how liberal” or “how conservative” the country was during certain decades. While I don’t think that metric is half bad, it’s not perfect. Specifically, it tells us very little about what’s going on with that “independent” group…and they tend to have the largest numbers. Additionally, it’s totally possible that the meaning of “conservative” or “liberal” will change over time and on certain issues. Positions on social issues don’t always move in lock step with positions on fiscal issues and vice versa. Liberalizing on one social issue doesn’t mean you liberalize on all of them either. In my lifetime, many people have changed their opinion on gay marriage but not on abortion. When it’s complicated to get a good picture of public opinion, we rely on our own perceptions more heavily. This sets us up for bias.
  3. Opinions are not evenly spread around This is perhaps the biggest driver of this fallacy, and it’s no one’s fault really. As divided as things can get, the specifics of the divisions can vary widely in your personal life, your city and your state. While the New Hampshire I grew up in generally leaned conservative, it was still a swing state. My school however was strongly conservative and almost everyone was a Republican, and certainly almost all of the staff. Even with only 25% of people identifying themselves as Republican there are certainly many places where someone could be the only Democrat and vice versa. Ann Althouse (a law professor blogger who voted for Obama in 2008) frequently notes that her law professor colleagues consider her “the conservative faculty member”. She’s not conservative compared to the rest of the country, but compared to her coworkers she very much is. If you don’t keep a good handle on the influence of your environment, you could walk away with a pretty confused perception of “normal”.

So what do we do about something like this? I’m not really sure. The obvious answer is to try to mix with people who don’t think like you, aren’t your age and have a different perspective from you, but that’s easier said than done. There’s some evidence that conservatives and liberals legitimately enjoy living in different types of places and that the polarization of our daily lives is getting worse. Sad news. On the other hand, the internet does make it easier than ever to seek out opinions different from your own and to get feedback on what you might be missing. Will any of it help? Not sure. That’s why I’m sticking with just giving it a name.

5 Things About the Doomsday Algorithm

I mentioned last week that I’m currently reading a biography of John Conway, and I came across something interesting during the discussion of his version of the Doomsday Algorithm. Also known as the “perpetual calendar” problem, it’s a method for mentally calculating what day of the week any given date fell on. Conway was so obsessed with this problem and improving his time for the mental math that he set up his computer to make him solve ten dates before it would let him log in. Supposedly his record was 10 dates in 15 seconds. #lifegoals. Anyway, this whole discussion got me poking around about this mental math trick, and I wanted to share a few things that I found:

  1. Lewis Carroll published on this problem Yeah, the guy who wrote Alice in Wonderland also came up with a perpetual calendar algorithm, and it was published in Nature in 1887.
  2. By “Doomsday” we mean “anchor day” John Conway has an excellent flair for the dramatic, and the title of this algorithm proves it. However, it’s a misleading title for what’s really going on. Basically, Conway realized that a whole bunch of easy to remember dates (4/4, 6/6, 8/8, 10/10 and 12/12) all fall on the same day of the week in any given year. If you can figure out what day that is, you get an “anchor day” in those months. From there, he realized that 5/9, 9/5, 7/11 and 11/7 all fall on the same day as well, so you now have one known date in each month from April through December (January, February and March have their own anchors: 1/3, or 1/4 in leap years, the last day of February, and 3/14). As you can see, this simplifies further calculations considerably.
  3. Do you bite your thumb at us sir? Conway does. One of his tricks for remembering his full trick is to use his fingers as prompts and bite his thumb to remember the number he got there. This link also has some very helpful videos of Conway explaining his method.
  4. Others have improved on the method The gamesmanship of this method has been inspiring to a lot of mathy folks, and some of them continue to try to find simpler/better/faster ways for people to calculate the day of the week. This method looks like the current favorite for simplicity, and is the one I think I’m going to start with.
  5. Don’t try to calculate anything from 1752 At least if you’re in the US or England, this is a trap. September 3rd through September 13th of that year don’t exist: the calendar jumped from September 2nd straight to September 14th when Britain adopted the Gregorian calendar. Now there’s a trivia question for you.
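To make the anchor-day trick concrete, here’s a quick sketch of the calculation in code. This is my own minimal translation (Gregorian dates only, weekdays numbered 0=Sunday through 6=Saturday), not Conway’s actual finger-counting routine:

```python
# Conway-style Doomsday calculation, sketched in Python.
# Easy-to-remember anchor dates: 4/4, 6/6, 8/8, 10/10, 12/12,
# plus 5/9, 9/5, 7/11, 11/7, pi day (3/14), and Jan 3 / Feb 28
# (which shift to Jan 4 / Feb 29 in leap years).
DOOMSDAYS = {1: 3, 2: 28, 3: 14, 4: 4, 5: 9, 6: 6,
             7: 11, 8: 8, 9: 5, 10: 10, 11: 7, 12: 12}

def is_leap(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

def doomsday(year):
    """Weekday (0=Sunday..6=Saturday) shared by all the anchor dates."""
    century_anchor = (5 * ((year // 100) % 4) + 2) % 7
    y = year % 100
    return (century_anchor + y + y // 4) % 7

def day_of_week(year, month, day):
    anchor = DOOMSDAYS[month]
    if month <= 2 and is_leap(year):
        anchor += 1  # Jan and Feb anchors shift in leap years
    return (doomsday(year) + day - anchor) % 7

print(day_of_week(1776, 7, 4))  # 4, i.e. July 4, 1776 was a Thursday
```

The mental version is the same three steps: get the century anchor, adjust for the year, then count from the nearest memorized anchor date.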

 

5 Interesting Examples of Self Reporting Bias

News flash! People lie. Some more than others. Now there are all sorts of reasons why we get upset when people don’t tell the truth, but I’m not here to talk about those today. No, today I’m here to give a few interesting examples of where self-reporting bias can really kinda screw up research and how we perceive the world.

Now, self reporting bias can happen for all sorts of reasons, and not all of them are terrible. Some bias happens because people want to make themselves look better, some happens because people really think they do things differently than they do, and some happens because people just don’t remember things well and try to fill in the gaps. Regardless of the reason, here are 5 places bias may pop up:

  1. Nutrition/Food Intake Self reported nutrition data may be the worst example of research skewed by self reporting. For most nutrition/intake surveys, about 67% of respondents give implausibly low answers….an effect that actually shows up cross culturally. Interestingly there are some methods known to improve this (doubly labeled water for example), but they tend to be more expensive and thus are used less often. Unfortunately this effect isn’t random, so it’s hard to know exactly how bad the effect is across the board.
  2. Height While it’s pretty ubiquitous that people lie about their weight, lying about height is a less recognized but still interesting problem. It’s pervasive in online dating for both men AND women, both of whom exaggerate by about 2 inches. On medical/research surveys we all get slightly more honest, with men overestimating their height by about .5 inches, and women by .33 inches.
  3. Work hours Know anyone who says they work a 70 hour week? Do they do this regularly? Yeah, they’re probably not remembering that correctly. Edit: My snark got ahead of me here, and I got called out in the comments, so I’m taking it back. I also added some text in bold to clarify what the problem is. When people are asked how much they work per week, they tend to give much higher answers than when they are asked to list out the hours they worked during the week. The more they say they work, the more likely they are to have inflated the number. People who say they work 75+ hours work an average of 50 hours/week, and those who say they work 40 hours/week tend to work about 37. Added: While some professions do actually require crazy hours (especially early in your career….looking at you medical residencies, and first year teachers are notorious for never going home), very few keep this up forever. Additionally, what people work most weeks almost never equals what they work when averaged over the course of a year. That 40 hour a week office worker almost certainly gets some vacation time, and even 2 weeks of vacation and a few paid holidays take that yearly average down to about 37 hours per week…and that’s before you add in sick time. Some of this probably gets confusing because of business travel or other “grey areas” like professional development time, but it also speaks to our tendency to remember our worst weeks better than our good ones.
  4. Childhood memories It is not uncommon in psychological/developmental research that adults will be asked various questions about the state of their life currently while also being queried about their upbringing. This typically leads to conclusions about parenting type x leading to outcome y in children. I was recently reading a paper about various discipline methods and long term outcomes in kids, when I ran across a possible confounder I hadn’t considered: sex differences in the recollection of childhood memories. Apparently overall men are not as good at identifying family dynamics from their childhoods, and the authors wondered if that led to some false findings. They didn’t have direct evidence, but it’s an interesting thing to keep in mind.
  5. Base 10 madness You wouldn’t think our fingers would cause a reporting bias, but they probably do. Our obsession with doing things in multiples of 5 or 10 probably comes from our use of our hands for counting. When it comes to surveys and self reports, this leads to a phenomenon called “heaping”, where people tend to round their reports to multiples of 5 and 10. There’s some interesting math you can use to try to correct for this, but given that rounding tends to be non-constant (ie we round smaller numbers to 5 and larger numbers to 10) this can actually affect some research results.
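If you want to see heaping in action, here’s a tiny simulation. Everything in it is made up for illustration: the 10-60 “true hours” range and the rounding rule are my assumptions, not from any study.

```python
import random
from collections import Counter

random.seed(0)

def heap(value):
    # Assumed (made-up) rounding rule: smaller numbers get rounded to the
    # nearest 5, larger ones to the nearest 10, mimicking the non-constant
    # rounding mentioned above.
    if value < 30:
        return 5 * round(value / 5)
    return 10 * round(value / 10)

true_hours = [random.uniform(10, 60) for _ in range(10_000)]
reported = [heap(h) for h in true_hours]

# Reports pile up on multiples of 5 and 10...
print(Counter(reported).most_common(3))
# ...and because the rounding rule isn't constant, the reported average
# can drift away from the true average instead of washing out.
print(sum(true_hours) / len(true_hours), sum(reported) / len(reported))
```

The correction methods mentioned above essentially try to run this process in reverse, inferring the smooth underlying distribution from the spiky reported one.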

Base 10 aside: one of the more interesting math/pop-culture videos I’ve seen is this one, where they explore why the Simpsons (who have 4 fingers on each hand) still use base 10 counting (7:45 mark):

 

The Cynical Cartoonist Correlation Factor

I love a good creative metric:

[image: dilbertplan]

From the book “Results Without Authority” by Tom Kendrick.

In case you’re curious, this hangs on the wall behind my desk at work:

Happy Friday everyone!

What I’m Reading: October 2016

My stats book for the month is “Statistics Done Wrong”, which honestly I haven’t actually started yet. I got sidetracked in part by a different math related book, “Genius at Play: The Curious Mind of John Horton Conway”. He’s a pretty amazing (still living!) mathematician, and the book about him is pretty entertaining. If you’ve never seen a simulation of his most famous invention, the “Game of Life”, check it out here. Deceptively simple yet endlessly fascinating.

Moving on, this Atlantic article about why for profit education fails was really interesting. Key point: education does best when it’s targeted to local conditions, which means it actually becomes less efficient when you scale it up.

This list of the “7 deadly sins of research” was also delightful. It specifically mentions basic math errors, which is good, because those happen concerningly often.

Related to deadly sins, Andrew Gelman gives his history of the replication crisis.

Related to the replication crisis, holy hell China, get it together.

More replication/data error issues, but this time with a legal angle. Crossfit is apparently suing a researcher and journal who 1. worked for a competitor, 2. published data that made Crossfit look bad which they later clarified was incorrect, and 3. had evidence that the journal/reviewers implied they wouldn’t publish the paper unless it made Crossfit look bad. The judge has only ruled that this can proceed to trial, but it’s an interesting case to watch.

This paper on gender differences in math scores among highly gifted students was pretty interesting. It takes a look at the gender ratios for different SAT (and ACT) scores over the years (for 7th graders in the Duke Gifted and Talented program) and the trends are interesting. For the highest scorers in math (>700), it went from extremely male dominated (13:1 in 1981) to “just” very male dominated (4:1 by 1991) and then just stayed there. Seriously, that ratio hasn’t gone lower than 3.55 to 1 in the 25 years since. Here’s the graph:

[graph: male-to-female ratio among top math scorers, by year]

In case you’re curious, top verbal scores are closer to 1:1. I’m curious what the recruitment practices are for the Duke program.

Also, some old data about Lyme Disease resurfaces, and apparently there may be a second cause? An interesting look at the “Swiss Agent” and why it got ignored.

 

Vanity Sizing: The Visual Edition

I’m swamped with homework this week, but after my post about vanity sizing a few weeks ago, I thought this picture might amuse a few people:

The white-ish sparkly dress on the top is one my grandmother gave me when I was a little kid to play “princess dress up” in (it was a floor length gown when I was 5!). Someone set it aside for me after she died, and I found it this week while sorting through some boxes. I checked the tag, and it’s marked as a size 14. The dress below it is a bridesmaid’s dress I wore about 6 years ago at my brother’s wedding….also marked a size 14. I don’t know how long my grandmother had the top dress when she gave it to me in the 80s, but my guess is it’s from the late 70s. That’s 4 decades of size inflation right there folks.

If it followed this chart at all, the top dress would be a size 4 by today’s standards. The bottom dress would have been a size 20 in the late 70s.

My own vanity now compels me to mention that I don’t actually fit in the 2010 size 14 any more, I’m a 1987 size 14 thank-you-very-much.