The Carlisle Method: One Year Later

Every once in a while the traffic on an old post goes through the roof, and I realize something interesting must have happened. This week it was my Carlisle Method post, which I appreciated because I had been meaning to check back in on that whole situation.

For those of you who didn’t read the initial post, here’s the overview:
An anesthesiologist came up with a neat statistical way of checking for data fraud. Rather than focus on the flashy results and conclusions of papers, he focused on the characteristics of the test group vs control group and looked for evidence that they weren’t random. The theory is that people committing fraud would mostly focus on their conclusions, but might get sloppy when reporting things like subject age and such.  He publicly named the papers he thought were most questionable, and investigations were launched.

A year later, some of the results are in. The New England Journal of Medicine announced their results, and they actually are a bit encouraging. Out of the 11 papers they analyzed, 5 had mislabeled “standard error” as “standard deviation” so the Carlisle Method was picking up on the error. 5 papers were reviewed and found to be okay, and they suspect some limitations in the Carlisle Method itself had wrongly flagged them.
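To see why a mislabeled standard error would trip this kind of check, here is a rough sketch in Python. It is not Carlisle's actual procedure (which is more involved and combines results across many baseline variables); it is just a simple two-group comparison using a normal approximation. The trial numbers (100 people per arm, mean ages of 50 and 51, a true standard deviation of 10) and the function name `baseline_p_value` are made up for illustration.

```python
import math
from statistics import NormalDist


def baseline_p_value(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Two-sided p-value for a difference in group means (normal approximation)."""
    se_diff = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    z = (mean_a - mean_b) / se_diff
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical trial: 100 people per arm, ages with a true SD of 10 years.
n = 100
true_sd = 10.0
standard_error = true_sd / math.sqrt(n)  # the SE of each group's mean

# Checked with the real SD, a 1-year difference in mean age looks unremarkable.
p_correct = baseline_p_value(50.0, true_sd, n, 51.0, true_sd, n)

# If the paper printed the SE but called it the SD, the checker thinks the
# groups are far less variable than they are, and the same 1-year difference
# looks wildly improbable under random assignment.
p_mislabeled = baseline_p_value(50.0, standard_error, n, 51.0, standard_error, n)

print(f"p-value with the correct SD:   {p_correct:.3f}")
print(f"p-value with SE passed as SD:  {p_mislabeled:.2e}")
```

With the correct SD the groups look like an ordinary random split. With the SE in the SD column the same groups look far too different to be random, which is the sort of thing a baseline check would flag even though no fraud occurred.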

One paper, however, is getting lots of attention. It turns out that a major paper on heart health and the Mediterranean Diet wasn’t quite done as reported. While it had been reported as a randomized trial, a bit of digging showed that about 20% of its participants were actually “cluster randomized”. The authors reanalyzed the data using the remaining 80% of people, and noted that this didn’t change the conclusions, but it did make them less statistically compelling. The paper was retracted and replaced with a new version that correctly covers the methods.

Now all of these errors may appear to be small, but they do raise a bit of a cause for concern. There is an assumption that all statistics will be checked thoroughly before papers go to press, but I think this highlights that sometimes errors get through. Particularly if a paper has a big splashy finding, it’s possible that errors will not be reviewed. The NEJM shared this concern, and is implementing a statistics course for their editors and giving extra scrutiny to published papers.

More concerning than the NEJM’s findings, however, is that the other 7 journals that had papers on the list either haven’t completed their investigations yet, said there were no problems with any of the papers, or didn’t respond. It seems a little unlikely that only the most prestigious journal involved had any errors, so it’s unfortunate the other journals aren’t doing more.

More on the varied reactions to the Mediterranean diet study here and here.

Off to Alaska

I’m off to Alaska shortly, to see my sister get married. I’ve never been before, and I believe this trip puts me up to 31 states visited. Here’s my map:

(Map drawing here)

Road trips from Massachusetts to Georgia and one from San Diego to Seattle got me the coasts, and a surprising amount of my midwest experience is from various conferences. Alabama, West Virginia and Alaska are the states I will have visited solely to watch someone get married, though I got to/will get to take a good look around each before leaving. With any luck I’ll add Nebraska to my list in September.

Amusingly, my 5-year-old son has never been outside of New England, and Alaska will be his first experience with the rest of the US.

I looked it up, and apparently the average American adult has been to 12 states, with Florida, California, Georgia, New York and Nevada being the most visited. The map of visitation is here:

In many ways my map reflects the average, though my time in New England snagged me all the rarer states there.

Overall I’ve loved every new state I’ve been to, and I expect to enjoy Juneau quite a bit. The plane ride with a 5-year-old may be tough, but the end result should be awesome. Wish us luck!

Vitamin D Deficiency: A Supplement Story

I think I mentioned a little while back that I hadn’t been feeling so great and was going to slow my posting down for a bit. A decent amount of that “not feeling great” thing was related to a rather alarming Vitamin D deficiency I had unfortunately developed, and have since been treated for. This involved taking prescription strength megadoses, which helped almost instantaneously. It was lovely. As my doctor said “this is the best possible outcome. You came in with all sorts of symptoms and one simple thing fixed it.” I fully agreed.

It was an interesting thing to have happen because a few years ago another family member of mine had asked their doctor for a Vitamin D test and gotten an eye roll and a “that’s not really a thing we do any more”. Googling a bit, it seemed like there were nearly as many articles talking about how you shouldn’t take Vitamin D as those advising that you should. In classic “vitamins as fads” fashion, I noted that most of the pro-Vitamin D articles were from around 2010, and the anti-ones were much more recent. Combining my own experience with that of Dr Google, it got me thinking about how trends in supplements (or other medications or health behaviors for that matter) get going and why people then turn on them.

Step 1: Something that is under-recognized makes people feel terrible.

According to the American Academy of Family Physicians, “Common manifestations of vitamin D deficiency are symmetric low back pain, proximal muscle weakness, muscle aches, and throbbing bone pain elicited with pressure over the sternum or tibia.” Me? I had all of those symptoms. It sucked. I couldn’t sleep. Doing any activity left me sore and tired.

Now since Vitamin D deficiency is pretty well known, my doctor tested for it right away along with several other things. However, if it was not a well recognized deficiency and she had to fumble around a bit before she got there, I could have been living like that for months or years.

Step 2: People who feel terrible feel better, and are excited about their miracle cure.

Now I am pretty darn excited about my turn around on Vitamin D, as I think anyone would be. Again though, Vitamin D is a pretty well known deficiency, and lots of people I talk to have had the same experience or know someone who has. Thus, my compulsion to “evangelize” this solution is limited. A bad thing happened, the medical establishment addressed it immediately, and I am a happy camper. No real story there.

However, if I’d been feeling that way for months and my doctor had overlooked it, I’d want EVERYONE to know what happened to prevent the suffering of others.

Step 3: People who hear about this start to wonder if it’s their issue too.

Lots of people feel fatigued, or have aches and pains. The number who might hear my story and wonder if a Vitamin D deficiency were causing their symptoms might be much wider than the circle who would go in to their doctors and complain of the same thing. This isn’t bad, but it does mean that some of those people are going to have much milder symptoms than the ones I experienced, and those could be something else.

For example, what finally drove me to my doctor was being in so much discomfort that I was barely sleeping at night. I walk to and from the train every day (about 1.5 miles each way), and for a variety of reasons (including a weekend) I skipped about 4 days. When I started walking again, I was so sore I could barely stand when I got home. I felt like I’d run a marathon. That’s when I realized something was SERIOUSLY not okay. I’d been ignoring aches and pains for a few months, but you can’t ignore that.

So my issue got so severe I had to pay attention, but the people I tell about it are having the idea suggested to them. Two different groups.

Step 4: Some people figure if some is good then more is better.

There’s a lot of debate over what an optimal Vitamin D level is, but it will not surprise you to know that I was not in the grey area. I found this chart (unfortunately no source) that shows some of the controversy:

For reference, my level was 12. No one appears to debate that I needed treatment.

The controversy has arisen over some of those in between groups, like those in the 20-40 range. Some people say that they need more, but if they lack clear symptoms and hover around 35, is that really true?

To take it a step further, some people with aches and pains just start taking Vitamin D assuming that they are deficient with no testing at all. This is where things start to go off the rails a bit.

Step 5: The backlash

Okay, so now we’ve got people on supplements for levels that may or may not be dangerous, and bottles flying off the shelves at stores to treat people who may not have a deficiency. That’s when some people start to say “okay, pump the brakes here”.

What was a miracle cure for people with clear symptoms and a definitive deficiency now moves to something of questionable benefit for many many others. That’s when you get doctors who start eyerolling at things, rightly or wrongly.

Now all of this isn’t to say that I object to supplements, or people trying to find things that work. I don’t. That’s a good thing. However, for whatever reason, we do seem to forget that many supplements or medications are only miracles when someone is really not doing well. If you look at the controversy over statins for example, it’s clear that much of it got stirred up when doctors started prescribing them to people with very few risk factors. The evidence that they work for those with high risk of heart attacks or stroke is pretty good; the evidence that they work for people with low risk is not great. I think we all want to believe that something that can take someone at high risk/severe pain back to normal will help those with mild risk/mild pain get back there too, but it doesn’t always happen that way.

I think this comes back to our weird tendency to assume all relationships are linear. Just another reminder that you can’t assume that. Now excuse me, I’m going to get a bit of sun.

On Accurate Evaluation

It’s no secret that I have a deep fascination with people’s opinions about “popular opinion”. While sometimes popular opinion is easy to ascertain, I’ve noticed that accurately assessing what “most people know/believe” is a bit of an art form. This is particularly true in the era of social media hottakes, all of which seem to take the form of “this thing you love is terrible” or “this thing you hate is actually great”.

I have such an obsession with this phenomenon that I gave it a name (the Tim Tebow effect) which I define as “The tendency to increase the strength of a belief based on an incorrect perception that your viewpoint is underrepresented in the public discourse”.

I was thinking about this recently after reading the Slate Star Codex post on the “Intellectual Dark Web” called “Can Things Be Both Popular and Silenced?” In typical SSC fashion it’s really long and very thorough, and basically discusses how many different ways there are of measuring things like “popular” and “silenced”. For example, Jordan Peterson appears to make an absurd amount of money through Patreon ($19-84k per month by this estimate), so in some sense he is clearly popular. OTOH, he has also had threats made against him and people attempt to shut down his lectures, so in some sense there are also attempts to silence him. It’s this tension that Alexander explores, and he covers a lot of ground.

Given that my brain tends to uh, bounce around a little bit, this essay got me thinking of another topic entirely: the situation of women in the Victorian Era.

This connects, I promise.

Anyone who knows me or has seen my Kindle knows that I have a very bad habit of acquiring an enormous backlog of books to read. It’s so bad that I keep a running spreadsheet of how much I should be reading each week, because of course I do, and I tend to be flipping between at least a dozen at a time. Recently I picked up two I’d had hanging around for a while: Unmentionable: the Victorian Lady’s Guide to Sex, Marriage and Manners and Victorian Secrets: What a Corset Taught Me about the Past, the Present, and Myself. I had thought these two would go well together as they appeared to be on the same topic, but they ended up being almost diametrically opposed.

Unmentionable took the stance that we all (or at least women) idolized the Victorian era, and its stated goal was to make us realize how bad it actually was. Victorian Secrets OTOH took the stance that we all thought too little of the Victorian era, and wanted to explain some of the good things about it.

I spent a lot of time mulling those two statements, and ended up deciding that they really both had some truth to them. In Unmentionable, she talks about how Jane Austen movies make it all look like romance and pretty dresses, which is a fair charge. Her chapters on how those pretty dresses were never washed, and how you’re not taking a shower or washing your hair much, and how unsanitary most things are were pretty interesting and made me quite grateful for modern conveniences. In Victorian Secrets, the author wore a corset for a year and ended up wearing lots of other Victorian clothes, and mentioned that the corset had gotten a rather unfair rap. She had done a lot of research and had some interesting points about how Victorians weren’t as backwards as they are sometimes portrayed. This also felt fair.

Interestingly, in order to make their points, both authors relied on different sources. Unmentionable stuck to advice from books and magazines during the era, and Victorian Secrets made the case that trying to mimic the habits of everyday people from an era was the path to understanding. I suspect both methods have their pros and cons. A person from the year 2150 trying to read Cosmopolitan magazine would get a very different impression of our era than someone who walked around in our (now vintage) clothing. Both would have truth to them, but neither would be the whole picture.

I think this ties in to all these discussions about “popular opinion” or “the general consensus”, because I like the thought that sometimes there can be competing popular opinions on the same topic. Pride and Prejudice is still a favorite book for many girls because they both love the romance and the feel of the era, while also disliking all the rules and the lack of choice for women. While I’m sure there are some women who either love the Victorian Era or hate it, I’d actually suspect that many women love the thought of parts of it and dislike others. Given that most of us have very little exposure to it outside of a brief mention in history class and our English Lit curriculum, it is entirely possible that the likes and dislikes could be somewhat ill informed. This actually leaves a good bit of room for authors to truthfully claim “your love is misguided” and “so is your dislike”.

Yet another way popular opinion gets slippery when you try to nail it down.

By the way, weird fact about me: I’ve never actually read Pride and Prejudice, only Sense and Sensibility. I think my high school English teacher was getting a little bored with P&P by the time I got there, and I never picked it up on my own. I’ve seen two film versions and I read Bridget Jones’s Diary though, so I pretty much got the gist.

Just kidding librarian/English teacher friends, adding it to my spreadsheet now.

Tick Season 2018

I was walking in the woods yesterday, on a trail on the property I grew up on. In the course of a 45 minute walk on a (mostly) clear trail I had to pull at least 12 ticks off of me. Between the group of 3 adults and 1 child who went for a walk, we estimate we pulled 40 off of us.

I’ve been walking that trail for decades now and that was by far the worst I’d seen it.

Anyone else seeing similar things this year? It looks like they were predicting a tough year for New England, and the CDC has been warning about an increase in tick-borne illnesses, but I didn’t think it would be quite that bad.

Related: Vox did a good explainer about Lyme Disease a few weeks ago, which is worth reading if you don’t know much about it. I’ve had family members and friends have rather scary experiences with it, and it’s worth learning about if you’re in (or traveling to) an affected area.

Observer Effects: 3 Studies With Interesting Findings

I’ve gotten a few links lately on the topic of “observer effects”/confirmation biases, basically the idea that the process of observing a phenomenon can actually influence the phenomenon you’re observing. This is an interesting issue to grapple with, and there are a lot of misconceptions out there, so it seemed about right for a blog post.

First up, we have a paper on the Hawthorne effect. The name comes from a study originally done on factory workers (in the Hawthorne factory) in order to see how varying their working conditions improved their productivity. What the researchers found was that changing basically anything in the factory work environment ended up changing worker productivity. This was so surprising it ended up being dubbed “the Hawthorne effect”. But was it real?

Well, likely yes, but the initial data was not nearly as interesting as reported. For several decades it appeared to have been lost entirely, but it was found again back in 2011. The results were published here, and it turns out most of the initial effect was due to the fact that all the lighting conditions were changed over the weekend, and the productivity was measured on Monday. No effort was made to separate the “had a day off” effect from the effect of varying the conditions, so the 2011 paper attempted to do that. They found subtle differences, but nothing as large as originally reported. The authors state they believe the effect is probably real, but not as dramatic as often explained.

Next up, we have this blog post that summarizes the controversy over the “Pygmalion effect”. (h/t Assistant Village Idiot). This is another pretty famous study that showed that when teachers believed they were teaching high IQ children, the children’s actual IQs ended up going up. Or did they? It turns out there’s a lot of controversy over this one, and like the Hawthorne effect paper the legend around the study may have outpaced its actual findings. The criticisms were summed up in this meta-analysis from 2005:

  1. Self-fulfilling prophecies in the classroom do occur, but these effects are typically small, they do not accumulate greatly across perceivers or over time, and they may be more likely to dissipate than accumulate
  2. Powerful self-fulfilling prophecies may selectively occur among students from stigmatized social groups
  3. Whether self-fulfilling prophecies affect intelligence, and whether they in general do more harm than good, remains unclear
  4. Teacher expectations may predict student outcomes more because these expectations are accurate than because they are self-fulfilling.

I find the criticisms of both studies interesting not because I think either effect is completely wrong, but because these two studies are so widely taught as definitively right. I double checked the two psych textbooks I have lying around and both mention these studies positively, with no mention of controversy. Interestingly, the Wikipedia pages for both go into the concerns….score one for editing in real time.

Finally, here’s an observation effect study I haven’t seen any criticism of that has me intrigued: “Mind Over Milkshakes: Mindsets, Not Just Nutrients, Determine Ghrelin Response” (h/t Carbsane in this post). For this study, the researchers gave people a 380 calorie milkshake two weeks in a row, and measured their ghrelin response to it. The catch? In one case it was labeled as a 620 calorie “indulgent” milkshake, and in the other case it was labeled as a 120 calorie “sensible” shake. The ghrelin responses are seen below:

This is a pretty brilliant test, as everyone served as their own control group. Each person got each shake once, and it was the same shake in each case. Not sure how large the resulting impact on appetite would be, but it’s an interesting finding regardless.

Overall I think the entire subject of how viewing things can change reality is rather fascinating. For the ghrelin example in particular, it’s interesting to see how a hormone none of us could consciously manipulate can still be manipulated by our expectations. It’s also interesting to see the limitations of what can be manipulated. For the Pygmalion effect, it’s found that if the teachers know the kids for at least 2 weeks prior to getting IQ information, there is actually no effect whatsoever. Familiarity appears to breed accurate assessments I suppose. All of this seems to point to the idea that observation does something, but the magnitude of the change may not be easy to predict. Things to ponder.

What I’m Reading: May 2018

I saw a few headlines about a new law in Michigan that would exempt most white Medicaid recipients from work requirements, but keep the work requirement for most black people in the same spot. This sounded like a terrible plan, so I went looking for some background and found this article that explains the whole thing. Basically some lawmakers thought that the work requirements didn’t make sense for people who lived in areas of high unemployment, but they decided to calculate employment at the county level. This meant that 8 rural-ish counties had their residents exempted, but Detroit and Flint did not. Those cities have really high unemployment, but they sit in the middle of counties that do not. The complaints here seem valid to me….city dwellers tend not to have things like cars, so the idea that they can reverse commute out to the suburbs may be a stretch. 10 miles in a rural area is really different from 10 miles in the middle of a city (see also: food deserts/access issues/etc). Seems like a bit of a denominator dispute.

I’ve talked before about radicalization of people via YouTube, and this Slate article touched on a related phenomenon: Netflix and Amazon documentaries. With the relative ease of putting content up on these platforms, things like 9/11 truther or anti-vaccine documentaries have found a home. It’s not clear what can be done about it unfortunately, but it’s a good thing to pay attention to.

I liked this piece from Data Colada on “the (surprising?) shape of the file drawer”. It starts out with a pretty basic question: if we’re using p<.05 as a test for significance, how many studies does a researcher need to run before he/she gets a significant effect where none should exist? While most people (who are interested in this sort of thing) get the average right (20), what he points out is that most of us do not intuit the median (14) or mode (1) for the same question. His hypothesis is that we’re all thinking about this as a normal distribution, when really it’s geometric. In other words the “number of studies” graph would look like this (figure from the Data Colada post):

And that’s what it would look like if everyone was being honest or only had one hypothesis at a time.
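For anyone who wants to check the mean, median and mode for themselves, here is a minimal sketch in Python. It assumes each study is independent, there is no true effect, and each study has a 5% chance of coming up significant anyway. The names `ALPHA` and `prob_first_hit_on` are mine, not from the Data Colada post.

```python
import math

ALPHA = 0.05  # chance any single null study comes up "significant"


def prob_first_hit_on(k, p=ALPHA):
    """Probability that the first significant result arrives on study number k."""
    return (1 - p) ** (k - 1) * p


# Mean of a geometric distribution: 1 / p
mean = 1 / ALPHA

# Median: the smallest k where the chance of at least one hit reaches 50%,
# i.e. the smallest k with 1 - (1 - p)^k >= 0.5
median = math.ceil(math.log(0.5) / math.log(1 - ALPHA))

# Mode: the single most likely value of k (searching a generous range)
mode = max(range(1, 200), key=prob_first_hit_on)

print(f"mean={mean:.0f}, median={median}, mode={mode}")
```

The mode lands on 1 because every additional study requires all the earlier ones to have failed first, so each later value is a little less likely than the one before it.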

Andrew Gelman does an interesting quick take post on why he thinks the replication crisis is centered around social psychology. In short: lower budget/easier to replicate studies (in comparison to biomedicine), less proprietary data, vaguer hypotheses, and the biggest financial rewards come through TED talks/book tours.

Given my own recent bout with Vitamin D deficiency, I was rather alarmed to read that 80% of African Americans were deficient in Vitamin D. I did some digging and found that apparently the test used to diagnose Vitamin D deficiency is actually not equally valid across all races, and the suspicion is that African Americans in particular are not served well by the current test. Yet another reason to not assume research generalizes outside its initial target population.

This Twitter thread covered a “healthy diets create more food waste” study that was getting some headlines. Spoiler alert: it’s because fruits and veggies go bad and people throw them out, whereas they tend to eat all the junk food or meat they buy. In other words, if you’re looking at environmental impact of your food, you should look at food eaten + food wasted, not just food wasted. The fact that you finish the bag of Doritos but don’t eat all your corn on the cob doesn’t mean the Doritos are the winner here.