True Crime Times

I have a Substack piece up today on the True Crime Times: What True Crime Can Learn from the Science of Getting Things Wrong.

This is an abbreviated version of my True Crime Replication Crisis series, albeit aimed towards a true crime audience rather than my usual folks here. I was pleased to see the first comment was from someone who also likes to rant about statistics. If you’re looking for that, I’m always here for you. Please see my Intro to Internet Science series for details.

College Costs Stopped Spiraling About a Decade Ago

When you talk about the economy and the state of young people today, you will almost always hear about how young people are drowning in student debt. I took this statement at face value, until a few months ago when I saw someone make a comment that college cost creep had largely slowed down. I didn’t follow up much more on it until I saw the recent Astral Codex Ten post about the Vibecession data, and he confirmed that the high water mark for student loan debt was actually 2010 (Note that this is debt per enrolled student, so those numbers aren’t impacted by changes in enrollment):

His theory is that 2010 is around when large numbers of people first started maxing out the $31k cap on many government loans, and that that cap hasn’t moved. I think there are two other things going on:

  1. The youth (16-24) unemployment rate was above 15% for about 4 solid years after the 2008 crash: 2009-2013. This is the worst youth unemployment since the early 80s and it actually hasn’t been replicated. The COVID unemployment spike lasted about 5 months (April-August 2020). It dropped back below 15% by September 2020 and was below 10% by June 2021. 4 years of bad job prospects leads to a “well may as well finish my degree/go back to grad school” type thinking in a way 5 months of bad job prospects don’t.
  2. 2012 is generally considered the inflection point for smartphones/internet adoption, and this opened up a lot of low cost options for online college. I don’t have good comparison numbers, but today in 2025 you can get a bachelor’s degree for $10k/year at Southern New Hampshire University, and they helpfully point out some of their “pricey” competitors are $15k/year. Adjusted for inflation, that would be the equivalent of a 2010 student being able to get tuition for about $6800/year. I was not shopping for a degree at that time, but Google tells me you would have paid triple that for my state school that year.

Now you can find websites that say student loan debt is going up, but from what I can find these graphs don’t inflation adjust their data. The complaints about how a $100,000 salary isn’t what it used to be are accurate, but by the same token a $100,000 debt isn’t what it used to be. Looking at the top graph on this website for example, you wouldn’t know that a $17.5k loan in the year 2000 is actually about $34k today, remarkably close to the $35k they say 2025 graduates are taking out.
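To make the inflation point concrete, here is a minimal Python sketch using only the figures quoted above ($17.5k in 2000, roughly $34k in today's dollars, and $35k for 2025 graduates). The function names are my own, and the inflation factor is backed out of those quoted numbers rather than pulled from an official price index, so treat it as an illustration and not a precise calculation.

```python
# Illustrative sketch: all dollar figures are the ones quoted in the post.
# The inflation factor is implied by them, not taken from official CPI data.


def implied_inflation_factor(amount_then: float, equivalent_today: float) -> float:
    """How many of today's dollars one old dollar corresponds to."""
    return equivalent_today / amount_then


def percent_change(old: float, new: float) -> float:
    """Percent change going from old to new."""
    return (new - old) / old * 100


loan_2000_nominal = 17_500   # loan shown for 2000, in 2000 dollars
loan_2000_in_2025 = 34_000   # the same loan expressed in 2025 dollars (per the post)
loan_2025 = 35_000           # loan shown for 2025 graduates, in 2025 dollars

factor = implied_inflation_factor(loan_2000_nominal, loan_2000_in_2025)
nominal_growth = percent_change(loan_2000_nominal, loan_2025)
real_growth = percent_change(loan_2000_in_2025, loan_2025)

print(f"Implied inflation factor 2000 -> 2025: {factor:.2f}")
print(f"Growth without inflation adjustment: {nominal_growth:.0f}%")
print(f"Growth after inflation adjustment: {real_growth:.1f}%")
```

On those numbers, the unadjusted graph shows debt doubling, while the inflation-adjusted change is about 3%.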

Ok, so that’s debt, but what about the sticker price? Well, the College Board puts together a pretty useful report on the topic that shows a few things:

For public universities, we see a slowdown in tuition increases starting about a decade ago, and for private schools the change happens about 5 years ago. For all schools we see the inflation adjusted cost is currently lower than it was a decade ago. But wait, there’s more!

The above graphs are just the sticker price. The College Board also tracks the “real cost” after factoring in grants. Here’s the data for private 4 year colleges:

Note the “grant aid” line, which slowed during the 2008 crash, but then has been ticking upward starting in 2013 and hasn’t stopped. To emphasize, those are grants, not loans. That’s just money off the sticker price. According to the College Board, the net cost of attendance of a college in 2024 was less than in 2006. I won’t keep pummeling you with graphs, but for private 4 year, public 4 year and public 2 year colleges, the “real” cost peaked in 2016.

I am not an economist, but the numbers suggest a pretty clear story to me. When unemployment for those under 25 was high in 2009-2013, going to college, any college, seemed like a good financial move. For many, it probably was. Then, as employment picked up, students were able to get choosier and consider the cost of student loan debt in their choices. Very quickly colleges started upping the amount of grants offered, and then stopped increasing the sticker price. With recent inflation, the price increases actually dropped below the inflation line and now the real cost of college is dropping.

Additionally, technology improvements allowed online schools to start offering cheaper tuition at a large scale. This might have only made a small dent, except then the pandemic happened and traditional campus life was upended. This made the difference between going to a traditional college and an online college much smaller, and based on those I know with kids that age a lot of kids opted for at least a year or two at a cheaper online school rather than pay through the nose to sit in their dorm room all day. This put additional cost pressure on schools, and we see the prices tick down further.

All that being said, there is a not-small group of people who were pretty slammed by college costs: those who were coming of age during the 2008 financial crash and its aftermath. However, most of those people are actually in their late 30s now, and it’s important to note that state of affairs did not persist for those who came after them. Times change, sometimes for the better.

My Favorite Book of the Year: The Age of Diagnosis

As 2025 comes to a close and we careen towards Christmas and giving season, I wanted to put in a plug for my favorite book of the year. The book is The Age of Diagnosis: How Our Obsession with Medical Labels Is Making Us Sicker, and I enjoyed it immensely. I found it originally when Jesse Singal did a Substack post called “Long Covid can be both Psychosomatic and Real”, and immediately forwarded it to my sister (an NP), who promptly got the book and then immediately called me to talk about it. She was annoyed I hadn’t actually read it yet, so I got the book and could see why she was calling. This is a book you want to talk about with people.

The author is a UK neurologist and a skilled writer, and she dares to ask the question “what is the point of diagnosing people with things”. She points out that diagnosis is supposed to be used strictly to inform treatment options, but we’ve completely overlooked the psychological impact a diagnosis can have. She starts with the example of Huntington’s disease, a fatal genetic disease that you can test for and diagnose, but for which there is currently no cure. Prior to the advent of testing, 90% of patients and their families said they would love to have a test. Once one was developed however, the decision to test or not proved a lot harder for people than they had expected.

She goes on to cover many other areas of medicine: COVID, chronic Lyme, autism, ADHD, cancer screenings, and points out repeatedly that there are two ways to be wrong. Missing a diagnosis you could have treated is obviously bad, but giving someone a diagnosis they may not have also carries a risk. It’s that second risk she explores for both physical and psychological illnesses. What does happen if you think you have a disorder that you don’t? Does disorder creep carry a cost? If your diagnosis makes you feel better about yourself but actually doesn’t improve your objective functioning or even worsens it, should it really have been given? Shouldn’t we be, you know, studying some of these questions?

I liked this book because I’ve spent a lot of time in the last 7 years or so thinking about the purpose of diagnoses and what they’re good for. Back in 2019 I wrote about my lengthy journey to getting diagnosed with chronic migraines (they had an atypical presentation at first), and it was a great relief to finally get a name for my issue. However, it still took years to get a treatment regimen that worked, and I still have problems. I also have a new appreciation for psychosomatic illness because the migraines have messed up my sense of pain quite a bit. I now have to let every health care provider I have know that my sense of pain is not a great guiding light, in either direction. I have felt pain in places that appeared to have nothing actually wrong with them, and failed to recognize pain in other places because I thought it was part of the regular pain I have. Not having your senses work predictably is a huge disadvantage in diagnosis, but there are more people this happens to than you think. One highlight of the book was when she notes many people experience psychological pain as physical pain, and get slapped with ever-escalating numbers of diagnoses while trying to treat it. This isn’t good for anyone.

A related read this week was Accommodation Nation in the Atlantic, which points out that over 20% of students at elite universities now have a disability on file. This is a rate far higher than at less elite universities, and the disabilities are primarily autism, ADHD and anxiety. Again, it makes us wonder what a diagnosis is really for. If the best and brightest are claiming to be disproportionately impaired, what are we really looking at here?

What The Age of Diagnosis highlights, sometimes uncomfortably, is that our institutions haven’t caught up to the psychological and social power of a label. In an era where traditional communities seem to be shrinking, we run the risk of allowing diagnoses to take a disproportionate role in the way we define ourselves. Books like this don’t offer easy answers, but they do give us the vocabulary to ask better questions about how we allocate care, how we define impairment, and what we actually want our diagnostic categories to accomplish in a world where they shape so much of public and private life.

There Weren’t Just 2 Scientific Advances that Made the Sexual Revolution Possible, There Were 4

There’s a Bret Weinstein speech going around on Twitter where he makes a comment about how birth control and abortion changed the game around sex, commonly known as the sexual revolution that occurred in the 1950s-1970s. I have not listened to his speech so I have no comment on what he was saying specifically, but in reading some of the comments I was interested that when people discuss “what changed” during the 1950s through the 1970s, they seem to focus on just abortion and birth control on repeat. Even the Wikipedia page for the sexual revolution only mentions these two. Those things absolutely changed behavior, but I think there are two more things that need to be a bigger part of the discussion:

  1. Paternity testing
  2. Antibiotics

Paternity testing started out with blood testing in the 1920s, but hit its stride in the 1960s with HLA testing. Prior to that, you had to use social rules and general vibes to determine paternity. It largely relied on people’s own truthfulness. Prior to paternity testing, marriage was the most surefire way to ensure no one questioned whose kids were whose, but after we got a better method the number of kids born to single moms went from 5% to 40%. You can see that as good/bad/neutral, but that almost certainly doesn’t happen without the ability to identify a father accurately.

As for antibiotics, penicillin was discovered in 1928, but WWII sped up the perfection of antibiotics for treatment of bacterial infections, and widespread public use came during the 1950s. From 1935 to 1968, 12 new classes of antibiotics were launched. Prior to this, basic STDs like syphilis were actually killing people at a rate similar to suicide today:

And that’s just deaths from syphilis, not cases. That figure comes from this analysis, which notes that prior treatment methods may have been as effective, but they were expensive and time consuming, and penicillin just made everything easier. Of course, syphilis is just one of the diseases people were dodging, chlamydia and gonorrhea also would have been issues. Antibiotics changed the game here.

I bring these up not to take any particular stance on any issue, but to point out that the past was very different in ways we don’t often think about. Even if somehow birth control and abortion were wiped off the face of the planet today, antibiotics and paternity testing would still ensure our population level practices around sex were different than they were 100 years ago. Sexual mores were never just about pregnancy, they were also about ensuring you could establish paternity and avoid STDs.

I think this is important for both cultural conservatives and cultural liberals to remember, as at times we can look at the past as either a golden era of morality or a deep pit of oppression. But in prior “moral” eras, a lot of sexual behavior was kept in check by people lying or threatening to lie about true things, and paternity testing stopped that. Conversely, things like religion may never have had quite the level of influence we attribute to them, they were often coping with very real issues around STD control in an era when the medical community couldn’t help much. When those things changed, behavior changed. It’s a good reminder that most social changes have several causes, and are not just related to one thing.

To note: the things I mention above are those I believe had a direct impact on sexual issues in the 1950s-1970s specifically. There are a few other advances that probably changed sexual behavior in a slightly less direct fashion: cars (teenagers could go see each other more easily), at-home pregnancy tests (earlier identification of pregnancy, no doctor needed), mass distribution of porn (TBD), dating apps (thank God I missed that era).

Anything else I missed?

Snip Happens: A Study in Hypothetical Hair Sabotage

Earlier this week, the Assistant Village Idiot tagged me in one of his link roundups:

Off With Her Hair Women tell attractive women to cut their hair. The study’s authors are all female.  I wonder what it is like for women studying female intrasexual competition. Is it harder to get along, or easier? Bethany, you need to get in on researching the women who research women.

I’ll admit I got a kick out of this, in part because I love a good gender study, and in part because I have REALLY long hair. I mostly wear it up, but it’s the kind of hair that people actually say “whoa, I had no idea it was that long” if I take it down. I call it homeschool hair. The last time I wore it down for an extended period of time, someone (who I knew) stopped me and asked if she could take a picture of it. I have no particular attachment to this style, but I actually don’t like haircuts, so here we are.

I hadn’t yet had a chance to dive into the study, when a Tweet popped up on a similar topic:

It actually came to my attention because a few people immediately pointed out that these women were in a no win situation: if they’d told their coworker “she looked like shit” they would be considered catty, but if they tell her it looks good they are intrasexually competitive. Additionally, they were coworkers of hers, not friends, and it’s pretty weird to expect that all women at all moments must be aiding every other woman they know with her appearance. I suppose there’s an option where they could have tried to be pleasant but not endorse the haircut, but that’s a very hard tone to hit correctly and honestly? I’ve also seen plenty of male coworkers say things “looked great” when other males came in proud of some new thing they did/purchased/whatever. Why start conflict with a coworker for no reason?

All of this prompted me to deep dive into this study, to see what they found. Ready? Let’s go!

Study Set Up

So the basic set up of the study is that 200ish (mostly college aged) women were recruited for a series of two studies. In both, they had a series of female faces cropped to the shoulders like this:

The women studied were supposed to suggest how many centimeters (they were Australian) of hair should be cut off. They were given the picture of the woman, an assessment of the hair’s condition and then how much hair the woman was comfortable cutting off. Those last two were a binary: hair condition was either good/damaged and the requested length of cut was either as much as needed/as little as possible. After that they asked women to rank themselves on a few different scales, including one that measured intrasexual competitiveness.

What’s intrasexual competitiveness you might ask? Well, it’s apparently a 12 question measure that asks you stuff about how you feel about those of your gender who might be better than you on some level. The questions they mention are things like asking you to agree/disagree with statements like “I just don’t like very ambitious women” or “I tend to look for negative characteristics in attractive women”. Highly intrasexually competitive women are those who answer that they strongly agree with questions like that.

They hypothesized that women who scored high on this scale might be more aggressive with their recommendations to other women about how much hair they should cut off, under the idea that men like long hair and this would be sabotaging other women who might be competitors to them. And to be honest, this sounds like a pretty plausible hypothesis to me! These are women who just answered a bunch of questions reiterating that they really didn’t particularly like other women, so I would imagine they’d actually end up being meaner to other women than people who disagreed with those statements. It reminded me of someone who recently pointed this out about introvert/extravert tests: they will ask a bunch of people if they like big groups of people, call those who said “no” introverts, and then declare that they found introverts don’t really like parties. I mean, that makes sense! But it does at times seem like most of the sorting already took place before we even got to the study itself. But I digress, let’s keep going.

The Findings

Ok, so the first thing that caught my eye is that the primary finding of the study is that all women, regardless of scale ranking, first and foremost based their haircut recommendations on two things:

  1. The condition of the woman’s hair (those with damaged hair were told to get more cut off)
  2. The hypothetical client’s stated preference (it was followed).

So to be clear, it was found that even women who stated they didn’t much like other women primarily based their recommendations on what was best for the other woman and what the other woman wanted. And it wasn’t even close. Every other effect we are going to talk about was much smaller both in absolute value and in statistical significance. Here’s the graph:

To orient you, the top panel is the recommendations for healthy hair, the bottom is the recommendations for unhealthy hair. As you can see, in general the difference in recommendations based on that condition alone is quite large, around the 2cm (a bit under an inch for us USA folks) range for all conditions. The second biggest impact was what women wanted, which made a difference of about 1-1.5cm in the recommendations. Then we get to everything else.

It’s important to note that despite how this topic often gets introduced, there was no significant effect found based on attractiveness in general. This is notable because like the Tweet above shows, this stuff is often portrayed in popular culture as something “women” do, and we don’t have much proof that it is! They did find an attractiveness effect for the women with healthy hair being judged by regular and highly competitive women, but it went the opposite way: it was actually unattractive women who got the recommendation to cut off more hair. And again, the difference was a fraction of the impact of the other two factors: somewhere between .1-.2cm. For those of us in the US, that’s less than 1/10 of an inch. A quick Google suggests that’s less than a week’s worth of hair growth, and certainly not enough for anyone to notice.

I think it’s good to hammer on this because if I told you someone was out to sabotage you, you might be worried. But if I told you someone was out to sabotage you but they’d first do what was best for you, then follow what you wanted, then would sabotage you so subtly it would be imperceptible to the naked eye…..well, you’d probably calm down substantially. Much like when we see studies like “eating eggs will double your risk of heart disease in your 50s (from .001 to .002 per thousand)”, we need to be careful when we are quoting results like this that find a near imperceptible difference that can be fixed with 5 days of regular hair growth.
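The egg example is a relative versus absolute risk problem, and it can be sketched in a few lines of Python. The numbers below are the made-up ones from that example (not from any real study), and the variable names are my own.

```python
# Hypothetical figures from the egg example in the post, not real study data.
baseline_per_thousand = 0.001   # risk without eggs, per 1,000 people
exposed_per_thousand = 0.002    # risk with eggs, per 1,000 people

# Relative risk: the "double your risk" headline number.
relative_risk = exposed_per_thousand / baseline_per_thousand

# Absolute difference: how many extra cases actually show up.
extra_per_thousand = exposed_per_thousand - baseline_per_thousand
extra_per_million = extra_per_thousand * 1_000

print(f"Relative risk: {relative_risk:.1f}x")
print(f"Extra cases per million people: {extra_per_million:.0f}")
```

Both lines describe the same result: the risk doubles, and that doubling amounts to about one extra case per million people.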

But back to the finding that attractive women didn’t actually get penalized and instead the slight increase in hair cut recommendation was aimed the other direction, the study authors conclude the following:

This suggests that appearance advice may act as a vector for intrasexual competition, and that such competition (in this scenario at least) tends to be projected downward to less attractive competitors.

I will admit that annoyed me a bit, because this means that ANY variation is now considered to prove the thesis. They stated this was ok because there was no active “mate threat”, so they would expect it to go this way, but I will point out if attractive women had been penalized it would have also been considered proof. Having just finished our series on the replication crisis, I will point out that explaining every finding as proving your original thesis is a big driver of non-replicated findings.

Moving on to the second study though, the study authors made a few really smart changes to their set up. First, they provided participants with a picture of a ruler and a credit card up front so they’d actually have a reminder of what different lengths meant. They also changed from using a text box for the answers to “how much hair would you recommend they cut off” to using a Likert scale type set up where you had to recommend a whole number 1-10 cm. I liked that these changes were there because it showed a good faith effort to improve the results. In this condition, they added faces that were considered “average” to the mix and repeated most of the same experiment.

The findings were similar. The biggest variations were based on hair damage and client wishes, with relatively small differences of .1-.2cm appearing across different individual groups. The graph that got the headline though is this one:

This is the graph they used for the title of the study, and it comes from dropping the whole client wishes/hair damage thing and just looking at the overall amount of hair these women suggested be removed for anyone. You will note again the variation across attractiveness levels is .1-.2cm, but indeed the “high” intrasexual competitiveness women recommend more than the other two groups. The highest recommendation is about .8cm higher than the lowest value. That’s about 1/3 of an inch. Not enough for you to visually notice, but still something.
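For anyone who wants to check the unit conversions, here is a small Python sketch that puts the centimeter figures quoted in this post into inches and into days of hair growth. The growth rate is an assumption on my part (the common rule of thumb of about half an inch per month), not a number from the study, and the labels are just my shorthand for the effects discussed.

```python
CM_PER_INCH = 2.54  # exact by definition

# Assumption: hair grows roughly half an inch per 30-day month.
# This is a rule of thumb, not a figure reported in the study.
GROWTH_CM_PER_DAY = 0.5 * CM_PER_INCH / 30


def cm_to_inches(cm: float) -> float:
    return cm / CM_PER_INCH


def days_to_regrow(cm: float) -> float:
    return cm / GROWTH_CM_PER_DAY


# Effect sizes in cm, as quoted in the post (upper ends of the ranges).
effects_cm = {
    "hair condition (damaged vs healthy)": 2.0,
    "client preference": 1.5,
    "attractiveness": 0.2,
    "highest vs lowest group, second study": 0.8,
}

for label, cm in effects_cm.items():
    print(f"{label}: {cm} cm = {cm_to_inches(cm):.2f} in, "
          f"about {days_to_regrow(cm):.0f} days of growth")
```

Under that growth assumption, the .2cm attractiveness effect works out to roughly 5 days of growth, while the 2cm hair condition effect is closer to a month and a half.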

What caught my eye though was that we only really saw variation with the high and low group, which got me wondering how many women were in each category. And that’s where I found something interesting. In the first study, they defined “high” and “low” intrasexual competitiveness as being 1 SD from the mean. Assuming a normal distribution, that would mean about 16% of the sample were in each of the high and low groups, and the remaining 68% were in the average group. For this study though, they changed it to 1.5 SD, which means a little less than 7% of the sample are in each of the high and low groups. Given the sample size of around 250, we’re looking at about 17 people in each of the high and low groups (34 people total) and 216 or so in the average group. By itself that will lead to higher variation in the groups with smaller sample sizes, since an average over 17 people is much noisier than an average over 216. You will note there is very little variation in what the group with most of the participants answered.
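The group size arithmetic here can be reproduced with Python's standard library. This is a sketch under the same assumption used in the text (scores are normally distributed), with a sample size of 250 applied to both cutoffs purely for comparison. The function names are my own, and the last step additionally assumes individual answers are about equally spread out in every group.

```python
from math import sqrt
from statistics import NormalDist


def tail_fraction(cutoff_sd: float) -> float:
    """Share of a normal distribution more than cutoff_sd above the mean."""
    return 1 - NormalDist().cdf(cutoff_sd)


def group_sizes(n: int, cutoff_sd: float) -> tuple[int, int]:
    """Return (people in each extreme group, people in the average group)."""
    per_tail = round(n * tail_fraction(cutoff_sd))
    return per_tail, n - 2 * per_tail


sample_size = 250  # approximate, per the post

for cutoff in (1.0, 1.5):
    per_tail, middle = group_sizes(sample_size, cutoff)
    print(f"Cutoff {cutoff} SD: {tail_fraction(cutoff):.1%} per tail, "
          f"{per_tail} in each extreme group, {middle} in the average group")

# The standard error of a group mean scales with 1/sqrt(group size),
# so smaller groups give noisier averages.
per_tail, middle = group_sizes(sample_size, 1.5)
noise_ratio = sqrt(middle / per_tail)
print(f"Extreme-group means are about {noise_ratio:.1f}x noisier")
```

With the 1.5 SD cutoff this gives 17 people in each extreme group and 216 in the middle, and the averages for the small groups come out roughly 3.6 times noisier than the average for the big group before any real effect is involved.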

My thoughts

So like I said at the beginning, I find this study’s conclusion fairly plausible. The idea that women who specifically state they don’t like other women will give other women worse advice just kind of makes sense. But a few thoughts:

  1. The main findings weren’t mentioned. The title and gist of this study was presented as “intrasexually competitive women advise other women to cut more hair off”, but it could just as easily have been “intrasexually competitive women primarily take other women’s best interest and preferences into account” and it would be just as (if not more) accurate. The extra hair cut is presented as a primary driver of haircut recommendations, but really it’s in a distant third to the other two. This is fine for academic research, but if you’re trying to talk about how this applies to real life, it’s probably good to note that women actually gave quite reasonable advice, with slight variation around the edges.
  2. The absolute value was never discussed. I was curious if the authors would bring up the small absolute findings as part of their discussion, and alas, they did not. The AVI let me know he found the link in Rob Henderson’s post here, and I was amused to find this line one paragraph before his discussion of this study: “This is why reproductive suppression is primarily a female phenomenon. Of course, there have been cases of male suppression (e.g., eunuchs). Or men raiding a village and simply slaughtering all of the males and abducting the women as wives and concubines. But suppression among women is subtler.” If by subtler you mean 2mm of extra hair, then yes. If I had to pick between that and murder and castration, I admit I’m feeling women got the better end of the deal here. If you would keep eating eggs (or whatever other food) that was associated with a tiny increase in cancer, then you probably can’t take this hair cutting study as a good sign of intrasexual competition. How are women sabotaging other women if they are doing so at a level most men wouldn’t notice? I suspect there’s an assumption this effect is magnified in real life, but again, this study doesn’t prove that.
  3. Motives are assumed. Much like in the critiques of the Tweet above, I noticed that throughout the paper the authors explained why targeting attractive women, average women and unattractive women would all be intrasexual competition. What I did not see was any attempt to consider non-intrasexual competition reasons. Maybe people suggest unattractive people cut more hair off because they think they should try a different look? Maybe scoring high on an intrasexual competition survey is an indication of aggressiveness, and aggressiveness correlates to more aggressive hair cutting? Unclear, but I will note the idea that all variances could only be explained by intrasexual competition surprised me, particularly when we’re discussing effects that are likely too subtle to be spotted by the opposite sex.
  4. We don’t know this is a female only phenomenon. Despite Rob Henderson’s claim above, you will be unsurprised to hear no one (that I could find) has ever done this study on men. I actually would have been interested to see that study, even if it was men making suggestions for female hair. One reason I’d like to see this is because I heavily suspect men would be somewhat more erratic in their rankings, which would actually increase the risk of spurious findings. Frankly, it would amuse me to watch people have to explain why their statistically significant findings were still meaningful, or to have to admit sometimes that just happens and it doesn’t mean anything at all. But still, we’re told constantly that “subtle” sabotage is a woman thing, but I actually couldn’t find any studies suggesting people were looking at this. Might be interesting.

Ok, well that’s all I have! Thanks for reading, and I’m going to go consider cutting my hair an amount no one will notice, just for fun.

The True Crime Replication Crisis Part 8: Consequences

Well we’ve reached the end of the road here folks, and it’s time to wrap things up with some conclusions and consequences. As I mentioned in the first post, I’ve been loosely following the Wikipedia entry on the replication crisis, and I’d like to point out the first paragraph of its consequences section (bolding mine):

When effects are wrongly stated as relevant in the literature, failure to detect this by replication will lead to the canonization of such false facts.[195]

A 2021 study found that papers in leading general interest, psychology and economics journals with findings that could not be replicated tend to be cited more over time than reproducible research papers, likely because these results are surprising or interesting. The trend is not affected by publication of failed reproductions, after which only 12% of papers that cite the original research will mention the failed replication.[196][197] Further, experts are able to predict which studies will be replicable, leading the authors of the 2021 study, Marta Serra-Garcia and Uri Gneezy, to conclude that experts apply lower standards to interesting results when deciding whether to publish them.[197]

So overall we find that in science, where highly educated PhDs have professional reputations and institutional affiliations built on truth:

  • False facts end up being canonized
  • Less reliable studies get more attention
  • Even when findings are formally challenged, they will continue to be repeated as true with almost no one mentioning they were called into question
  • Standards are lower for anything surprising or interesting

Do we really believe that YouTubers and TikTokers are actually more reliable than this, while they compete for nothing but attention? I hate to beat a dead horse, but papers can get retracted, colleges can investigate you, and you can sink a career in academia. Maybe not often, but the odds are certainly better than those of even a mainstream journalist actually losing a defamation case. Science is set up to self police, maybe not as well as it should be, but there are mechanisms. True crime documentaries and podcasts are set up to entertain, and there are no mechanisms to self correct outside of a person getting aggravated enough to file a lawsuit against you. So it is very likely that:

  • Some portion of what you believe you know about popular cases is flat out false
  • The most popular cases will have more incorrect facts floating around than the “boring” cases
  • Even when things are proven to be incorrect, they will not stop circulating as fact
  • Standards are lower for anything surprising or interesting

So what do we do?

Well, it’s actually not straightforward. Because of the apparatus around science, it’s been straightforward to propose changes. Change hasn’t always come fast, but it has been progressing. True crime has no such oversight, so any change will be a challenge. However, I think the things I used to bring up in my Intro to Internet Science Course still all apply here. I broke down the things to watch for into 4 categories: Presentation: How They Reel You In, Pictures: Trying to Distract You, Proof: Using Number to Deceive, and People: Our Own Worst Enemy. I think those still all apply here, with just a few tweaks.

  1. Presentation: How They Reel You In. A high production value documentary is not the same as an honest documentary, and a lengthy series on a topic does not mean people didn’t leave anything out. Be skeptical of things, no matter how glossy or voluminous.
  2. Pictures: Trying to Distract You. In the stats and data world, graphs are often used to catch people’s eye and give them the immediate visual impression something is happening before they’ve had a chance to read anything. In true crime, this is often what the victims or the perpetrator look like, immediately playing on tropes of who we think commits crimes or which victims get our sympathy. Be skeptical of anything that focuses on the good looking, wealthy or college educated to the exclusion of others. Additionally, watch for any attempt to immediately invoke another case or movie in the current case, which will prime you to skip actual facts in favor of an “I know this type of person, they do X”. When our local case hit national media, one of the first things one of the main people did was to start citing a popular movie filmed in the area almost 20 years ago, based somewhat on events that had occurred 20-30 years prior to that. The attempt to evoke specific imagery was clear.
  3. Proof: Using Number to Deceive. While numbers aren’t always at play in the true crime world, evidence certainly gets kicked around pretty often. But just like numbers, out of context evidence is often worse than useless and extremely misleading.
  4. People: Our Own Worst Enemy. We bring our biases to every case, and some narratives will be more palatable to us than others. Be careful with people who bring cases in to make a “bigger point” or anything that seems a little too outrageous or focuses on extremely unusual types of crime. It’s also good to look back on early reporting and see if what got you into the case held up, and to actually take it into account if it didn’t.

To all of this, I’d add two more points. The first is that a surprising number of people tell me that true crime is fine playing fast and loose with the facts as long as it challenges the police, because the state has more power. This is of course how our whole justice system is set up, but I think it falls rather flat. In science we are taught that there are both type 1 errors (false positives) and type 2 errors (false negatives) and that both carry consequences. This is also true in the criminal justice system. Blackstone’s principle says that it’s better that ten guilty persons escape than that one innocent person suffer, and that is what we build our system around. But this doesn’t mean there are no consequences to a guilty person going free. The obvious first issue is that they offend again, and that we will then also be upset that nobody stopped them. But this is a natural consequence of “it’s never bad to let the accused go”, and we can’t have it both ways. A recent Twitter thread highlighted this from a victim’s perspective, as she recounted both the emotional toll of testifying against a stranger who assaulted her and then watching him get let go repeatedly, only to continue assaulting other women. The other issue of course is that if you have a justice system that never finds anyone guilty, people take things into their own hands. It’s commonly noted that the mafia initially gained power with immigrant Italian communities because the police wouldn’t investigate crimes against them, and the same is true of newer gangs. Likewise, the Old Testament is riddled with references to the sin of denying justice. Even if you’re not religious, it’s good to flag that unpaid-for crimes have been considered a socially destabilizing force for thousands of years.
Playing fast and loose with the truth about government actions is not a victimless crime just because they have power, as people typically find when their particular group falls out of favor in the court of public opinion.

And finally, I want to give a mini rant about why this topic bothers me so much. Watching a case up close and personal like this, I was stunned and appalled at how many people seemed to completely miss that this case was, for many people, one of the darkest moments of their life even before the internet was involved. Watching people turn that into their own personal whodunit/reality TV show was horrifying. People talked about the various people like they were merely characters in a movie, like you could say horrifying things about them with no consequences. I didn’t know these people but I do see many of them frequently, and the pain on their faces was visible. None of this was fun. None of this was asked for. We’re in a time when we have blockbuster documentaries about how exploitative reality TV was, so it’s bizarre to me that so many people are excited to tune in to stories about people who never volunteered for this. While errors in scientific publishing can erroneously impact how we view the world, errors in true crime reporting can irreparably ruin lives. The first one may sound worse, unless you’re the target of the second. Power posing failing to replicate hurt a few self-help gurus’ talks. Thousands of people falsely accusing someone of murder is something you probably never recover from. Consume media that reminds you that everyone involved, whether accused or victim, is a human.

Thanks for reading folks.

The True Crime Replication Crisis Part 7: Random Other Issues

Ok folks, so we’re nearing the end of our Wikipedia list of issues, so I’m at the point where I don’t know what to call this one. We have a bunch of random issues I’ll run through in order. Ready? Let’s go!

Context Sensitivity

In scientific study, context sensitivity refers to the idea that the same study performed under two different sets of circumstances might yield different results in ways people didn’t expect. This seems somewhat obvious when you say it directly, but often isn’t actually on people’s minds when they are reading a study. I have actually covered this a LOT on my blog over the years, as often people will make huge claims about how men or women (in general) view marriage, and you’ll find out the whole study was done on a group of 18 year old psychology students who are almost certainly not married or getting married any time soon. Zooming out, there’s a big criticism that most psychological research is done on “WEIRD” people, meaning Western, Educated, Industrialized, Rich and Democratic. What we consider settled science around human behavior may not be so settled if you include people from wildly different countries and contexts.

So how does this apply to true crime? Well, just like when I look up a paper the first thing I do is go to the methods section to understand the context in which the data was collected, I think the most important thing in a true crime story is to understand the big picture of where and how things happened. As I mentioned previously, true crime cases are often really unusual cases, so it’s important to flag that any abnormalities will be heightened substantially. A few questions: how much crime is in the area in general? Were there any unusual events changing people’s behavior? True crime often goes over this stuff, but I’ve noticed some cases breeze through contextualizing things or don’t acknowledge that unusual circumstances might change people’s behavior.

The other odd context thing is that a lot of people seem to think that because a case became well known later, the initial investigators should have been thinking from the get go how things would look on Dateline. Unfortunately most investigators/witnesses/defendants don’t have the luxury of knowing in the first 24 hours that people will be reviewing their actions for decades to come. If the case is OJ Simpson? Well yes, you should be prepared for that. If the case is JonBenét Ramsey? You should give them some grace for not predicting the firestorm. Context matters.

Bayesian Explanation

This is similar to some of the statistical concerns I mentioned last week, but basically if you have a “surprising” result and a low powered study, Bayes’ theorem suggests you will have a high failure to replicate rate. Bayesian statistics can be a powerful way to think through this, because they force you to consider how likely you thought something was before you ran your study, which can help you put your subsequent results in context.

So what’s the true crime equivalent? Well, I think it’s actually a good reminder to put all the evidence in context. Here’s an example: imagine a police department (or podcaster) believes a suspect is guilty mainly because they failed a polygraph. The polygraph has a low ability to detect real guilt (low power) and many innocent people fail it (high false-positive rate), and the prior likelihood that this particular person committed the crime is low. Even though the polygraph result says “guilty,” it does not mean there is a 95% chance they did it. Just like a weak psychological study, a “positive” polygraph doesn’t reliably tell you whether the hypothesis is true or whether the result will replicate.
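To put rough numbers on the polygraph example, here is a minimal sketch of the Bayes' theorem calculation. The prior, the detection rate and the false-positive rate below are made-up values chosen purely for illustration, not measured polygraph statistics, and the function name is my own.

```python
def posterior_guilt(prior, sensitivity, false_positive_rate):
    """Return P(guilty | failed polygraph) using Bayes' theorem.

    prior               -- how likely guilt seemed before the test
    sensitivity         -- P(fail | guilty), the test's "power"
    false_positive_rate -- P(fail | innocent)
    """
    guilty_and_fail = prior * sensitivity
    innocent_and_fail = (1 - prior) * false_positive_rate
    return guilty_and_fail / (guilty_and_fail + innocent_and_fail)


# Hypothetical inputs: a 5% prior, a test that catches 60% of guilty people,
# and 15% of innocent people failing anyway.
p = posterior_guilt(prior=0.05, sensitivity=0.60, false_positive_rate=0.15)
print(f"Chance of guilt after a failed polygraph: {p:.1%}")
```

With these assumed inputs the failed test moves the suspect from a 5% prior to roughly 17%, which is a real update but nowhere near certainty. Plug in a different prior and the answer changes a lot, which is the whole point.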

This can be reapplied to all sorts of evidence, and should be, particularly when you have one piece of evidence that flies in the face of the rest of them. We even have a legal standard for this: circumstantial evidence, which can only be let in under certain circumstances. However in true crime reporting, a lot of circumstantial evidence is treated as extremely weighty, regardless of how discordant it is with everything else. You have to be honest about the prior probability or all your subsequent calculations are going to be skewed.

The Problem With Null Hypothesis Testing

This is a somewhat interesting theory, based on the idea that null hypothesis testing may not be appropriate for every field. For example, if you are testing whether or not a new drug helps cure cancer, you want to know if it has an effect or not. Pretty simple. But with a field like social psychology, human behavior may be too nuanced to have a true yes or no question. Running statistical tests that suggest there is a clear yes/no might end up with unreliable results because the whole set up was inappropriate for the question asked.

In true crime, this reminds me of people using legal standards as though they are moral standards or everyday standards we might use. For example, a person accused of rape may not be convicted under a reasonable doubt standard, but that doesn’t mean that you’d be ok with them dating your daughter/sister/friend. In murder cases, even when the police get things wrong they often had a good reason to start believing people were guilty. Drug or alcohol use can make people look suspicious, lying up front to the police can make you look suspicious, prior similar convictions can make you look suspicious, etc etc. I’ve seen a strong tendency for people to decide that whoever they favor is blameless (null hypothesis = absolutely nothing wrong), but as we covered last week a lot of people mixed up with legal trouble have something working against them.

Base Rate Fallacy

I’ve written about the base rate fallacy before, and it can be a tricky thing to overcome. In short, the base rate fallacy happens when something is extremely uncommon and you use an imperfect method to try to find it. For example, if you use an HIV test to test a thousand random people in the US for HIV, we know that 3-4 might have it. If you are using a test that is 99% accurate but has a 1% false positive rate, that actually means more people (10) will get a false positive result than a true positive result. When the frequency of something is low, false positives become a much bigger problem. In publishing, the theory is that previously unnoticed phenomena are getting rare, so surprising findings are increasingly likely to be false positives.
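The arithmetic behind that HIV example can be sketched in a few lines. This assumes a prevalence of 3.5 per thousand (the midpoint of the "3-4" above) and reads "99% accurate" as a 99% detection rate; both readings are my assumptions for illustration.

```python
def screening_outcomes(n_people, prevalence, sensitivity, false_positive_rate):
    """Expected true and false positives when screening a random group."""
    infected = n_people * prevalence
    true_positives = infected * sensitivity
    false_positives = (n_people - infected) * false_positive_rate
    return true_positives, false_positives


# Assumed inputs: 1,000 people, 0.35% prevalence, 99% detection, 1% false positives.
tp, fp = screening_outcomes(1000, 0.0035, 0.99, 0.01)
ppv = tp / (tp + fp)  # share of positive results that are real
print(f"True positives: {tp:.1f}, false positives: {fp:.1f}")
print(f"Chance a positive result is real: {ppv:.0%}")
```

Under these assumptions only about one positive result in four is real, even though the test itself is very good. Raise the prevalence (for example, by testing only people with known risk factors) and the same test suddenly looks far more trustworthy.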

So how does this apply to true crime? Well, it’s a little hard to make a clear comparison, because so many crimes have unusual things happening by default. To take OJ Simpson as an example, it’s unusual for a celebrity of his stature to be accused of a crime. However, it’s also pretty unusual for a celebrity’s ex wife to end up dead like his did. Our base rate doesn’t totally work because we actually know something weird has happened. This is where we have to get back to judging people by evidence, not statistics.

However, in the broader scheme of true crime content, I think it’s good to note that the demand for new cases is currently exceeding the supply. As we’ve continued to cover, people want attractive articulate defendants with “interesting” cases, and we just don’t have that many of them. This creates a vacuum where people are very incentivized to make their cases “interesting” enough for true crime podcasters to pick up on. This is challenging because overall the murder rate in the US is down substantially from the 80s and 90s, so we have fewer current cases to draw from.

Alright, that’s all I have for this week. I’ll be looking to wrap up next week with a few lessons learned and thoughts. Thanks all!

To go to part 8, click here.

The True Crime Replication Crisis: Part 6 Statistical Errors

Welcome back folks! This week we’re still talking about true crime, and I’m going to cover some statistical errors and how they relate to the cognitive errors we see being made when we discuss true crime stories. Before I get to that though, I want to touch on a point made in the comments last week. David brought up that a good example of a fraudulent case that gained traction was the Duke Lacrosse rape accusation, which was ultimately found to be a false accusation. Many people continued to cling to it long after the evidence turned because they believed it was “an important conversation”. This sounds silly, but in the phenomenal “Toxoplasma of Rage” essay by Scott Alexander over at Slate Star Codex, he points out the following:

The University of Virginia rape case profiled in Rolling Stone has fallen apart. In doing so, it joins a long and distinguished line of highly-publicized rape cases that have fallen apart. Studies sometimes claim that only 2 to 8 percent of rape allegations are false. Yet the rate for allegations that go ultra-viral in the media must be an order of magnitude higher than this. As the old saying goes, once is happenstance, twice is coincidence, three times is enemy action.

The enigma is complicated by the observation that it’s usually feminist activists who are most instrumental in taking these stories viral. It’s not some conspiracy of pro-rape journalists choosing the most dubious accusations in order to discredit public trust. It’s people specifically selecting these incidents as flagship cases for their campaign that rape victims need to be believed and trusted. So why are the most publicized cases so much more likely to be false than the almost-always-true average case?

Scott goes on to hypothesize why this is: basically we are attracted to controversial stories because they allow us to signal our beliefs about different topics. I tend to believe he’s on to something, but for purposes of this series I want to emphasize his point that cases that get talked about are often more likely to contain extreme deception than regular every day cases. We have no reason to believe this is limited to rape cases, and every reason to believe that stories that grab headlines are uniquely unreliable.

Alright, with that out of the way, let’s move on to some stats issues!

Low Statistical Power

One issue that has likely contributed to the replication crisis is that many studies lack statistical power, which basically means a study doesn’t have enough data to reliably detect real effects. This basically makes the findings unstable, so when you repeat the study, the result might not appear again. Adequate statistical power is dependent on a few things, including sample size and the size of effect you’re looking to detect. For example, if you want to understand height differences between adult men and women, you might need a decent group before you can accurately say if the difference is 3 inches or 5 inches. If you’re looking at the height differences between adults and 5 year olds however, you’re going to need a much smaller group to establish there’s a huge difference. The smaller the effect size, the more people you need to reliably see what’s happening.
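For a feel of how fast the required sample grows as the effect shrinks, here is a rough sketch using the standard normal-approximation formula for comparing two group means. The effect sizes (expressed in standard deviations), the 5% significance level and the 80% power target are conventional illustrative choices, not numbers from any particular study.

```python
import math
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate people needed in each of two groups to detect a
    difference of `effect_size` standard deviations (two-sided test,
    normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_power) ** 2 / effect_size ** 2)


# Hypothetical effect sizes: a subtle difference vs. an enormous one.
for d in (0.2, 0.5, 2.0):
    print(f"effect size {d}: about {n_per_group(d)} people per group")
```

The subtle effect needs hundreds of people per group while the enormous one needs only a handful, which is the adults-versus-5-year-olds intuition in numeric form.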

So how does this apply to true crime? Well, as I pointed out in part 2, most popular crime stories are highly unusual. While they are often things we deeply fear, they are almost always things we have no experience with. Given this lack of data, we have almost no basis for deciding what’s normal/abnormal, and yet we do it anyway! It’s a running joke on social media that every time a new subject comes up, people immediately switch from being infectious disease experts to nuclear war experts to trade agreement experts, etc. True crime is an extension of that, with people who have never experienced any part of the justice system loudly opining about what should or shouldn’t have been done. In the rush to get press coverage, I also noticed a lot of experts who did have experience in related fields would often comment on cases without actually having read all the details. I also consider this a lack of statistical power: all the general knowledge in the world doesn’t help if you don’t actually know the specifics of the case you’re talking about.

Positive Effect Size Bias

Also known as the decline effect, this is the phenomenon where studies initially find a large effect size that keeps getting smaller with each subsequent study. A classic example is medications, which often appear to work extremely well when they’re first rolled out, only to be much less impressive when studied after a few years.

I have seen this in a lot of true crime cases, where initially you are told “oh hey, you have to look at this absolutely CRAZY case they cover in this documentary”. If you look at the other side though, you gradually discover most of the things that hooked your attention are a lot more nuanced than they appeared. In our local case, there was one article that sparked all the interest and several years later someone went back and fact checked it. They estimated about 75% of it was proven incorrect and often laughably inaccurate. Bizarrely, people who got interested in the case didn’t seem to care that the thing that hooked them was so unreliable, they had simply moved on to new claims. Regardless of what you think happened in some case, it’s good to note when claims don’t hold up and not simply move on to new claims.

Problems of Meta-Analysis

One guardian against the replication crisis was supposed to be meta-analyses, which take a lot of studies on the same topic and analyze them together. One issue with this is that one bad study can “infect” the whole meta-analysis, so even lumping a whole bunch of studies together doesn’t always help. If you get one 6’2″ basketball player in your female height sample, it’s going to take a while for that average to come back to normal. Another issue is that if the hypothesis is wrong, you are not going to get studies with a strong effect in the opposite direction to balance things out, you are going to get studies that cluster around zero. Again, this means it will take a LOT of studies to show the real effect size.
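The basketball player example is easy to check with a toy calculation. This sketch assumes every other woman in the sample is exactly 64 inches tall, which is a simplification I am making for illustration (real samples vary, and real meta-analyses weight studies rather than taking a plain average).

```python
def mean_with_outlier(n_typical, typical=64.0, outlier=74.0):
    """Average height (inches) of n_typical people at `typical` height
    plus one person at `outlier` height (6'2" = 74 inches)."""
    return (n_typical * typical + outlier) / (n_typical + 1)


# How many ordinary observations does it take to wash out one outlier?
for n in (9, 99, 999):
    print(f"{n + 1} people: average {mean_with_outlier(n):.2f} inches")
```

With 10 people the one outlier drags the average up a full inch, and it takes around a thousand observations before the distortion shrinks to a hundredth of an inch. One bad study in a small meta-analysis behaves the same way.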

So how does this work in true crime? Well, I actually think meta-analyses are the worst thing that can happen to a true crime case. Our justice system is supposed to be based on individual facts, not on group dynamics. This gets argued a lot with racial profiling, but perhaps my favorite example is family criminality. Crime is highly heritable, and yet our justice system doesn’t let your family history in to court, and for good reason. The foundation of our justice system is that you are supposed to be judged as an individual based on evidence, not on “well this would make sense”. True crime on the other hand is rife with this type of commentary. The police are always like this, people in small towns are like this, white rich kids are like this, etc etc etc. I actually am not very against stereotypes as a first step, but stereotypes are not evidence. If the evidence starts to contradict your stereotype, you may want to consider that someone might have been attempting to evoke exactly that stereotype to get you to override your reason.

P-hacking

I covered p-hacking back in part 4, where we talked about the idea of looking through tons of data for “surprising” connections. In both research and true crime, the more data you take in, the more likely you are to find connections that may or may not be meaningful. I did want to emphasize one more part of this though, something I’ll call “narrative hacking”. If p-hacking is when you overinterpret random connections, then narrative hacking is selectively including or emphasizing details, interpretations, or coincidences until a desired emotional or moral conclusion “feels significant”. As I said to someone when talking about my local case, “some of what they complain about is real, some of it is just normal stuff said in a scary voice”. Selective interpretation of events is a normal human trait, and trying to make mundane things sound significant is a key trait of anyone trying to hook you on a story. Suddenly “weirdly, he never left the house all day” is said just the same as “oddly, he only left the house once that day” and “bizarrely, he left the house multiple times that day”. It’s good to be alert for when a narrator is emphasizing details that really aren’t that interesting.
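The "more data, more connections" point has a simple formula behind it. This sketch assumes each comparison is independent and uses the conventional 5% false-positive threshold; real datasets (and real case files) are messier, so treat it as a rough illustration only.

```python
def chance_of_spurious_hit(n_comparisons, alpha=0.05):
    """Probability that at least one of n independent comparisons looks
    'significant' purely by chance."""
    return 1 - (1 - alpha) ** n_comparisons


# The more places you look, the more likely something looks meaningful.
for n in (1, 5, 20, 100):
    print(f"{n:>3} comparisons: {chance_of_spurious_hit(n):.0%} chance of a fluke")
```

By 20 comparisons a fluke is more likely than not, and by 100 it is nearly guaranteed. A ten-part series combing through every detail of someone's life is making a lot more than 20 comparisons.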

Statistical heterogeneity

Statistical heterogeneity means that different studies of “the same” effect actually vary in methods, samples, measures, or contexts. What this means is that when you try to replicate a study, you can run in to the issue of changing something that actually was important to the study. For example, you might find an effect in a study done on all men that disappears if you add women to the sample, or a study on college students that doesn’t replicate to senior citizens. Sometimes slight wording in questions can radically change answers, etc etc. This can actually be an important issue to note, because sometimes it can show a previously hidden factor was influencing the original results.

In true crime, similar inputs do not always yield similar outputs. Two missing child cases can have very different reactions from parents, not because one is lying and the other isn’t, but because there’s a huge range of possible reactions to a horrible situation. This is somewhat akin to what I said above about overgeneralizations. There’s a huge range of crimes, contexts, and individuals involved, and even in a perfect system that would produce a huge range of human behavior. Trying to “follow” unusual tragic cases may lead to false confidence in your conclusions.

Alright, I think that’s all I have for today, tune in next week for what I’m hoping might be my last post before the wrap up, depending on how long winded I get. It’ll be fun!

To go straight to part 7, click here.

The True Crime Replication Crisis Part 5: Fraud

This week I have to say, we are getting to one of my favorite topics: straight up fraud. Prior to this we have covered a lot of things that can skew the thinking of otherwise good people despite their best efforts, which is the vast majority of issues we run in to, but today we’re going to cover those who intentionally deceived others. Even in the context of the replication crisis, straight up fraud cases make up a very small percentage of the concerns about research findings, but they are still worth focusing on as a potential trouble source.

Before we get started though, I want to mention a somewhat weird thing I’ve noticed over the past few years. I’ve noted that very often when it comes to research, people are often very quick to call human error and/or bias fraud, and then often too slow to call actual fraud, fraud. I have wondered why this is, and my suspicion is that it’s because well intentioned humans who make errors are often very ashamed and may not defend themselves as vigorously, whereas straight up fraudsters are extremely prepared to be challenged and are prepared to be aghast you would ever suspect them of anything. Thus the well intentioned error people seem a lot more “guilty looking” than the fight to death fraudsters.

So with that in mind, let’s talk about what fraud is and isn’t. Fraud is not making a mistake, even if it means you have to retract your study. Admitting you got it wrong and owning up to it is exactly what we want researchers to do. Fraud is also not publishing a faulty study you didn’t question rigorously enough because it matched your pre-existing beliefs. At this point most of us have accidentally shared a link to a story that turned out to be false because it just “sounded true”. Fraud isn’t even necessarily only publishing certain outcomes in a study and failing to publish others. Many of these things can teeter towards fraud depending on the circumstances, but most people in their day to day lives will occasionally jump to conclusions or tell stories in ways that benefit them. It’s not a great human weakness, but it’s one we see often. So if those things aren’t fraud, what is? Well in the research world one of the main examples is data falsification. From making up numbers to pretending to have done experiments that never happened, this is an unfortunate reality of some research and it’s often only through replication efforts that this gets uncovered.

The wildest example in the research world is actually fairly recent, the sordid tale of Francesca Gino. Gino was a Harvard Business School professor who, amazingly, specialized in “honesty and ethical behavior” research. Back in 2020, a graduate student raised concerns about one of her papers, and then tried to replicate it in 2021. She became suspicious that not only did the study fail to replicate, but the whole set of results seemed wildly implausible. She got some data bloggers involved, and things spiraled from there. To condense a very long story, Gino was eventually put on leave and ultimately her tenure was revoked and she was fired.

What’s interesting, given my second paragraph, is that this all came to the attention of most people because Gino sued both Harvard and the Data Colada bloggers for $25 million saying they were all defaming her. It was actually her own lawsuit that caused Harvard’s internal investigation of her to be released, which made her look incredibly bad. She has alternated between claiming she was the victim of sexism, that it’s all a big mistake and that she was framed. Her coauthors on the other hand, started a website to investigate all of the papers they’d worked with her on to make sure they knew which findings were reliable and which weren’t. While I will note that Gino has defenders still, it’s an interesting story of defensiveness in the face of accusations.

So how does this relate to true crime?

Well, I’d imagine much of the connection would be obvious, but I’d like to point out that in true crime we actually know pretty much from the get go that someone is actually straight up lying. In scientific research, fraud is always a possibility, but probably not more so than in regular human endeavors. It reminds me of the old stats 101 type problem, where you calculate things like “given that the child is a boy, what are the chances his name is John” vs “given that the child’s name is John, what are the chances that he is a boy” and they highlight those are wildly different answers. Here it’s the difference between “given that a scientist published a paper, what are the chances there is fabricated data?” and “given that a bunch of suspects have given different incompatible stories so someone is lying, what are the chances person X is lying”. Why do I point this out? Because as I mentioned at the beginning of this series, for some reason the average person I talk to is more open to hearing that a research study they heard about is wrong than they are to hearing the new true crime podcast they’re listening to is. This makes no sense because crime stories are almost by definition full of liars. One of the first types of lies little kids tell is lies to get themselves out of trouble. If you have even a passing familiarity with the Biblical story of the Garden of Eden, you’ll know that it’s alleged that the first crime humanity itself committed was to attempt to shift the blame for eating the apple. Lying about this stuff is as innate to human nature as it gets. So again, why are we so resistant to being skeptical about these stories when someone puts them on a podcast?

I think there’s a few things skewing our thinking with these. The first I think is that crimes tend to involve a lot of human error from the get go. Witnesses often don’t have the best memories of times/dates/sequences of events, so any attempt to call someone a liar has to be tempered with the frailty of human memory. Additionally, in many crimes, victims are purposely selected because they have pre-existing credibility issues, making things even harder to sort through. In the documentary about the fraudulent results from the Massachusetts state crime lab, a defense attorney notes that the number one risk factor for being falsely accused of a crime is already having a criminal record. Two fraudsters in two different labs got away with filing false drug test results for years in large part because the results mostly impacted known drug dealers.

Interestingly, this applies to any group who comes under fire. I don’t think it’s coincidental that the true crime genre exploded in popularity around the same time as George Floyd/Black Lives Matter gained steam, as “police framed/justice system railroaded innocent person” is perhaps the most popular true crime storyline. Just like having a criminal history does not make you automatically guilty of a crime, police having issues also does not negate the fact that nearly every defendant claims to be framed. There’s actually been some interesting discussion of this in defense attorney circles, with some attorneys arguing that all media that draws attention to the flaws of our justice system is useful, and some maintaining that this type of infotainment does more harm than good. Scott Greenfield, my personal favorite defense attorney/blogger, falls in the latter camp. For this post I went looking for his thoughts on True Crime and was interested that in the years since Serial debuted, he’s gotten even harsher than his initial skepticism. I’d recommend the whole thing, but I love his first three paragraphs that he wrote back in 2023 (bolding mine):

After the podcast Serial became a hit, the phone started ringing. The calls were from journalists, producers, wannabe podcasters, asking whether I had any cases involving a clearly innocent defendant who was abused by the system and ended up convicted and serving a lengthy sentence. Well, of course I did. We all do. But as it turned out, that really wasn’t the story they were interested in.

What they really wanted was a sympathetic defendant, the sort of innocent person people could love, and a simple, clear story of misconduct and abuse that ended with imprisonment. This was where I made the mistake. I had no stories like that, as few defendants were up for beatification before being charged with murder, and while there were arguments for the defense, and complex, messy problems along the way, it wasn’t as if the prosecution didn’t have a case to show they committed the murder.

The sort of post hoc contentions, like witnesses who recanted after they had nothing on the line or jailhouse snitches who say their cellies confessed to them, that true crime producers adored and thought critically valuable were the sort of things judges laughed off, as did I. People lie, all the time, for all sorts of reasons. Why is a post-trial recantation more credible than sworn trial testimony? Defendants bought witness silence or post-trial recantations on occasion. They often claimed innocence all along, even though they were guilty as sin. That’s the nature of criminal defense.

This is a man who makes his living in criminal defense pointing out the rather obvious fact that very few people get to trial without some pretty good evidence that they did it, and that people are constantly lying. If someone claims they lied under oath but now are telling the truth on a podcast, you may want to mark that person down in credibility. So what do we do here? Well, I think we have to approach these things with a huge eye towards fraud, both for the defendant and the podcaster. A few thoughts:

  1. Compare the story being told to different sources/established facts: I’ve said it before but I’ll say it again, before you start any documentary or podcast, look for a summary of the facts so you can tell if something’s being left out. Remember that every single person involved, from the defendant to the witnesses to the podcaster, is highly motivated to make themselves look as good as possible. It’s also good to note that wanting your story to be public in and of itself is not a sign of honesty, see my prior comments about Francesca Gino being the one to get her own damning internal investigation released. Some people truly believe facts make them look better than they do.
  2. Beware of emotional investment, your own or others’: Over a 10-episode podcast series, you can feel you get to know the host/the subject/whoever, which can lead you to overattribute credibility to them and become less skeptical as time goes by. By the time you finish it can feel mentally awkward to consider someone you’ve come to like a liar. This goes double for podcasters by the way, especially if they got exclusive access to some of the players in the case. I have a rule of thumb that when someone covers a controversial case and interviews someone extensively, then starts hemming and hawing about their opinion while saying “I guess we’ll never know”, they think the person’s guilty. With my local case, we had at least one documentary filmmaker admit that’s exactly what they did. The burden of highlighting someone’s case just to condemn them is too much for some people.
  3. Beware of applying big picture thinking to individual cases: We live in a world where people get raped. This does not mean every individual rape accusation is true. We live in a world where people falsely accuse others of rape. This does not mean every single claim that something is a false accusation is true. Unfortunately, there’s an odd thing that happens with true crime I’ll call “true crime as thought experiment” where people use a true crime case as a stand-in for a bigger issue. This can work in the research world, where research that suggests something similar to prior findings actually can be considered more credible than novel research. But in a crime case? The facts of every single crime still matter on their own. Once a case gets big enough though, a surprising number of people will claim the exact details don’t matter because we need a “bigger conversation”, but good lord, imagine if it was you stuck in the middle of things? If your loved one was murdered and someone else decided to fudge some details and portray the murderer sympathetically because they wanted to make a “bigger point”? You’d hate it, we all would. Always consider there are real humans at the center of things, and ask if they signed up to be your morality tale.
  4. Remember, people do in fact just make things up and people have been hurt by it: One weird thing I’ve noticed with some true crime assessments is that people will try to play “fair” and give everyone equal credit, like they all are lying a little bit. I think this comes from our natural instinct when we’re adjudicating arguments in our personal lives. If we have two friends in a fight and we hear both sides, our instinct tends to be to split the difference and assume both have some points and both are being a little self-serving. With many crime stories though, some stories are just incompatible. This happened in my local case, and I was surprised how many people wanted to try to split the difference between two extremely incompatible claims. I ended up having more respect for those who went all in on one side or the other than those who tried to “both sides” two stories that clearly could not coexist.
  5. Factual innocence is different from not guilty: As I have throughout this series, I will reiterate that I support the reasonable doubt standard and our justice system. However, I continue to ding some true crime folk for acting like “beyond a reasonable doubt” means the defendant should be given more deference vs every other person involved. As we saw at the height of the #metoo era, a claim of wrongdoing that never enters a courtroom can destroy lives very easily. Not as dramatically as actually being wrongfully convicted in a court of law, but well beyond a level that’s reasonable to accept. It surprises me therefore that so many podcasters take this responsibility so lightly. If you know that one person committed a murder and you spend hours talking about 6 suspects, you should be aware 5 of those people are innocent and you may have just helped ruin their lives. Even Sarah Koenig admitted she’s ashamed of this part of the Serial podcast, that it encouraged people to treat others as pieces in a puzzle to be solved rather than humans who had been through pain. I get the reason people focus on the person on trial, meticulously cataloguing every issue with the case against them, but it’s notable they tend to spend just a few minutes on the weaknesses of the case against alternative suspects, if they mention them at all. This mimics the tactic of defense lawyers who are explicitly there to do this, but I’m surprised it doesn’t weigh heavier on the conscience of those just doing it as a hobby. If the defense lawyer was wrong, he did his job. If you’re wrong, you actually just wrecked somebody’s life for entertainment.
  6. Watch how people address the victims: This is a somewhat weird one, but hear me out: the more dismissive a true crime podcast or a suspect is of the loved ones of the deceased, specifically those who could not themselves be suspects, the more I’d question the story. Victims by definition shift the attention away from neatly crafted stories, and thus seem to prompt outsized anger or complete dismissal from those seeking to push a narrative. A good recent example of this is Candace Owens attacking Charlie Kirk’s widow Erika. Owens has stated there was a conspiracy to murder Kirk, and it seems the further she went with the story, the more the grieving widow not asking similar questions annoyed her. Even if you believed Charlie Kirk was killed as part of a bigger conspiracy (I don’t), raging at a young widow would be a weird place to start in making your case. Watch how people treat the undisputed victims, and you’ll get a good insight into where their focus is.

Ok, that’s all I have for today! Tune in next week when I go over some statistical issues.

To go straight to part 6, click here.

Pink Sparkle Unicorn Science

I unfortunately have a packed weekend and have not been feeling well, so no True Crime post today. Instead, I would like to mention something I was wrong about.

For years I have disliked the whole “science for girls” thing, believing that it was fairly condescending to slap pink on something sciencey and to declare it “for girls”. I believed this right up until I had to buy a present for a precocious 4 year old girl I know who is obsessed with all things pink, sparkly, and unicorn adorned. It had been requested I try to find something sciencey, so I decided to take a chance with, well, a pink sparkle unicorn science kit for girls.

She loved it. Last I heard she had told one of her parents to “go away, I’m doing scientist things”.

Worth every penny.