5 Things About the GLAAD Accelerating Acceptance Report

This past week a reader contacted me to ask what I thought of a recent press release about a poll commissioned by GLAAD for their “Accelerating Acceptance” report. The report struck me as pretty interesting because the headlines mentioned that in 2017 there was a 4 point drop in LGBT acceptance, and I had actually just been discussing a Pew poll that showed a 7 point jump in support for gay marriage in 2017.

I was intrigued by this discrepancy, so I decided to take a look at the report (site link here, PDF here), particularly since a few of the articles I read about the whole thing seemed a little confused about what it actually said. Here are 5 things I found out:

  1. The GLAAD report bases comfort/acceptance on reaction to seven different scenarios In order to figure out an overall category for each person, respondents were asked how comfortable they’d feel with seven different scenarios. The scenarios were things like “seeing a same sex couple holding hands” or “my child being assigned an LGBT teacher”. Interestingly, respondents were most likely to say they’d be uncomfortable if they found out their child was going to have a lesson in school on LGBT history (37%), and they were least likely to say they’d be uncomfortable if an LGBT person was at their place of worship (24%).
  2. The answers to those questions were used to assign people to a category Three different categories were assigned to people based on the responses they gave to the previous seven questions. “Allies” were respondents who said they’d be comfortable in all 7 situations. “Resisters” were those who said they’d be uncomfortable in all 7 situations. “Detached supporters” were those whose answers varied depending on the situation.
  3. It’s the “detached supporter” category that gained people this year. So this is where things got interesting. Every single question I mentioned in #1 saw an increase in the “uncomfortables” this year, all by 2-3%. While that’s right at the margin of error for a survey this size (about 2,000 people; see the quick calculation after this list), the fact that every single one went up by a similar amount gives some credence to the idea that it’s a real uptick. To compound that point, this was not driven by more people saying they were uncomfortable in every situation, but rather by more people saying they were uncomfortable in some situations but not others.
  4. The percent of LGBT people reporting discrimination has gone up quite a bit. Given the headlines, you’d think the biggest finding of this study would be the drop in the number of allies for LGBT people, but I actually thought the most striking finding was the number of LGBT people who said they had experienced discrimination. That went from 44% in 2016 to 55% in 2017, which was a bigger jump than for other groups. The red box on that chart marks the only real question I ended up with: why does the 27% look so small? Given that I saw no other axis/scale issues in the report, I wondered if it was a typo. Not the biggest deal, but curiosity-inducing nonetheless.
  5. Support for equal rights stayed steady For all the other findings, it was interesting to note that 79% of people continue to say they support equal rights for LGBT people. This number has not changed.
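
A quick note on the margin of error mentioned in #3: for a simple random sample you can roughly sanity-check that 2-3% figure yourself. This is just my own back-of-the-envelope calculation in Python (not anything from GLAAD’s methodology), and real polls use weighting that tends to widen the margin a bit:

```python
import math

def margin_of_error(p, n, z=1.96):
    """95% margin of error for a single proportion from a simple random sample.
    Weighted survey samples usually have a somewhat larger effective margin."""
    return z * math.sqrt(p * (1 - p) / n)

# A response rate around 30% with roughly 2,000 respondents
print(f"{margin_of_error(0.30, 2000):.1%}")  # prints about 2.0%
```

So a 2-3 point move on any single question really is within the noise; it’s the fact that all seven questions moved together that makes it look like a real shift.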

So overall, what’s going on here? Why is support for gay marriage going up, support for equal rights unchanged, but discrimination reports going up and individual comfort going down? I have a few thoughts.

First, for the overall “comfort” numbers, it is possible that this is just a general margin of error blip. The GLAAD survey only has 4 years of data, so it’s possible that this is an uptick with no trend attached. Pew Research has been tracking attitudes about gay marriage for almost 20 years, and they show a few years where a data point reversed the trend, only to change the next year. A perfectly linear trend is unlikely.

Second, in a tense political year, it is possible that different types of people pick up the phone to answer survey questions. If people are reporting similar or increased levels of support for concrete things (like legal rights) but slightly lower levels of comfort around people themselves, that may be a reflection of the polarized nature of many of our current political discussions. I know my political views haven’t changed much in the past 18 months, but my level of comfort around quite a few people I know has.

Third, there very well could be a change in attitudes going on here. One data point does not make a trend, but every trend starts with a data point. I’d particularly be interested in drilling into those discrimination numbers to see what types of discrimination were on the uptick. Additionally, the summary report mentions that they changed some of the wording (back in 2016) to make it clearer that they were asking about both LGB and T folks, which makes me wonder if the discrimination differs between those two groups. It wasn’t clear from the summary whether they had separate answers for each group or just mentioned each group specifically, so I could be wrong about what data they have here.

Regardless, the survey for next year should shed some light on the topic.

5 Things About the Perfect Age

When people ask me to explain why I got degrees in both family therapy and statistics, my go-to answer is generally that “I like to think about how numbers make people feel.” Given this, I was extremely interested to see this article in the Wall Street Journal this weekend, about researchers who are trying to figure out what people consider the “perfect” age.

I love this article because it’s the intersection of so many things I could talk about for hours: perception, biases, numbers, self-reporting, human development, and a heavy dose of self-reflection to boot.

While the researchers haven’t found any one perfect age, they do have a lot of thought provoking commentary:

  1. The perfect age depends on your definition of perfect Some people pick the year they had the most opportunities, some the year they had the most friends, some the years they had the most time, others the year they were the happiest, and others the years they had a lot to reflect on. Unsurprisingly, different definitions lead to different results.
  2. Time makes a difference Unsurprisingly, young people (college students) tend to say if they could freeze themselves at one age, it would be sometime in their 20s. Older people on the other hand name older ages….50 seems pretty popular. This makes sense, as I suspect most people who have kids would choose to freeze themselves at a point where those kids were around.
  3. Anxiety is concentrated in a few decades One of the more interesting findings was that worry and anxiety were actually most present between 20 and 50. After 50, well-being actually climbed until age 70 or so. The thought is that generally that’s when the kids leave home and people start to have more time on their hands, but before the brunt of major health problems hits.
  4. Fun is also concentrated at the beginning and end of the curve Apparently people in the 65 to 74 age range report having the most fun of any age range, with 35 to 54 year olds having the least. It’s interesting that we often think of young people as having the “fun” advantage due to youth and beauty, but apparently the “confusion about life” piece plays a big part in limiting how fun those ages feel. Sounds about right.
  5. How stressed you are in one decade might dictate how happy you are in the next one This is just me editorializing, but all of this research really makes me wonder how our stress in one decade impacts the other decades. For example, many parents find the years of raising small children rather stressful and draining, but that investment may pay off later when their kids are grown. Similar things are true of work and other “life building” activities. Conversely, current studies show that men in their 20s who aren’t working report more happiness than those in their cohort who are working….but one suspects by age 40 that trend may have reversed. You never know what life will throw at you, but even the best planned lives don’t get their highs without some work.

Of course after thinking about all this, I had to wonder what my perfect age would be. I honestly couldn’t come up with a good answer to this at the moment, especially based on what I was reading. 50 seems pretty promising, but of course there’s a lot of variation possible between now and then. Regardless, a good example of quickly shifting opinions, and how a little perspective tweak can make a difference.

5 Things to Know About Hot Drinks and Esophageal Cancer

Fun fact: according to CNN, on New Year’s Day 90% of the US never got above freezing.

Second fun fact: on my way in to work this morning I passed an enormous fire burning a couple hundred yards from where the train runs. I Googled it to see what had happened and discovered it was a gas main that caught on fire, and they realized that shutting the gas off (normal procedure, I assume) would have made thousands of people in the area lose heat. With temps hitting -6F, they couldn’t justify the damage, so they let the fire burn for two days while they figured out another way of putting it out.

In other words, it’s cooooooooooold out there.

With a record cold snap on our hands and the worst yet to come this weekend, I’ve been spending a lot of time warming up. This means a lot of hot tea and hot coffee have been consumed, which reminded me of a factoid I’d heard a few months ago but never looked into. Someone had told me that drinking hot beverages was a risk factor for esophageal cancer, but when pressed they couldn’t tell me what was meant by “hot” or how big the risk was. I figured this was as good a time as any to look it up, though I was pretty sure nothing I read was going to change my behavior. Here’s what I found:

  1. Hot means HOT When I first heard the hot beverage/cancer link, my first thought was about my morning coffee. However, I probably don’t have to worry much. The official World Health Organization recommendation is to avoid drinking beverages that are over 149 degrees F. In case you’re curious, Starbucks typically serves coffee at 145-165 degrees, and most of us would wait for it to cool for a minute before we drank it.
  2. Temperature has a better correlation with cancer than beverage type So why was anyone looking at beverage temperature as a possible carcinogen to begin with? Seems a little odd, right? Well it turns out most of these studies were done in part to rule out that it was the beverage itself that was causing cancer. For example, quite a few of the initial studies noted that people who drank traditional Yerba Mate had higher esophageal cancer rates than those who didn’t. The obvious hypothesis was that it was the Yerba Mate itself that was causing cancer, but then they noted that repeated thermal injury due to scalding tea was also a possibility. By separating those two explanations, it was determined that those who drink Yerba Mate (or coffee or other tea) at lower temperatures did not appear to have higher rates of esophageal cancer. Nice work guys.
  3. The risk has been noted in both directions So how big a risk are we looking at? A pretty sizable one actually. This article reports that hot tea drinkers are 8 times as likely to get esophageal cancer as those who drink tea at lower temperatures, and those who have esophageal cancer are twice as likely to say they drank their tea hot before they got cancer. When assessing risk, knowing both those numbers is important to establish a strong link.
  4. The incidence rate seems to be higher in countries that like their beverages hot It’s interesting to note that the US does not even come close to having the highest esophageal cancer rates in the world. Whereas our rate is about 4.2 per 100,000 people, countries like  Malawi have rates of 24.2 per 100,000 people. Many of the countries that have high rates have traditions of drinking scalding hot beverages, and it’s thought that combining that with other risk factors (smoking, alcohol consumption, poverty and poorly developed health care systems) could have a compounding effect. It’s not clear if scalding your throat is a risk in and of itself or if it just makes you more susceptible to other risks, but either way it doesn’t seem to help.
  5. There is an optimum drinking temperature According to this paper, to minimize your risk while maximizing your enjoyment, you should serve your hot beverages at exactly 136 degrees F. Of course a lot of that has to do with how quickly you’ll drink it and what the ambient temperature is. I was pretty impressed with my Contigo thermos for keeping my coffee pretty hot during my 1.5 mile walk from the train station in -3 degrees F this morning, but lesser travel mugs might have had a problem with that. Interestingly, I couldn’t find a good calculator to track how fast your beverage will cool under various conditions (I’ve put a rough sketch of one right after this list), but if you find a real one send it my way!
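
Since I came up empty on the calculator front, here’s the back-of-the-envelope version I’d use in the meantime: Newton’s law of cooling in a few lines of Python. The decay constant k is a number I made up; the real value depends on the mug, the lid, the wind and so on, so treat this as a rough sketch rather than a physics engine:

```python
import math

def drink_temp_f(minutes, start_f=165.0, ambient_f=-3.0, k=0.01):
    """Newton's law of cooling: T(t) = T_ambient + (T_start - T_ambient) * exp(-k*t).
    k (per minute) is a guessed constant; an insulated thermos should have a much
    smaller k than an open paper cup."""
    return ambient_f + (start_f - ambient_f) * math.exp(-k * minutes)

# A 165 F coffee over a roughly 30 minute walk at -3 F
for m in (0, 10, 20, 30):
    print(f"{m:2d} min: {drink_temp_f(m):.0f} F")
```

With a thermos-like k the coffee is still over 120 degrees after half an hour; bump k up to something more like an open cup and it’s lukewarm well before the end of the walk.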

Of course if you really want to cool a drink down quickly, just move to Fairbanks, Alaska and throw it in the air:

Stay warm everyone!

5 Interesting Resources for Snowflake Math Lessons

Happy National Make a Paper Snowflake Day (or National Make Cut Out Snowflakes Day for the purists)!

I don’t remember why I stumbled on this holiday this year, but I thought it would be a really good time to remind everyone that snowflakes are a pretty cool (no pun intended) basis for a math lesson. My sister-in-law teaches high school math and informs me that this is an excellent thing to give kids to do right before winter break. I’m probably a little late for that, but just in case you’re looking for some resources, here are some good ones I’ve found:

  1. Khan Academy Math for Fun and Glory  If you ever thought the problem with snowflake cutting is that it wasn’t technical enough, then this short video is for you. Part of a bigger series that is pretty fun to work through, this video is a great intro to how to cut a mathematically/anatomically(?) correct snowflake.
  2. Computer snowflake models There’s some interesting science behind computer snowflake models, and this site takes you through some of the most advanced programs for doing so. It seems like a fun exercise, but apparently modeling crystal growth has some pretty interesting applications. Gallery of images here, and an overview of the mathematical models here.
  3. Uniqueness of snowflakes Back in the real world, there’s an interesting and raging debate over the whole “no two snowflakes are alike” thing. According to this article,  “Yes, with a caution”, “Likely but unprovable” or “it depends on what you mean by unique” are all acceptable answers.
  4. Online snowflake maker If you’re desperate to try out some of the math lessons you just learned but can’t find your scissors, this online snowflake generator has you covered.
  5. Other winter math If you’re still looking for more ideas, check out this list of winter related math activities. In addition to snowflake lessons around symmetry, patterns and Koch snowflakes (there’s a quick Koch snowflake calculation after this list), they have penguin and snowman math.
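
Since Koch snowflakes came up in #5, here’s a tiny Python sketch of the classic lesson: each iteration replaces every edge with four edges a third as long, so the perimeter keeps growing forever while the area quietly converges (to 8/5 of the starting triangle). This isn’t taken from any of the linked activities, it’s just the standard math:

```python
import math

def koch_snowflake(side=1.0, iterations=5):
    """Perimeter and area of a Koch snowflake grown from an equilateral triangle."""
    n_edges, edge_len = 3, side
    area = math.sqrt(3) / 4 * side ** 2  # the starting triangle
    for _ in range(iterations):
        # every existing edge sprouts one new small triangle...
        area += n_edges * math.sqrt(3) / 4 * (edge_len / 3) ** 2
        # ...and is replaced by 4 edges, each 1/3 as long
        n_edges *= 4
        edge_len /= 3
    return n_edges * edge_len, area

for i in range(6):
    perimeter, area = koch_snowflake(iterations=i)
    print(f"iteration {i}: perimeter {perimeter:.3f}, area {area:.4f}")
```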

Happy shoveling!

 

5 Things About Personality and Cold Weather

As I mentioned on Sunday, I’ve been itching to do a deep dive into this new paper about how people who grow up in cold regions tend to have different personalities than those who don’t. As someone who grew up in the New England area, it’s pretty striking to me how every warmer weather city in the US seems more outgoing than what I’m used to. Still, despite my initial belief I was curious how one goes about proving that people in cold-weather cities are less agreeable. While the overall strategy is pretty simple (give personality tests to different people in different climates, compare answers), I figured there’d likely be some interesting nuance in the details.

Now that I’ve finally read the paper, here’s what I found out:

  1. To make the findings more broadly applicable, study multiple countries One of the first things I noticed when I pulled up the paper is that there were a surprising number of Chinese names among the author list. I had assumed this was just a US based study, but it turns out it was actually a cross-cultural study using both the US and China for data sets. This makes the findings much stronger than they would be otherwise.
  2. There are 3 possible mechanisms for climate affecting personality I’ve talked about the rules for proving causality before, and the authors wasted no time in introducing a potential mechanism to explain a cold weather/agreeableness link. There are three main theories: people in cold weather were more likely to be herders, which requires less cooperation than farming or fishing; people in cold weather are more susceptible to pathogens, so they unconsciously avoid each other; and people may migrate to areas that fit their (group) personalities. Thus, it’s possible that the cold doesn’t make people disagreeable, but rather that disagreeable people move to cold climates. Insert joke about Bostonians here.
  3. The personality differences were actually present for every one of the Big 5 traits. Interestingly, every one of the Big 5 personality traits was higher in those who lived in nicer climates: extraversion, agreeableness, openness to new experience, conscientiousness and emotional stability. The difference in agreeableness was not statistically significant for the Chinese group. The paper shows the differences, along with what variables appear to have made a difference (note: “temperature clemency” means how far off the average temperature is from 72 degrees).
  4. Reverse causality was controlled for One of the interesting things about the findings is that the authors decided to control for the factors listed in #2 to determine what was causing what. They specifically asked people about where they grew up to control for selective (adult) migration, and in the Chinese part of the study actually asked about prior generations as well. They controlled for things like influenza incidence (as a proxy for pathogen presence) as well. Given that the finding persisted after these controls, it seems more likely that weather causes these other factors.
  5. Only cold climates were examined One of the more interesting parts of this to me is what wasn’t studied: uncomfortably warm temperatures. Both China and the US are more temperate to the south and colder to the north. The “temperature clemency” variable looked specifically at temperatures that deviated from 72 degrees, but only in the low temperature direction. It would be interesting to see what unreasonably hot temperatures did to personalities….is it a linear effect? Do some personality traits drop off again? I’d be curious.

Overall I thought this was an interesting study. I always appreciate it when multiple cultures are considered, and I thought the findings seemed pretty robust. Within the paper and in the notes at the end, the authors repeatedly mentioned that they tried most of the calculations a few different ways to make sure that their findings were robust and didn’t collapse with minor changes. That’s a great step in the right direction for all studies. Stay warm everyone!

5 Interesting Things About IQ Self-Estimates

After my post last week about what goes wrong when students self-report their grades, the Assistant Village Idiot left a comment wondering about how this would look if we changed the topic to IQ. He wondered specifically about Quora, a question asking/answering website that has managed to spawn its own meta-genre of questions asking “why is this website so obsessed with IQ?“.

Unsurprisingly, there is no particular research done on specific websites and IQ self-reporting, but there is actually some interesting literature on people’s ability to estimate their own IQ and that of those around them. Most of this research comes from a British researcher at University College London, Adrian Furnham. Studying how well people actually know themselves kinda sounds like a dream job to me, so kudos to you Adrian. Anyway, ready for the highlights?

  1. IQ self-estimates are iffy at best One of the first things that surprised me about IQ self-estimates vs actual IQ was how weak the correlation was. One study found an r=.3, another r=.19. This data was gathered from people who first took a test, then were asked to estimate their results prior to actually getting them. In both cases, it appears that people are sort of on the right track, but not terrific at pinpointing how smart they are (for a sense of how fuzzy an r=.3 really is, see the simulation sketch after this list). One wonders if this is part of the reason for the IQ test obsession….we’re rightfully insecure about our ability to figure this out on our own.
  2. There’s a gender difference in predictions Across cultures, men tend to rank their own IQ higher than women do, and both genders consistently rank their male relatives (fathers, grandfathers and sons) as smarter than their female relatives (mothers, grandmothers and daughters). This often gets reported as male hubris vs female humility (indeed, that’s the title of the paper), but I note they didn’t actually compare it to results. Given that many of these studies are conducted on psych undergrad volunteers, is it possible that men are more likely to self select when they know IQ will be measured? Some of these studies had average IQ guesses of 120 (for women) and 127 (for men)….that’s not even remotely an average group, and I’d caution against extrapolation.
  3. Education may be a confounding factor for how we assess others One of the other interesting findings in the “rate your family member” game is that people rank previous generations as half a standard deviation less intelligent than they rank themselves. This could be due to the Flynn effect, but the other suggestion is that it’s hard to rank IQ accurately when educational achievement is discordant. Within a cohort, educational achievement is actually pretty strongly correlated with IQ, so re-calibrating for other generations could be tricky. In other words, if you got a master’s degree and your grandmother only graduated high school, you may think your IQs are further apart than they really are. To somewhat support this theory, as time has progressed, the gap between self rankings and grandparent rankings has closed. Interesting to think how this could also affect some of the gender effects seen in #2, particularly for prior generations.
  4. Being smart may not be the same as avoiding stupidity One of the more interesting studies I read looked at the correlation between IQ self-report and personality traits, and found that some traits made you more likely to think you had a high IQ. One of these traits was stability, which confused me because you don’t normally think of stable people as being overly high on themselves. When I thought about it for a bit though, I wondered if stable people were defining being “smart” as “not doing stupid things”. Given that many stupid actions are probably more highly correlated with impulsiveness (as opposed to low IQ), this could explain the difference. I don’t have proof, but I suspect a stable person A with an IQ of 115 will mostly do better than an unstable person B with an IQ of 115, but person A may attribute this difference to intelligence rather than impulse control. It’s an academic distinction more than a practical one, but it could be confusing things a bit.
  5. Disagreeableness is associated with higher IQs, and with self-perception of higher IQs Here’s an interesting chicken and egg question for you: does having a high IQ make you more disagreeable, or does being disagreeable make you think you have a higher IQ? Alternative explanation: is some underlying factor driving both? It turns out that having a high IQ is associated with being disagreeable, and being disagreeable is associated with ranking your IQ as higher than others’. This probably colors some of the IQ discussions to a certain degree….the “here’s my high IQ now let’s talk about it” crowd probably really is not as agreeable as those who want to talk about sports or exchange recipes.
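
To make #1 a little more concrete, here’s a quick simulation (mine, not from any of the papers) of what an r=.3 relationship between actual IQ and self-estimated IQ would look like. The punchline: even when people are “sort of on the right track,” the typical miss is still on the order of a full standard deviation.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 100_000, 0.30  # roughly the correlation reported for self-estimates

# Draw "actual" IQs, then build self-estimates that correlate with them at r
actual = rng.normal(100, 15, n)
noise = rng.normal(100, 15, n)
estimate = 100 + r * (actual - 100) + np.sqrt(1 - r**2) * (noise - 100)

print(f"correlation: {np.corrcoef(actual, estimate)[0, 1]:.2f}")            # ~0.30
print(f"typical miss: {np.mean(np.abs(actual - estimate)):.1f} IQ points")  # ~14
```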

So there you have it! My overall impression from reading this is that IQ is one of those things where people don’t appreciate or want to acknowledge small differences. In looking at some of the studies where people ranked their parents against each other, I was surprised how many were pointing to a 15 point gap between parents, or a 10 point gap between siblings. Additionally, it’s interesting that we appear to have a pretty uneasy relationship with IQ tests in general. Women in the US for example are more likely to take IQ tests than men are but less likely to trust their validity. To confuse things further, they are also more likely to believe they are useful in educational settings. Huh? I’d be interested to see a self-estimated IQ compared to an actual understanding of what IQ is/is not, and then compare that to an actual scored IQ test. That might flesh out where some of these conflicting feelings were coming from.

5 Things You Should Know About Orchestras and Blind Auditions

Unless you were going completely off the grid this week, you probably heard about the now-infamous “Google memo“. Written by a (since fired) 28-year-old software engineer at Google, the memo is a ten page document where the author lays out his beliefs about why gender gaps in tech fields continue to exist. While the author did not succeed in getting any policies at Google changed, he did manage to kick off an avalanche of hot takes examining whether the gender/tech gap is due to nature (population level differences in interests/aptitude) or nurture (embedded social structures that make women unwelcome in certain spaces). I have no particular interest in adding another take to the pile, but I did see a few references to the “blind orchestra auditions study” that reminded me I had been wanting to write about that one for a while, to dig into a few things it did and did not say.

For those of you who don’t know what I’m talking about, here’s the run down: back in the 1970s, top orchestras in the US were 5% female. By the year 2000, they were up to almost 30% female. Part of the reason for the change was the introduction of “blind auditions”, where the people who were holding tryouts couldn’t see the identity of the person trying out. This finding normally gets presented without a lot of context, but it’s good to note someone actually did decide to study this phenomenon to see if the two things really were related or not. They got their hands on all of the tryout data for quite a few major orchestras (they declined to name which ones, as it was part of the agreement of getting the data) and tracked what happened to individual musicians as they tried out. This led to a data set that had overall population trends, but could also be used to track individuals. You can download the study here, but these are my highlights:

  1. Orchestras are a good place to measure changing gender proportions, because orchestra jobs don’t change. Okay, first up is an interesting “control your variables” moment. One of the things I didn’t realize about orchestras (though maybe I should have) is that the top ones have not changed in size or composition in years. So basically, if you suddenly are seeing more women, you know it’s because the proportion of women overall is increasing across many instruments. In the words of the authors: “An increase in the number of women from, say, 1 to 10, cannot arise because the number of harpists (a female-dominated instrument), has greatly expanded. It must be because the proportion female within many groups has increased.”
  2. Blind auditions weren’t necessarily implemented to cut down on sexism. Since this study is so often cited in the context of sexism and bias, I had not actually ever read why blind auditions were implemented in the first place. Interestingly, according to the paper written about it, the actual initial concern was nepotism. Basically, orchestras were filled with their conductors’ students, and other potentially better players were shut out. When they opened the auditions up further, they discovered that when people could see who was auditioning, they still showed preferential treatment based on resume. This is when they decided to blind the audition, to make sure that all preconceived notions were controlled for. The study authors chose to focus on the impact this had on women (in their words): “Because we are able to identify sex, but no other characteristics for a large sample, we focus on the impact of the screen on the employment of women.”
  3. Blinding can help women out Okay, so first up, the most often reported findings: blind auditions appear to account for about 25% of the increase in women in major orchestras. When they studied individual musicians, they found that women who tried out in blind and non-blind auditions were more successful in the blinded auditions. They also found that having a blind final round increased the chances a woman was picked by about 33%. This is what normally gets reported, and it is a correct reporting of the findings.
  4. Blinding doesn’t always help women out One of the more interesting findings of the study that I have not often seen reported: overall, women did worse in the blinded auditions. As I mentioned up front, the study authors had the data for groups and for individuals, and the findings from #3 were pulled from the individual data. When you look at the group data, we actually see the opposite effect. The study authors suggest one possible explanation for this: adopting a “blind” process dropped the average quality of the female candidates. This makes a certain amount of sense. If you sense you are a borderline candidate, but also think there may be some bias against you, you would be more likely to put your time into an audition where you knew the bias factor would be taken out. Still, that result interested me.
  5. The effects of blinding can depend on the point in the process Even after controlling for all sorts of factors, the study authors did find that bias was not equally present in all moments. For example, they found that blind auditions seemed to help women most in preliminary and final rounds, but it actually hurt them in the semi-final rounds. This would make a certain amount of sense….presumably people doing the judging may be using different criteria in each round, and some of those may be biased in different ways than others. Assuming that all parts of the process work the same way is probably a bad assumption to make.

Overall, while the study is potentially outdated (from 2001…using data from 1950s-1990s), I do think it’s an interesting frame of reference for some of our current debates. One article I read about it talked about the benefit of industries figuring out how to blind parts of their interview process because it gets them to consider all sorts of different people….including those lacking traditional educational requirements. With many industries dominated by those who went to exclusive schools, hiding identity could have some unexpected benefits for all sorts of people. However, as this study also shows, it’s probably a good idea to keep the limitations of this sort of blinding in mind. Even established bias is not a consistent force that produces identical outcomes at all time points, and any measure you institute can quickly become a target that changes behavior.  Regardless, I think blinding is a good thing. All of us have our own pitfalls, and we all might be a little better off if we see our expectations toppled occasionally.

4 Examples of Confusing Cross-Cultural Statistics

In light of my last post about variability in eating patterns across religious traditions, I thought I’d put together a few other examples of times when attempts to compare data across international borders got a little more complicated than you would think.

Note: not all of this confusion changed the conclusions that people were trying to get to, but it did make things a little confusing.

  1. Who welcomes the refugee About a year or so ago, when Syrian refugees were making headlines, there was a story going around that China was the most welcoming country for people fleeing their homeland. The basis of the story was an Amnesty International survey that showed a whopping 46% of Chinese citizens saying they would be willing to take a refugee into their home…..far more than any other country. The confusion arose when a Quartz article pointed out that there is no direct Chinese translation for the word “refugee” and the word used in the survey meant “person who has suffered a calamity” without clarifying whether that person is international or lives down the street. It’s not clear how this translation may have influenced the response, but a different question on the same survey that made the “international” part clearer received much lower support.
  2. The French Paradox (reduced by 20%) In the process of researching my last post, I came across a rather odd tidbit I’d never heard of before regarding the “French Paradox”. A term that originated in the 80s, the French Paradox is the apparent contradiction that French people eat lots of cholesterol/saturated fat and yet don’t get heart disease at the rates you would expect based on data from other countries. Now I had heard of this paradox before, but the part I hadn’t heard was the assertion that French doctors under-counted deaths from coronary heart disease. When death certificates were compared to data collected by more standardized methods, this turned out to be true:

    They suspect the discrepancy arose because doctors in many countries automatically attribute sudden deaths in older people to coronary heart disease, whereas the French doctors were only doing so if they had clinical evidence of heart disease. This didn’t actually change the rank of France very much; they still have a lower than expected rate of heart disease. However, it did nearly double the reported incidence of CHD and cut the paradox down by about 20%.

  3. Crime statistics of all sorts This BBC article is a few years old, but it has some interesting tidbits about cross-country crime rate comparisons. For example, Canada and Australia have the highest kidnapping rates in the world. The reason? They count all parental custody disputes as kidnappings, even if everyone knows where the child is. Other countries keep this data separate and only use “kidnapping” to describe a missing child. Countries that widen their definitions of certain crimes tend to see an uptick in those crimes, like Sweden saw with rape when it widened its definition in 2005.
  4. Infant mortality  This World Health Organization report has some interesting notes about how different countries count infant mortality, and it notes that some countries (such as Belgium, France and Spain) only count infant mortality in infants who survive beyond a certain time period after birth, such as 24 hours. Those countries tend to have lower infant mortality rates but higher stillbirth rates than countries that don’t set such a cutoff. Additionally, as of 2008 approximately 3/4 of countries lack the infrastructure to count infant mortality through hospitals and do so through household surveys instead.

Like I said, not all of these change the conclusions people come to, but they are good things to keep in mind.

5 Things You Should Know About the “Backfire Effect”

I’ve been ruminating a lot on truth and errors this week, so it was perhaps well timed that someone sent me this article on the “backfire effect” a few days ago. The backfire effect is a name given to a psychological phenomenon in which attempting to correct someone’s facts actually increases their belief in their original error. Rather than admit they are wrong when presented with evidence, the narrative goes, people double down. Given the current state of politics in the US, this has become a popular thing to talk about. It’s popped up in my Facebook feed and is commonly cited as the cause of the “post-fact” era.

So what’s up with this? Is it true that no one cares about facts any more? Should I give up on this whole facts thing and find something better to do with my time?

Well, as with most things, it turns out it’s a bit more complicated than that. Here’s a few things you should know about the state of this research:

  1. The most highly cited paper focused heavily on the Iraq War The first paper that made headlines was from Nyhan and Reifler back in 2010, and was performed on college students at a Midwestern Catholic university. They presented some students with stories including political misperceptions, and some with stories that also had corrections. They found that the students who got corrections were more likely to believe the original misperception. The biggest issue this showed up with was whether or not WMDs were found in Iraq. They also tested facts/corrections around the tax code and stem cell research bans, but it was the WMD findings that grabbed all the headlines. What’s notable is that the research was performed in 2005 and 2006, when the Iraq War was heavily in the news.
  2. The sample size was fairly small and composed entirely of college students One of the primary weaknesses of the first papers (as stated by the authors themselves) is that 130 college students are not really a representative sample. The sample was half liberal and 25% conservative. It’s worth noting that they believe that was a representative sample for their campus, meaning all of the conservatives were in an environment where they were the minority. Given that one of the conclusions of the paper was that conservatives seemed to be more prone to this effect than liberals, it’s an important point.
  3. A new paper with a broader sample suggests the “backfire effect” is actually fairly rare. Last year, two researchers (Porter and Wood) polled 8,100 people from all walks of life on 36 political topics and found…..WMDs in Iraq were actually the only issue that provoked a backfire effect. A great Q&A with them can be found here. This is fascinating if it holds up, because it means the original research was mostly confirmed, but any attempt at generalization was pretty wrong.
  4. When correcting facts, phrasing mattered One of the more interesting parts of the Porter/Wood study was how the researchers described their approach to corrections. In their own words: “Accordingly, we do not ask respondents to change their policy preferences in response to facts–they are instead asked to adopt an authoritative source’s description of the facts, in the face of contradictory political rhetoric“. They heartily reject “corrections” that are aimed at making people change their mind on a moral stance (like, say, abortion) and focus only on facts. Even with the WMD question, they found that the more straightforward and simple the correction statement, the more people of all political persuasions accepted it.
  5. The 4 study authors are now working together In an exceptionally cool twist, the authors who came to slightly different conclusions are now working together. The Science of Us gives the whole story here, but essentially Nyhan and Reifler praised Porter and Wood’s work, then said they should all work together to figure out what’s going on. They apparently gathered a lot of data during the height of election season and hopefully we will see those results in the near future.

I think this is an important set of points. First, because it’s heartwarming (and intellectually awesome!) to see senior researchers accepting that some of their conclusions may be wrong and actually working with others to improve their own work. Next, I think it’s important because I’ve heard a lot of people in my personal life commenting that “facts don’t work”, so they basically avoid arguing with those who don’t agree with them. If it’s true that facts DO work as long as you’re not focused on getting someone to change their mind on the root issue, then it’s REALLY important that we know that. It’s purely anecdotal, but I can note that this has been my experience with political debates. Even the most hardcore conservatives and liberals I know will make concessions if you clarify you know they won’t change their mind on their moral stance.

5 Things You Should Know About Statistical Process Control Charts

Once again I outdo myself with the clickbait-ish titles, huh? Sorry about that, I promise this is actually a REALLY interesting topic.

I was preparing a talk for a conference this week (today actually, provided I get this post up when I plan to), and I realized that statistical process control charts (or SPC charts for short) are one of the tools I use quite often at work but don’t really talk about here on the blog. Between those and my gif usage, I think you can safely guess why my reputation at work is a bit, uh, idiosyncratic. For those of you who have never heard of an SPC chart, here’s a quick orientation. First, they look like this:

(Image from qimacros.com, an excellent piece of software for generating these)

The chart is used for plotting something over time….hours, days, weeks, quarters, years, or “order in line”…take your pick.  Then you map some ongoing process or variable you are interested in…..say employee sick calls. You measure employee sick calls in some way (# of calls or % of employees calling in) in each time period. This sets up a baseline average, along with “control limits”, which are basically 1, 2 and 3 standard deviation ranges. If at some point your rate/number/etc starts to go up or down, the SPC chart can tell you if the change is significant or not based on where it falls on the plot.  For example, if you have one point that falls outside the 3 standard deviation line, that’s significant. If two in a row fall outside the 2 standard deviation line, that’s significant as well. The rules for this vary by industry, and Wiki gives a pretty good overview here. At the end of this exercise you have a really nice graph of how you’re doing with a good visual of any unusual happenings, all with some statistical rigor behind it. What’s not to love?
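
To make that concrete, here’s a bare-bones sketch in Python of the two rules I just described: establish a baseline average and sigma from a stable stretch of data, then flag any new point beyond the 3 sigma limits or any two consecutive points beyond the same 2 sigma limit. Real SPC charts come in several flavors and estimate sigma in chart-specific ways (moving ranges, binomial formulas, etc.), so treat this as an illustration rather than a substitute for proper software:

```python
import numpy as np

def control_limits(baseline):
    """Baseline average and sigma estimated from a stable run of data."""
    baseline = np.asarray(baseline, dtype=float)
    return baseline.mean(), baseline.std(ddof=1)

def spc_flags(new_points, mean, sigma):
    """Two common rules: one point beyond 3 sigma, or two consecutive points
    beyond the same 2 sigma limit. (Most industries use several more rules.)"""
    flags = []
    for i, v in enumerate(new_points):
        if abs(v - mean) > 3 * sigma:
            flags.append((i, v, "single point beyond 3 sigma"))
        elif i > 0 and min(v, new_points[i - 1]) > mean + 2 * sigma:
            flags.append((i, v, "two in a row above 2 sigma"))
        elif i > 0 and max(v, new_points[i - 1]) < mean - 2 * sigma:
            flags.append((i, v, "two in a row below 2 sigma"))
    return flags

# A year of baseline sick-call percentages, then four new months to monitor
mean, sigma = control_limits([10, 9, 11, 10, 8, 12, 10, 9, 11, 10, 9, 11])
print(spc_flags([11, 15, 14, 9], mean, sigma))  # flags the 15% and 14% months
```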

Anyway, I think because they take a little bit of getting used to, SPC charts do not always get the love they deserve. I would like to rectify this travesty, so here are 5 things you should know to tempt you to go learn more about them:

  1. SPC charts are probably more useful for most businesses than hypothesis testing While most high school level statistics classes at least take a stab at explaining p-values and hypothesis testing to kids, almost none of them even show an example of a control chart. And why not? I think it’s a good case of academia favoring itself. If you want to test a new idea against an old idea or to compare two things at a fixed point in time, p-values and hypothesis testing are pretty good. That’s why they’re used in most academic research. However, if you want to see how things are going over time, you need statistical process control. Since this is more relevant for most businesses, people who are trying to keep track of any key metric should DEFINITELY know about these. Six Sigma and many process improvement classes teach statistical process control, but these charts still don’t seem widely used outside of those settings. Too bad. These graphs are practical, they can be updated easily, and they give you a way of monitoring what’s going on plus a lot of good information about how your processes are doing. Like what? Well, like #2 on this list:
  2. SPC charts track two types of variation Let’s get back to my sick call example. Let’s say that in any given month, 10% of your employees call in sick. Now most people realize that not every month will be exactly 10%. Some months it’s 8%, some months it’s 12%. What statistical process control charts help calculate is when those fluctuations are most likely just random (known as common cause variation) and the point at which they are probably not so random (special cause variation). It sets parameters that tell you when you should pay attention. They are better than p-values for this because you’re not really running an experiment every month….you just want to make sure everything’s progressing as it usually does. The other nice part is this translates easily in to a nice visual for people, so you can say with confidence “this is how it’s always been” or “something unusual is happening here” and have more than your gut to rely on.
  3. SPC charts help you test new things, or spot concerning trends quickly SPC charts were really invented for manufacturing plants, and were perfected and popularized in post-WWII Japan. One of the reasons for this is that they really loved having an early warning about when a machine might be breaking down or an employee might not be following the process. If the process goes above or below a certain red line (aka the “upper/lower control limit”) you have a lot of confidence something has gone wrong and can start investigating right away. In addition to this, you can see if a change you made helps anything. For example, if you do a handwashing education initiative, you can see what percentage of your employees call in sick the next month. If it’s below the lower control limit, you can say it was a success, just like with traditional p-values/hypothesis testing. HOWEVER, unlike p-values/hypothesis testing, SPC charts make allowances for time. Let’s say you drop the sick calls to 9% per month, but then they stay down for 7 months. Your SPC chart rules now tell you you’ve made a difference. SPC charts don’t just take in to account the magnitude of the change, but also the duration. Very useful for any metric you need to track on an ongoing basis.
  4. They encourage you not to fix what isn’t broken One of the interesting reasons SPC charts caught on so well in the manufacturing world is that the idea of “opportunity cost” was well established. If your assembly line puts out a faulty widget or two, it’s going to cost you a lot of money to shut the whole thing down. You don’t want to do that unless it’s REALLY broken. For our sick call example, it’s possible that what looks like an increase (say to 15% of your workforce) isn’t a big deal and that trying to interfere will cause more harm than good. Always good to remember that there are really two ways of being wrong: missing a problem that does exist, and trying to fix one that doesn’t.
  5. There are quite a few different types One of the extra nice things about SPC charts is that there are actually 6 types to choose from, depending on what kind of data you are working with. There’s a helpful flowchart to pick your type here, but a good computer program (I use QI Macros) can actually pick for you. One of the best parts of this is that some of them can deal with small and varying sample sizes, so you can finally show that going from 20% to 25% isn’t really impressive if you just lowered your volume from 5 to 4 (there’s a quick calculation of that example after this list).
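
That small/varying sample size point in #5 is easiest to see with the textbook p-chart limit formula, where each point gets its own control limits based on its own sample size. This is just the standard formula, not the QI Macros implementation, but it makes the 20%-to-25% example concrete:

```python
import math

def p_chart_points(defectives, sample_sizes):
    """For each sample: (proportion, lower limit, upper limit, out_of_control),
    using limits of p_bar +/- 3*sqrt(p_bar*(1-p_bar)/n_i)."""
    p_bar = sum(defectives) / sum(sample_sizes)
    results = []
    for x, n in zip(defectives, sample_sizes):
        margin = 3 * math.sqrt(p_bar * (1 - p_bar) / n)
        p = x / n
        lcl, ucl = max(p_bar - margin, 0.0), min(p_bar + margin, 1.0)
        results.append((round(p, 3), round(lcl, 3), round(ucl, 3), p < lcl or p > ucl))
    return results

# Going from 1-of-5 (20%) to 1-of-4 (25%) sounds like a jump, but with samples
# this tiny the control limits are enormous and neither point is out of control
print(p_chart_points(defectives=[1, 1], sample_sizes=[5, 4]))
```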

So those are some of my reasons you should know about these magical little charts. I do wish they’d get used more often because they are a great way of visualizing how you’re doing on an ongoing basis.

If you want to know more about the math behind them and more uses (especially in healthcare), try this presentation. And wish me luck on my talk! Pitching this stuff right before lunch is going to be a challenge.