Tidal Statistics

I’m having a little too much fun lately with my “name your own bias/fallacy/data error” thing, so I’ve decided I’m going to make it a monthly-ish feature. I’m gathering the full list up under the “GPD Lexicon” tab.

For this month, I wanted to revisit a phrase I introduced back in October: buoy statistic. At the time I defined the term as:

Buoy statistic: A statistic that is presented on its own as free-floating, while the context and anchoring data are hidden from initial sight.

This was intended to cover a pretty wide variety of scenarios, such as when we hear things like “women are more likely to do thing x” without being told that the “more likely” is 3 percentage points over men.

While I like this term, today I want to narrow it down to a special subcase: tidal statistics. I’m defining those as:

Tidal Statistic: A metric that is presented as evidence of the rise or fall of one particular group, subject, or issue during a time period when related groups also rose or fell on the same metric.

So for example, if someone says “after the CEO said something silly, that company’s stock went down on Monday” but they don’t mention that the whole stock market went down on Monday, that’s a tidal statistic. The statement by itself could be perfectly true, but the context changes the meaning.
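To put rough numbers on the stock example, here is a minimal Python sketch. The function name `tide_adjusted_change` and both percentages are made up for illustration; they are not real market data. All it does is subtract the broader move (the tide) from the move of the one group being talked about.

```python
def tide_adjusted_change(group_change_pct, tide_change_pct):
    """Return how much a group moved beyond the broader 'tide'.

    Both arguments are percent changes over the same period.
    This is a simple difference, not a formal market model.
    """
    return group_change_pct - tide_change_pct


# Hypothetical numbers, chosen only to illustrate the idea.
company_change = -4.0  # the company's stock fell 4% on Monday
market_change = -3.5   # the whole market fell 3.5% the same day

excess = tide_adjusted_change(company_change, market_change)
print(f"Company: {company_change}%, market: {market_change}%")
print(f"Change beyond the tide: {excess} percentage points")
```

With these invented numbers, "the stock fell 4%" is true, but only half a percentage point of that drop is specific to the company.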

Another example: recently Vox.com did an article about racial segregation in schools in which they presented this graph:

Now this graph caught my eye because they had initially labeled it as being representative of the whole US (they later went back and corrected it to clarify that this was just for the South), and I started to wonder how it was affected by changing demographic trends. I remembered seeing some headlines a few years back saying that white students were no longer a majority of school-age children, which means at least some of that drop is likely due to a decrease in the number of schools whose student populations are more than 50% white.

Turns out my memory was correct: according to the National Center for Education Statistics, in the fall of 2014 white students fell below half of the school system, at 49.5% of the school-age population. For context, when the graph starts (1954) the US was about 89% white. I couldn’t find what that number was for just school-age kids, but it was likely much higher than 49.5%. So basically, if you drew a similar graph for any other race, including white kids, you would see a drop. When the tide goes down, every related metric goes down with it.
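As a rough illustration of how much the tide alone can move this kind of metric, here is a toy Python sketch. It assumes a deliberately unrealistic world in which every school exactly mirrors the overall student population, so there is no segregation at all. The function name is hypothetical, and the 89% figure is the whole-population number mentioned above, used only as a stand-in because I don't have the school-age number for 1954.

```python
def share_in_majority_white_schools(white_share):
    """Toy model: every school exactly mirrors the overall population.

    Returns the fraction of students (of any race) who attend a
    school that is more than 50% white. Under perfect integration
    that is either everyone or no one.
    """
    return 1.0 if white_share > 0.5 else 0.0


# Stand-in white shares: ~89% (overall US, 1954) and 49.5% (fall 2014).
for label, white_share in [("1954-like", 0.89), ("2014-like", 0.495)]:
    result = share_in_majority_white_schools(white_share)
    print(f"{label}: white share {white_share:.1%}, "
          f"share in majority-white schools {result:.0%}")
```

Even in this perfectly integrated toy world, the metric falls from everyone to no one once the white share slips below half, which is why the graph on its own can't separate demographic change from segregation.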

Now to be clear, I am not saying that school segregation isn’t a problem or that the Vox article gets everything wrong. My concern is that the graph was used as one of their first images in a very lengthy article, and they don’t mention the context or what that might mean for advocacy efforts. Looking at that graph, we have no idea what percentage of that drop is due to a shrinking white population and what is due to intentional or de facto segregation. It’s almost certainly not possible to substantially raise the number of kids going to schools that are more than 50% white, simply because the number of schools like that is shrinking. Vox has other, better measures of success further down in the article, but I’m disappointed they chose to lead with one that has a major confounder baked in.

This is of course the major problem with tidal statistics. The implication tends to be “this trend is bad, following our advice can turn it around”. However, if the trend is driven by something much broader than what’s being discussed, any results you get will be skewed. Some people exploit this fact, some step into it accidentally, but it is an interesting way that you can tell the truth and mislead at the same time.

Stay safe out there.