Rubin Vase Reporting

Jesse Singal had an interesting post in his (subscriber only) newsletter this week about some articles promoting an Amnesty International report that ran under the headline “Amnesty reveals alarming impact of online abuse against women”. I was intrigued because I love dissections of survey data, and this didn’t disappoint. He noted some inappropriate extrapolations from the results (the Mozilla article claimed that data showed women were harassed more than men online, but the Amnesty survey didn’t survey any men and thus has no comparison), and also that the numbers were a little lower than he thought. Overall in 8 countries, an average of 23% of women had experienced online harassment, with an average of 11% saying they’d experienced online harassment more than once.

This statistic struck me as interesting, because it sounds really different depending on how you phrase it. From the Amnesty article:

Nearly a quarter (23%) of the women surveyed across these eight countries said they had experienced online abuse or harassment at least once, ranging from 16% in Italy to 33% in the US.

If you reverse the language, it reads like this:

“Over three quarters (77%) of the women surveyed across these eight countries said they had never experienced online abuse or harassment even once, ranging from 84% in Italy to 67% in the US.”

Now it is possible those two paragraphs sound exactly the same to you, but to me they give slightly different impressions. By shifting the focus from the positive responses to the negative, two reporters could report the exact same data but give slightly different impressions.
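The flip itself is just arithmetic: every “at least once” percentage has a “never” complement, and the two add up to 100. Here is a minimal sketch in Python using the three figures quoted from the Amnesty article. The function name and the dictionary are my own shorthand for illustration, not anything from the report.

```python
def flip_framing(pct_yes):
    """Return the complementary percentage (the share who said no)."""
    if not 0 <= pct_yes <= 100:
        raise ValueError("percentage must be between 0 and 100")
    return 100 - pct_yes

# Figures quoted in the Amnesty article: % who experienced abuse at least once.
experienced = {"all eight countries": 23, "Italy": 16, "US": 33}

# Same data, grounded in the negative responses instead.
never_experienced = {place: flip_framing(pct) for place, pct in experienced.items()}

for place in experienced:
    print(f"{place}: {experienced[place]}% at least once, "
          f"{never_experienced[place]}% never")
```

Nothing about the data changes between the two dictionaries; only the half of the picture you are asked to look at does.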

While reading this, all I could think of was the famous Rubin Vase illusion. If you don’t recognize the name, you will almost certainly recognize the picture: 

It struck me as a good analogy for a certain type of statistics reporting, enough so that I decided to give it a name:

Rubin Vase Reporting: The practice of grounding a statistic in either the positive (i.e. % who said yes) or negative (i.e. % who said no) responses in order to influence the way the statistic is read and what it appears to show.

Now of course not every statistic is reported this way intentionally (after all you really do have to pick one way to report most statistics and then stick with it), but it is something to be aware of. Flipping statistics around to see how you feel about them when they’re said in the reverse can be an interesting practice.

Also, I have officially updated my GPD Lexicon page, so if you’re looking for more of these you may want to check that out! I have 19 of these now and have been pondering putting them into some sort of ebook with illustrations, just for fun. Thoughts on that also welcome.

Reporting the High Water Mark

Another day, another weird practice to add to my GPD Lexicon.

About two weeks ago, a friend sent me that “People over 65 share more fake news on Facebook” study to ask me what I thought. As I was reviewing some of the articles about it, I noticed that they kept saying the sample size was 3,500 participants. As the reporting went on however, the articles clarified that not all of those 3,500 people were Facebook users, and that about half the sample opted out. Given that the whole premise of the study was that the researchers had looked at Facebook sharing behavior by asking people for access to their accounts, it seemed like that initial sample size wasn’t reflective of the sample used to obtain the main finding. I got curious how much this impacted the overall number, so I decided to go looking.

After doing some follow up with the actual paper, it appears that 2,771 of those people had Facebook to begin with,  1,331 people actually enrolled in the study, and 1,191 were able to link their Facebook account to the software the researchers needed. So basically the sample size the study was actually done on is about a third of the initially reported value.
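For anyone who wants to check the “about a third” claim, here is a quick sketch of the arithmetic in Python. The counts are the ones quoted above; the stage labels and the helper function are my own naming, not the paper’s.

```python
# Counts quoted above; the stage labels are my own shorthand.
funnel = [
    ("initial survey sample", 3500),
    ("had a Facebook account", 2771),
    ("enrolled in the study", 1331),
    ("linked account successfully", 1191),
]

initial = funnel[0][1]

def share_of_initial(count, total=initial):
    """Percentage of the starting sample still present at a stage."""
    return round(100 * count / total, 1)

for stage, count in funnel:
    kept = share_of_initial(count)
    print(f"{stage}: {count} ({kept}% of the initial sample)")
```

By these counts, the group the main finding rests on is roughly 34% of the headline number.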

While this wasn’t necessarily deceptive, it did strike me as a bit odd. The 3,500 number is one of the least relevant numbers in that whole list. It’s useful to know that there might have been some selection bias going on with the folks who opted out, but that’s hard to see if you don’t report the final number.  Other than serving as a selection bias check though (which the authors did do), 63% of the participants had no link sharing data collected on them, and thus are irrelevant to the conclusions reported.  I assumed at first that reporters were getting this number from the authors, but it doesn’t seem like that’s the case.  The number 3,500 isn’t in the abstract. The press release uses the 1,300 number. From what I can tell, the 3,500 number is only mentioned by itself in the first data and methods section, before the results and “Facebook profile data” section clarify how the interesting part of the study was done. That’s where they clarify that 65% of the potential sample wasn’t eligible or opted out.

This was not a limited way of reporting things though, as even the New York Times went with the 3,500 number. Weirdly enough, the Guardian used the number 1,775, which I can’t find anywhere. Anyway, here’s my new definition:

Reporting the high water mark: A newspaper report about a study that uses the sample size of potential subjects the researchers started with, as opposed to the sample size for the study they subsequently report on.

I originally went looking for this sample size because I always get curious how many 65+ people were included in a study like this. Interestingly, I couldn’t actually find the raw number in the paper. This strikes me as important because if older people are online in smaller numbers than younger ones, the overall number of fake stories might be larger among younger people.

I should note that I don’t actually think the study is wrong. When I went looking in the supplementary table, I noted that the authors mentioned that the most commonly shared type of fake news article was actually fake crime articles. At least in my social circle, I have almost always seen those shared by older people rather than younger ones.

Still, I would feel better if the relevant sample size were reported first, rather than the biggest number the researchers looked at throughout the study.

GPD Lexicon: Proxy Preference

It’s been a little while since I added anything to the GPD Lexicon, but I got inspired this week by a Washington Post article on American living situations. It covered a Gallup Poll that asked people an interesting question:

“If you could live anywhere you wished, where would you prefer to live — in a big city, small city, suburb of a big city, suburb of a small city, town, or rural area?”

The results were then compared to where people actually live, to give the following graph:

Now when I saw this I had a few thoughts:

  1. I wonder if everyone actually knew what the definition of each of those was when they answered.
  2. I wonder if this is really what people want.

The first was driven by my confusion over whether the town I grew up in would be considered a town or a “small city suburb”.

The second thought was driven by my deep suspicion of the finding that almost 30% of the US actually wanted to live in rural areas, whereas only half that number actually live in one. While I have no doubt that many people actually do want to live in rural areas, it seems like for at least some people that might be a bit of a proxy for something else. For example, one of the most common reasons for moving away from rural areas is to find work elsewhere. Did saying you wanted to live in a rural area represent (for some people) a desire to not have to work or to be able to work less? A desire to not have economic factors influence where you live?

To test this theory, I decided to ask a few early to mid 20s folks at my office where they would live if they could live anywhere. All of them currently live in the city, but all gave different answers.  This matched the Gallup poll findings, where 29% of 18-29 year olds were living in cities, but only 17% said they wanted to. As they put it:

“One of the most interesting contrasts emerges in reference to big-city suburbs. The desire to live in such an area is much higher among 18- to 29-year-olds than the percentage of this age group who actually live there. (As reviewed above, 18- to 29-year-olds are considerably more likely to want to live in a big-city suburb than in a big city per se.)”

Given this, it seems like if I asked any of my young coworkers if they wanted to rent a room from me in my large city suburb home, they’d say yes. And yet I doubt they actually would. When they were answering,  almost none of them were talking about their life as it currently stands, but more what they hope their life could be. They wanted to get married, have kids, live somewhere smaller or in the suburbs. Their vision of living in the suburbs isn’t just the suburbs, it’s owning their own home, maybe having a partner, a good job, and/or kids. They don’t want a room in my house. They want their own house, and a life that meets some version of what they call success.

I think this is a version of what economists call a “revealed preference”, where you can tell what people really want by what they actually buy. In this version though, people are using their answers to one question to express other desires that are not part of the question. In other words this:

Proxy Preference: A preference or answer given on a survey that reflects a larger set of wants or needs not reflected in the question.

An example: Some time ago, I saw a person claiming that women should never plan to return to the workforce after having kids, because all women really wanted to work part time. To prove this, she had pointed to a survey question that asked women “if money were not a concern, what would your ideal work set up be?”. Unsurprisingly, many women said they’d want to work part time. I can’t find it now, but that question always seemed unfair to me. Of course lots of people would drop their hours if they had no money concerns! While many of us are lucky enough to like a lot of what we do, most of us are ultimately working for money.

A second example: I once had a pastor mention in a sermon that as a high schooler he and his classmates had been asked if they would rather be rich, famous, very beautiful or happy. According to his story, he was one of the only people who picked “happy”. When he asked his classmates why they’d picked the other things, they all replied that if they had those things they would be happy. It wasn’t that they didn’t want happiness, it was that they believed that wealth, fame and beauty actually led directly to happiness.

Again, I don’t think everyone who says they want to live in a rural area only means they want financial security or a slower pace of life, but I suspect some do. It would be interesting to narrow the question a bit to see what kind of answers you’d get. Phrasing it “if money were no object, where would you prefer to live today?” might reveal some interesting answers. Maybe ask a follow up question about “where would you want to live in 5 or 10 years?”, which might reveal how much of the answer had something to do with life goals.

In the meantime though, it’s good to remember that when a  large number of people say they’d prefer to do something other than what they are actually doing, thinking about the reasons for the discrepancy can be revealing.

Delusions of Mediocrity

I mentioned recently that I planned on adding monthly(ish) to my GPD Lexicon page, and my IQ post from Sunday reminded me of a term I wanted to add. While many of us are keenly aware of the problem of “delusions of grandeur” (a false sense of one’s own importance), I think fewer people realize that thinking oneself too normal might also be a problem.

In some circles this happens a lot when topics like IQ or salary come up, and a bunch of college educated people sit around and talk about how it’s not that much of an advantage to have a higher IQ or an above average salary. While some people saying this are making good points, some are suffering a delusion of mediocrity. They are imagining in these discussions that their salary or IQ is “average” and that everyone is working in the same range as them and their social circle. That is, they are debating IQ while only thinking about those with IQs above 110 or so, or salaries above the US median of $59,000. In other words:

Delusions of Mediocrity: A false sense of one’s own averageness. Typically seen in those with above average abilities or resources who believe that most people live like they do.

Now I think most of us have seen this on a personal level, but I think it’s also important to remember it on a research level. When research finds things like “IQ is correlated with better life outcomes”, they’re not just comparing IQs of 120 to IQs of 130 and finding a difference….they’re comparing IQs of 80 to IQs of 120 and finding a difference.

On an even broader note, psychological research has been known to have a WEIRD problem. Most of the studies we see describing “human” behavior are actually done on those in Western, educated, industrialized, rich and democratic countries (aka WEIRD countries) that do NOT represent the majority of the world population. Even things like optical illusions have been found to vary by culture, so how can we draw conclusions about humanity while drawing from a group that represents only 12% of the world’s population? The fact that we don’t often question this is a mass delusion of mediocrity.

I think this all gets tempting because our own social circles tend to move in a narrow range. By virtue of living in a country, most of us end up seeing other people from that country the vast majority of the time. We also self segregate by neighborhood and occupation. Just another thing to keep in mind when you’re reading about differences.

Tidal Statistics

I’m having a little too much fun lately with my “name your own bias/fallacy/data error” thing, so I’ve decided I’m going to make it a monthly-ish feature. I’m gathering the full list up under the “GPD Lexicon” tab.

For this month, I wanted to revisit a phrase I introduced back in October: buoy statistic. At the time I defined the term as:

Buoy statistic: A statistic that is presented on its own as free-floating, while the context and anchoring data is hidden from initial sight.

This was intended to cover a pretty wide variety of scenarios, such as when we hear things like “women are more likely to do thing x” without being told that the “more likely” is 3 percentage points over men.

While I like this term, today I want to narrow it down to a special subcase: tidal statistics. I’m defining those as…..

Tidal Statistic: A metric that is presented as evidence of the rise or fall of one particular group, subject or issue, during a time period when related groups also rose or fell on the same metric

So for example, if someone says “after the CEO said something silly, that company’s stock went down on Monday” but they don’t mention that the whole stock market went down on Monday, that’s a tidal statistic. The statement by itself could be perfectly true, but the context changes the meaning.
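To make the stock example concrete, here is a rough sketch in Python. The percentages are made up purely for illustration and the function name is my own shorthand; the only point is that the company’s move should be read against the tide’s move.

```python
def excess_change(group_change_pct, baseline_change_pct):
    """Change in the group beyond what the wider 'tide' did (percentage points)."""
    return group_change_pct - baseline_change_pct

# Hypothetical numbers for illustration only; none come from a real company.
company_change = -3.0   # the company's stock on Monday, in percent
market_change = -2.5    # the whole market on Monday, in percent

print(f"Company: {company_change}%, market: {market_change}%")
print(f"Move not explained by the tide: "
      f"{excess_change(company_change, market_change)} points")
```

With these invented numbers, most of the “drop after the CEO spoke” is just the market, and only half a point is left over to argue about.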

Another example: Vox recently did an article about racial segregation in schools in which they presented this graph:

Now this graph initially caught my eye because they had initially labeled it as being representative of the whole US (they later went back and corrected it to clarify that this was just for the south), and I started to wonder how this was impacted by changing demographic trends. I remembered seeing some headlines a few years back that white students were now a minority-majority among school age children, which means at least some of that drop is likely due to a decrease in schools whose student populations are > 50% white.

Turns out my memory was correct, and according to the National Center for Education Statistics, in the fall of 2014, white students became a minority majority in the school system at 49.5% of the school age population.  For context, when the graph starts (1954) the US was about 89% white. I couldn’t find what that number was for just school age kids, but it was likely much higher than 49.5%.   So basically if you drew a similar graph for any other race, including white kids, you would see a drop. When the tide goes down, every related metric goes down with it.

Now to be clear, I am not saying that school segregation isn’t a problem or that the Vox article gets everything wrong. My concern is that the graph was used as one of their first images in a very lengthy article, and they don’t mention the context or what that might mean for advocacy efforts. Looking at that graph, we have no idea what percentage of that drop is due to a shrinking white population and what is due to intentional or de facto segregation. It’s almost certainly not possible to substantially raise the number of kids going to schools that have more than 50% white kids, simply because the number of schools like that is shrinking. Vox has other, better, measures of success further down in the article, but I’m disappointed they chose to lead with one that has a major confounder baked in.

This is of course the major problem with tidal statistics. The implication tends to be “this trend is bad, following our advice can turn it around”. However, if the trend is driven by something much broader than what’s being discussed, any results you get will be skewed. Some people exploit this fact, some step into it accidentally, but it is an interesting way that you can tell the truth and mislead at the same time.

Stay safe out there.


Magnitude Problems, Now With Names

In my last blog post, I put out a call for name ideas for a particular “potentially motivated failure to recognize that the magnitude of numbers matters” problem I was seeing, and man did you all come through! There were actually 3 suggestions that got me excited enough that I wanted to immediately come up with definitions for them, so I now have 3 (actually 4) new ways to describe my problem. A big thanks to J.D.P Robinson, Korora, and the Assistant Village Idiot for their suggestions.

Here are the new phrases:

Hrair Line: a somewhat arbitrary line past which all numbers seem equally large

Based on the book “Watership Down” where characters use the word “hrair” to mean “any number greater than 4”. We all have a line like this when numbers get big enough….I doubt any of us truly registers the difference between a quadrillion and a sextillion unless we encounter those numbers in our work. Small children do this with time (anything other than “right now” is “a long time”), and I’d guess all but the richest of us do this with money (a yearly salary of $10 million and $11 million are both just “more than I make” to me). On its own, this is not necessarily a bad thing, but rather a human tendency to only wrap our heads around the number values that matter most to us. This tendency can be misused however, which is where we get….

The Receding Hrair Line: The tendency to move one’s hrair line based on the subject under discussion, or for one group and not another, normally to benefit your argument

Also known (in my head) as the Soros/Koch brothers problem. Occasionally you’ll see references to charitable gifts by those controversial figures, and it’s always a little funny to see how people perceive those numbers based on their pre-conceived feelings about Soros/Koch. I’ve seen grants of $5000 called “a small grant” or credited with helping fund the whole organization. You could certainly defend either stance in many cases, but my concern is that people frequently seem to start from their Soros/Koch feelings and then bring the numbers along for the ride. They are not working from any sort of standard for what a $5000 grant means to a charity, but rather a standard for what a George Soros or Koch brothers gift means and working backwards. This can also lead to….

Mountain-Molehill Myopia: the tendency to get so fixated on an issue that major changes in magnitude of the numbers involved do not change your stance. Alternatively, being so fixated on an issue that you believe that any change to the number completely proves your point.

A close relative of number blindness, but particularly focused on the size of the numbers. Taking my previous Soros/Koch example, let’s say someone had defended the “a $5000 grant is not a big deal” stance. Now let’s say that there was a typo here, and it turned out that was a $50,000 or a $500 grant. For most people, this would cause you to stop and say “ok, given this new information, let me rethink my stance”. For those suffering from Mountain-Molehill Myopia however, this doesn’t happen. They keep going and act like all their previous logic still stands. This is particularly bizarre, given that most people would have no problem with you pausing to reassess given new information. Only the most dishonest arguers are going to hold you accountable for previous logic if new information comes up. The refusal to reassess actually makes you more suspect.

The alternative case here is when someone decides that a small change to the numbers now means EVERYTHING has changed. For example, let’s say the $5000 turns out to be $4900 or $5100. That shouldn’t change anything (unless there are tax implications that kick in at some level of course), but sometimes people seriously overreact to this. You said $5000 and it turns out it was $4900, this means your whole argument is flawed and I automatically win.

There is clearly a sliding scale here, as some changes are more borderline. A $5000 grant vs a $2000 grant may be harder to sort through. For rule of thumb purposes, I’d say an order of magnitude change requires a reaction, and less than that is a nuanced change. YMMV.
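That rule of thumb is easy to write down. Here is a minimal sketch in Python, assuming “order of magnitude” means a factor of ten in either direction; the function names and the wording of the two verdicts are my own, and the threshold is a parameter precisely because your mileage may vary.

```python
import math

def orders_of_magnitude(old, new):
    """Size of the change between two positive amounts, in powers of ten."""
    if old <= 0 or new <= 0:
        raise ValueError("amounts must be positive")
    return abs(math.log10(new / old))

def reaction(old, new, threshold=1.0):
    """Rule of thumb: a change of about 10x or more calls for a rethink."""
    size = orders_of_magnitude(old, new)
    # isclose guards against floating point landing a hair under the threshold.
    if size >= threshold or math.isclose(size, threshold):
        return "rethink your stance"
    return "nuanced change"

# The grant amounts used as examples in this post.
for new in (50_000, 500, 4_900, 5_100, 2_000):
    print(f"$5000 -> ${new}: {orders_of_magnitude(5000, new):.2f} orders, "
          f"{reaction(5000, new)}")
```

Under this sketch the $2000 grant lands on the “nuanced” side of the line, which seems right for a borderline case: it deserves a conversation, not an automatic win for either side.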

Now, all of these errors can be annoying in a vacuum, but they get worse when onlookers start jumping in. This is where you get…..

Pyrgopolynices’ numbers: Numbers that are wrong or over-inflated, but that you believe because they are supported by those around you due to tribal affiliations rather than independent verification

Based on the opening scene of  Plautus’  Braggart Soldier, Korora provided me with the context for this one (slightly edited from the original comment):

…the title character’s parasītus, or flatterer-slave, is repeating to his master said master’s supposed achievements on the battlefield:

Artotrogus: I remember: One hundred fifty in Cilicia. A hundred in Scytholatronia*, thirty Sardians, sixty Macedonians. Those are the men thou slewest in one day.
Pyrgopolynices: How many men is that?
Artotrogus: Seven thousand.
Pyrgopolynices: It must be as much. [Thou] correctly hast the calculation.

*there is no such place

After reading this I got the distinct feeling that we did away with flatterer-slaves, and replaced them with social media.

As someone who likes to correct others’ numbers, you’d think I’d be all about chiming in on Facebook/Twitter/whatever conversations about numbers or stats, but I’m not. Starting about 3 years ago, I stopped correcting anyone publicly and started messaging people privately when I had concerns about things they posted. While private messages seemed to get an amiable response and a good discussion almost 90% of the time, correcting someone publicly seemed to drive people out of the woodwork to claim that those numbers were actually right. Rather than acknowledge the error as they would privately, my friends would then turn their stats claims into Pyrgopolynices’ numbers….numbers that people believed because other people were telling them they were true. Of course those people were only telling them they were true because someone on “their side” had said them to begin with, so the sense of checks and balances was entirely fictitious.

Over the long term, this can be a very dangerous issue as it means people can go years believing certain things are true without ever rechecking their math.

That wraps it up! Again, thank you to J.D.P Robinson for mountain-molehill myopia, AVI for throwing the word “hrair” out there, and Korora for the backstory on Pyrgopolynices’ numbers. In related news, I think I may have to start a “lexicon” page to keep track of all of these.

The (Magnitude) Problem With No Name

As most of you know, I am a big fan of amusing myself by coining new names for various biases/numerical tomfoolery I see floating around on the internet. I have one that’s been bugging me for a little while now, but I can’t seem to find a good name for it. I tried it out on a bunch of people around Christmas (I am SUPER fun at parties guys), but while everyone got the phenomenon, no one could think of a pithy name. Thus, I turn to the internet.

The problem I’m thinking of is a specific case of what I’ve previously called Number Blindness, or “the phenomenon of becoming so consumed by an issue that you cease to see numbers as independent entities and view them only as props whose rightness or wrongness is determined solely by how well they fit your argument”. In this case though, it’s not just that people don’t care if their number is right or wrong, it’s that they seem oddly unmoved by the fact that the number they’re using isn’t even the right order of magnitude. It’s as though they think that any “big” number is essentially equal to any other big number, and therefore accuracy doesn’t matter any more.

For example, a few weeks ago Jenna Fischer (aka Pam from the Office) got herself in trouble by Tweeting out (inaccurately) that under the new tax bill teachers could no longer deduct their classroom expenses. She deleted it, but while I was scrolling through the replies I came across an exchange that went something like this:

Person 1: Well teachers wouldn’t have to buy their own supplies if schools stopped paying their football coaches $5 million a year

Person 2: What high school pays their coach $5 million a year?

Person 3: 28 coaches in Texas make over $120,000 a year.

Person 2: $120,000 is not $5 million.

Person 3: Well that’s part of an overall state budget of $20-25 million just for football coaches. (bs king’s note: I couldn’t find a source for this number, none was given in the Tweet)

Person 2: ….

Poor person 2.

Now clearly there was some number blindness here….person 1 and 3 only seemed to care about the idea that numbers could support their cause, not the accuracy of said numbers. But it was the stunning failure to recognize order of magnitude that took my breath away. How could you seriously reply to a comment about $5 million salaries with an article about $120,000 salaries and feel you’d proved a point? Or respond to a second query with an overall state budget, which is an order of magnitude higher than that? It’s like some sort of big number line got crossed, and now it’s all fair game.
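For a sense of just how far apart the numbers in that exchange are, here is a quick sketch in Python. The dollar figures are the ones from the Tweets (including the unsourced budget range); the variable and function names are mine.

```python
import math

claimed_salary = 5_000_000      # Person 1's claimed coach salary
cited_salary = 120_000          # Person 3's figure for 28 Texas coaches
state_budget_low = 20_000_000   # Person 3's unsourced budget range
state_budget_high = 25_000_000

def times_larger(a, b):
    """How many times larger a is than b."""
    return a / b

gap = times_larger(claimed_salary, cited_salary)
print(f"Claim vs. cited salary: {gap:.1f}x "
      f"({math.log10(gap):.1f} orders of magnitude)")
print(f"Budget vs. claimed salary: "
      f"{times_larger(state_budget_low, claimed_salary):.0f}x to "
      f"{times_larger(state_budget_high, claimed_salary):.0f}x")
print(f"Budget vs. cited salary: "
      f"{times_larger(state_budget_low, cited_salary):.0f}x to "
      f"{times_larger(state_budget_high, cited_salary):.0f}x")
```

The claimed salary is more than forty times the cited one, and the budget figure is a different kind of quantity altogether, so none of the three numbers can really stand in for another.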

I suspect this happens more often the bigger the numbers get….people probably drive astronomers nuts by equating things like a billion light years and a trillion light years away. Given that I’ve probably done this I won’t get too cocky here, but I would like a name for the phenomenon. Any thoughts are appreciated.