“The first step is to measure whatever can be easily measured. This is OK as far as it goes. The second step is to disregard that which can’t be easily measured or to give it an arbitrary quantitative value. This is artificial and misleading. The third step is to presume that what can’t be measured easily really isn’t important. This is blindness. The fourth step is to say that what can’t be easily measured really doesn’t exist. This is suicide.” – Daniel Yankelovich
“Andy Grove had the answer: For every metric, there should be another ‘paired’ metric that addresses the adverse consequences of the first metric.” – Marc Andreessen
“I didn’t feel the ranking system you created adequately captured my feelings about the vendors we’re looking at, so instead I assigned each of them a member of the Breakfast Club. Here, I made a poster.” – me
I have a confession to make: I don’t always like metrics. There. I said it. Most people wouldn’t hesitate to make a declaration like that, but for someone who spends a good chunk of her professional and leisure time playing around with numbers, it’s kind of a sad thing to have to say. Some metrics are totally fine, of course, and super useful. On the other hand, there are times when the numbers subsume the actual goal and become the thing front and center. This is bad. In statistics, numbers are a means to an end, not the end. I need a name for this flip-flop, so from here on out I’m calling it “The Perfect Metric Fallacy”.
The Perfect Metric Fallacy: The belief that if one simply finds the most relevant or accurate set of numbers possible, all bias will be removed, all stress will be negated, and the answer to complicated problems will become simple, clear and completely uncontroversial.
As someone who tends to blog about numbers and such, I see this one a lot. On the one hand, data and numbers are wonderful because they help us identify reality, improve our ability to compare things, spot trends, and overcome our own biases. On the other hand, picking the wrong metric out of convenience or bias and relying too heavily on it can make everything I just named worse, plus piss off everyone around you.
While I have a decent number of my own stories about this, what frustrates me is how many I hear from others. When I tell people these days that I’m into stats and data, almost a third respond with some sort of horror story about how data or metrics are making their professional lives miserable. When I talk to teachers, that number goes up to 100%.
This really bums me out.
It seems that after years of disconnected individuals going with their guts and kind of screwing everything up, people decided we should now put numbers on those grand ideas to prove they were going to work. When those ideas fail, people either blame the numbers (if you’re the person who made the decision) or the people who like the numbers (if you’re everybody else). So why do we let this happen? Almost everyone knows up front that numbers are really just there to guide decision making, so why do we get so obsessed with them?
- Math class teaches us that if you play with numbers long enough, there will be a right answer

There are plenty of times in life when your numbers have to be perfect. Math class. Your tax return. You know the drill. Endless calculations, significant figures, etc, etc. In statistics, that’s not true. There’s a phenomenon known as “false precision”, where you present data in a way that makes it look more accurate than it really can be. My favorite example of this is a clinic I worked with at one point. They reported weight to two decimal places (as in 130.45 lbs), but didn’t have a standard around whether or not people had to take their coats off before being weighed.

At the beginning of the post, I put a blurb about converting a ranking system into a Breakfast Club poster. This came up after I was presented with a 100-point scale to rank 7 vendors against each other in something like 16 categories. When you have 3 days to read through over 1000 pages of documentation and assign scores, your eyes start to blur a little and you start getting a little existential about the whole thing. Are these 16 categories really the right categories? Do they cover everything I’m getting out of this? Do I really feel 5 points better about this vendor than that other one, and are both of them really 10 points better than that 3rd one? Or did I start increasing the strictness of my rankings as I went along, or get nicer as I had to go faster, or what? It wasn’t a bad ranking system, but the problem was me. If I can’t promise I kept my rankings consistent over 3 days, how can I attest to my numbers at the end?
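To see how false precision creeps in, here’s a minimal sketch (my own numbers, not the clinic’s) of the coat problem: the scale reads to two decimal places, but an unstandardized ~2 lb coat swamps that precision.

```python
import random

# Hypothetical simulation: the scale itself is precise (reads to 0.01 lb),
# but whether the patient keeps a ~2 lb coat on is not standardized.
# Reporting "130.45 lbs" implies a precision the process can't support.
random.seed(0)

true_weight = 130.0
readings = []
for _ in range(20):
    coat = random.choice([0.0, 2.0])           # coat policy left to chance
    scale_noise = random.uniform(-0.05, 0.05)  # the instrument's real error
    readings.append(round(true_weight + coat + scale_noise, 2))

spread = max(readings) - min(readings)
print(f"sample reading: {readings[0]:.2f} lbs")   # looks very precise...
print(f"spread across visits: {spread:.1f} lbs")  # ...but varies by ~2 lbs
```

The two decimal places suggest hundredth-of-a-pound accuracy, while the unmeasured coat moves readings around by two whole pounds.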
- We want numbers to take the hit for unpleasant truths

A few years ago someone sent me a comic strip that I’ve since sent along to nearly everyone who complains to me about bad metrics in the workplace. It almost always gets a laugh, and most people then admit that it’s not the numbers they have a problem with, it’s the way they’re being used. There’s a lot of unpleasant news to deliver in this world, and people love throwing up numbers to absorb the pain. See, I would totally give you a raise or more time to get things done, but the numbers say I can’t. When people know you’re doing exactly what you were going to do to begin with, they don’t trust any number you put up. This gets even worse in political situations. So please, for the love of God, if the numbers you run happen to match your pre-existing expectations, let people look over your methodology, or show where you really tried to prove yourself wrong. Failing to do this gives all numbers a bad rap.
- Good data is hard to find

One of the reasons statistician continues to be a profession is that good data is really, really hard to find, and good methods for analysis require a lot of legwork. Over the course of trying to find a “perfect metric”, many people end up believing that part of being “perfect” is being easily obtainable. As the first quote above suggests, this is ridiculous. This mistake even has a name, the McNamara Fallacy, which warns us that the easiest things to quantify are not always the most important.
- Our social problems are complicated

The power of numbers is strong. Unfortunately, the power of some social problems is even stronger. Most of our worst problems are multifaceted, which of course is why they haven’t been solved yet. When I decided to use metrics to address my personal weight problem, I came up with 10 distinct categories to track for one primary outcome measure. That’s 3,650 data points a year, and that’s just for me. Scaling that up is immensely complicated, and introduces all sorts of issues of variability among individuals that don’t exist when you’re looking at just one person. Even if you do luck out and find a perfect metric, in a constantly shifting system there’s a good chance that improving that metric will cause a problem somewhere else. Social structures are like Jenga towers, and knocking one piece out of place can have unforeseen consequences. Proceed with caution, and don’t underestimate the value of small successes.
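The bookkeeping here is easy to sketch, assuming one reading per category per day (my assumption; the post doesn’t say how often each category was logged):

```python
# Hypothetical back-of-the-envelope for personal tracking volume,
# assuming one reading per category per day.
categories = 10
days_per_year = 365
per_person = categories * days_per_year
print(per_person)  # 3650 data points a year for one person

# Scaling the same scheme to a (hypothetical) group makes the volume,
# and the person-to-person variability, grow fast.
group_size = 1000
print(per_person * group_size)  # 3650000 points a year for 1,000 people
```

The raw count is the easy part; the hard part the section describes, variability between people, doesn’t show up in any multiplication.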
Now again, I do believe metrics are incredibly valuable and, used properly, can generate good insights. However, in order to prevent your perfect metric from turning into a numerical bludgeon, you have to keep an eye on what your goal really is. Are you trying to set kids up for success in life, or to get them to score well on a test? Are you trying to maximize employee productivity, or to keep employees over the long term? Are you looking for a number or a fall guy? Can you know what you’re looking to find out with any sort of accuracy? Things to ponder.
4 thoughts on “The Perfect Metric Fallacy”
WRT right answers, I’m reliably informed that income tax law is inconsistent. The same data can be analyzed different ways, and result in different tax liabilities.
Good data is expensive.
“Good, fast, cheap: pick one.”
Maybe for social statistics, the choices should be “Cheap, accurate, useful – pick one.” Even when the problem doesn’t lend itself to numbers, there seems to be a push to pick some metric so the decision will be impartial. And “scientific.”
I am sure you already know Einstein’s statement (perhaps my paraphrase):

“All that can be counted doesn’t count.
And all that counts can’t be counted.”
Oh, you mean those satisfaction surveys businesses send out – including hospitals – don’t actually capture how well the business is doing? At least they provide work for people who are supposed to be tracking “quality.” And don’t get me started on Wellness.
The sociobiology and Human Biodiversity people like to work with things that are easier to measure: crime rates, # children per female, IQ, track-and-field numbers. Most of them seem to understand pretty thoroughly that a lot of other things go into human value, things which are harder to measure. It is the semi-educated general audience that leaps to the conclusion that some inflated claim is being made about people being “better” on the basis of a single metric, and so they start to make ridiculous counterclaims. Or they pick some other measure that they like and go with that – like years of education, or income, which don’t give as clear a picture of what is being measured.
To put it in baseball terms – baseball is often a useful analogy for anything statistical – it’s like tracking strikeouts for a pitcher. It tells you something nice. But it doesn’t tell you everything, so people either overrate or underrate its meaning.