I’ve been meaning to post something on David Brooks (Brooks’s? Brooks’?) column from a few weeks ago on the “Philosophy of Data”. A couple of readers sent this to me (thanks all!) and I thought it was pretty interesting. He asks how the rise of big data is going to change things, and raises a few pertinent questions:
Over the next year, I’m hoping to get a better grip on some of the questions raised by the data revolution: In what situations should we rely on intuitive pattern recognition and in which situations should we ignore intuition and follow the data? What kinds of events are predictable using statistical analysis and what sorts of events are not?
I think those questions are relevant, and I was thinking about them when this cartoon popped up in my newsfeed on Facebook a few days ago:
In the post-election fallout, a lot of the geek blogs I read deeply questioned Romney’s data collection. Several supposed insiders claimed that while there were many in his campaign charged with data collection, he lacked people who were performing what is scientifically known as “the sniff test”.
Now I have no idea if the stuff about the Romney campaign is true (though I did know some folks on the Obama team and their data gathering was quite stunning, to the point of being mildly creepy), but I think questions about data vs. gut reactions are going to fuel big battles in areas like politics. I mean, anyone who’s seen or read Moneyball knows that it took a while to get this into baseball, and baseball’s got far fewer moving targets than politics.
What I think is interesting though is that integrating large data sets into a highly charged and changing environment actually isn’t that hard, and I’m not sure why intuition and data get set up as opposing forces. They actually work quite well together if you let them. Here are the basic steps:
1. Figure out what problem you’re trying to solve
2. Get a large relevant data set
3. Analyze it until you get any numbers you can think of that might be helpful
4. Find several rational people who are deeply embedded in the problem area
5. Ask them what they think of the data, get the gut reaction
6. Explain to them where you got the data, see if their reaction is the same
7. Ask them if anyone they know would disagree with this data, and if so why
8. Ask them if this helps them know how to proceed
9. Ask them if there’s any other data that might be useful for this problem
10. Go find that, repeat 5-9.
Data is helpful, but easily manipulated. We need a combination of data and good gut reactions to figure out where to go in high-stakes environments. People directly involved in a situation are always going to be both the best and worst judges of the situation... and that’s okay. Data geeks should set themselves somewhere in the middle, and always be questioning. Data doesn’t make you an expert, but it can give you standing to challenge the experts.
There’s no magic bullet here. There’s only another very useful tool in (what should already be) a well-stocked toolbox.