What I’m Reading: July 2016

This month my book was The Signal and the Noise, which I enjoyed enough that I’m doing a chapter-by-chapter contingency matrix series on it over at the other blog.

Sampling strategy and research design can sound really boring, until you blow through $1.3 billion and have nothing to show for it. This article on the long, slow death of the National Children’s Study should be assigned reading for anyone who ever wanted to know why it’s so damn hard to get good research done.

Did you hear the one about all the Brexit voters furiously Googling “What is the EU?” after they voted to leave it? Yeah? That was pretty bogus. It was about 1,000 people total, and no one knows whether their Googling was “furious”, how they voted, or whether they were even eligible to vote.

This article is from a few months ago, but it’s an interesting look at motivations and political bias. It turns out people do better on “political fact” tests when you offer them money for right answers than when they take them with no incentives. The Volokh Conspiracy discusses implications for our understanding of political ignorance.

Also from a few months ago: the Quartz guide to bad data. More properly it might be called “guide to cleaning up your spreadsheet”. If you ever actually get a large data file and don’t know how to find potential problems before you analyze it, this is a good start.

Another good guide is this list of data science books from Stitch Fix. Stitch Fix is an online personal stylist service that I just so happen to use to get most of my work clothes. They also have a REALLY active data science division that helps come up with clothing recommendations. Good stuff.

This is an interesting data visualization of the changing American obesity rates.

I actually listened to this one rather than read it, but Science Friday had an interesting piece about “differential privacy” and response randomization. The transcript is available here, and there’s some interesting discussion about honesty, privacy, and research in the big data era.
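
For anyone curious what response randomization can look like in practice, here is a minimal sketch of the textbook coin-flip design. It is an illustration under stated assumptions, not something taken from the Science Friday segment: the function names, the 30% "true" rate, and the simulated population are all made up. Each person flips a coin. On heads they answer honestly. On tails they flip again and report that second flip instead. No single answer reveals much, but the overall rate can still be recovered.

```python
import random


def randomized_response(truth: bool, rng: random.Random) -> bool:
    """Return a privacy-protected answer (hypothetical helper)."""
    # First coin flip: heads means answer truthfully.
    if rng.random() < 0.5:
        return truth
    # Tails: ignore the truth and report a second coin flip.
    return rng.random() < 0.5


def estimate_true_rate(responses: list) -> float:
    """Undo the added noise to estimate the real 'yes' rate."""
    observed = sum(responses) / len(responses)
    # P(yes) = 0.5 * true_rate + 0.25, so solve for true_rate.
    return 2 * (observed - 0.25)


# Simulated population; the 30% true rate is an assumption for the demo.
rng = random.Random(0)
true_rate = 0.30
population = [rng.random() < true_rate for _ in range(100_000)]

responses = [randomized_response(t, rng) for t in population]
estimate = estimate_true_rate(responses)
print(f"estimated rate: {estimate:.3f}")
```

The trade-off is that the estimate is noisier than a direct survey of the same size, so you need more respondents to get the same precision.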