Tracking the wild bad data

As someone who spent 3 years studying family dynamics in grad school, I was pretty interested in the NYT piece that ran last week on class divides in single vs married households.  The article generated a lot of buzz, and if you haven’t read it, I would recommend it.

People seemed to either love or hate this article, and it’s stirred up a whole lot of discussion online.  One of the more interesting points that came up, though, was the question of why the focus was on single moms as opposed to deadbeat dads.

This led to a lot of quoting of an interesting statistic regarding custodial parents and child support.  I first read it in a piece by Amanda Marcotte over at Slate, who put it this way:

…. in a substantial number of cases, the men just quit their families. That’s why only 41 percent of custodial parents receive child support.

Now, I’ve perused internet comment boards enough to know that there are a LOT of men out there griping about how much they pay in child support.  I was a little shocked to read that apparently 59% don’t give anything.  I clicked on the closest link she had provided…..which took me over to the NYT Economix blog and an item by Nancy Folbre. There was the stat again, except with a few more qualifiers:

In 2009, the latest year for which data are available, only about 41 percent of custodial parents (predominantly women) received the child support they were owed. Some biological dads were deadbeats. 

So that frames it a little differently.  It’s still a little unclear from that statement, but it started to occur to me that this probably meant only 41% were up to date on their support payments…not that only 41% of non-custodial parents were paying.

I clicked on the link provided by Folbre, and got to the Census Bureau website, which put it all this way:

 In 2009, 41.2 percent of custodial parents received the full amount of child support owed them, down from 46.8 percent in 2007, according to a report released today by the U.S. Census Bureau. The proportion of these parents who were owed child support payments and who received any amount at all — either full or partial — declined from 76.3 percent to 70.8 percent over the period.

Now that’s still a lot of deadbeats, but it is a slightly different picture from the one we started with.  When I clicked through from the Census Bureau snapshot to the report it originally came from, I noticed something else interesting….only about half of all custodial parents have court-ordered support, and the non-payment stats above appear to reflect only what is happening in the court-ordered cases.  The non-court-ordered cases are certainly hazy….30% of custodial parents said they never went to court because they knew the other person couldn’t pay….but it is interesting that the quoted stats only apply to half of the custodial parent cases.
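
Just to make the scope of those numbers concrete, here’s a rough back-of-the-envelope sketch.  The cohort of 100 is hypothetical and the “roughly half” figure is my reading of the report above, so treat this as an illustration rather than the Census Bureau’s own breakdown:

```python
# Rough illustration only: the 41% figure describes custodial parents who were
# actually OWED support (roughly half of all custodial parents, per the report),
# not all custodial parents.
custodial_parents = 100      # hypothetical cohort
share_owed_support = 0.5     # roughly half have court-ordered support (approximate)
full_payment_rate = 0.412    # 41.2% of those owed received the full amount (2009)
any_payment_rate = 0.708     # 70.8% of those owed received at least partial payment

owed = custodial_parents * share_owed_support
print(f"Owed support:         {owed:.0f} of {custodial_parents}")
print(f"Received full amount: {owed * full_payment_rate:.0f} of {custodial_parents}")
print(f"Received any amount:  {owed * any_payment_rate:.0f} of {custodial_parents}")
# So "only 41 percent receive child support" means roughly 21 of 100 custodial
# parents got the FULL amount owed, not that 59 of 100 got nothing.
```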

Overall, I must say I kind of enjoyed tracking the evolution of a stat (in reverse).  It’s not often you get to actually see how things evolve from the primary source to several steps out….and it was an interesting mental exercise.  Thanks for taking the journey with me.

Review and redraft – research in government

A few months ago, my father let me know that New Hampshire had passed a law that required the various government agencies to update their rules/statutes every few years (5 years? 7 years? Dad, help me out here).  I’m not entirely sure what the scope of this law was, but my Dad mentioned that it was actually quite helpful for his work at the DMV.  It had surprised him how many of their rules did not actually reflect the changing times, and how helpful it was to update them.  One of the biggest rules they found needed updating was that in certain situations, they were still only allowed to accept doctors’ notes from M.D.s….so anyone who used a nurse practitioner for primary care couldn’t get an acceptable note….despite NPs being perfectly qualified to comment on the situations they were assessing.  It wasn’t that the note needed to be from an MD, it was just that when the rule was written, very few people had anything other than a primary care MD.  I found the entire idea pretty good and proactive.

I was thinking about that after my post yesterday on South Dakota’s law regarding abortion risk disclosure.  I was wondering how many states, if any, require that laws based primarily on current scientific research be reviewed within any given time period.

Does anyone know if any states require this?  Or is it solely up to those who oppose certain laws to challenge them later?

Correlation and Causation – Abortion and Suicide meet the 8th circuit

Perhaps it’s the lawyer’s daughter in me, but I find watching courts rule on the presentation of data totally fascinating.

Today, the 8th Circuit Court of Appeals had to make just such a call.

The case was Planned Parenthood v Mike Rounds and was a challenge to a 2005 law that required doctors to inform patients seeking abortions that there was “an increased risk of suicide ideation and suicide”.  This was part of the informed consent process under the “all known medical risks” section.

Planned Parenthood challenged on the grounds that this was being presented as a causal link, and was therefore a violation of the doctor’s freedom of speech.

It’s a hot topic, but I tried to get around the controversy to the nuts and bolts of the decision.  I was interested in how the courts evaluated what research should be included and how.

Apparently the standard is as follows:

…while the State cannot compel an individual simply to speak the State’s ideological message, it can use its regulatory authority to require  a  physician to provide  truthful,  non-misleading  information relevant to a patient’s decision to have an abortion, even if that information might also encourage the patient to choose childbirth over abortion.”  Rounds, 530 F.3d at 734-35; accord Tex. Med. Providers Performing Abortion Servs. v. Lakey, 667 F.3d 570, 576-77 (5th Cir. 2012).  

So in order for a required disclosure to be struck down, it must be proven to be “either untruthful, misleading or not relevant to the patient’s decision to have an abortion.”


It was the misleading part that the challenge focused on.  The APA has apparently endorsed the idea that any link between abortion and suicide is NOT causal.  The theory is that those with pre-existing mental health conditions are both more likely to have unplanned pregnancies and to later commit suicide. It was interesting to read the huge debate over whether the phrase “increased risk” implied causation (the court ruled causation was not implicit in this statement).


Ultimately, it was decided that this statement would be allowed as part of informed consent.  The conclusion was an interesting study in what the courts will and will not vouch for:

We acknowledge that these studies, like the studies relied upon by the State and Intervenors, have strengths as well as weaknesses. Like all studies on the topic, they must make use of imperfect data that typically was collected for entirely different purposes, and they must attempt to glean some insight through the application of sophisticated statistical techniques and informed assumptions. While the studies all agree that the relative risk of suicide is higher among women who abort compared to women who give birth or do not become pregnant, they diverge as to the extent to which other underlying factors account for that link.  We express no opinion as to whether some of the studies are more reliable than others; instead, we hold only that the state legislature, rather than a federal court, is in the best position to weigh the divergent results and come to a conclusion about the best way to protect its populace.  So long as the means chosen by the state does not impose an unconstitutional burden on women seeking abortions or their physicians, we have no basis to interfere.

I did find it mildly worrisome that the presumption is that the state legislators are the ones evaluating the research.  On the other hand, it makes sense to put the onus there rather than the courts. It’s good to know what the legal standards are though….it’s not always about the science.

Political ages…mean vs median?

I just found out The Economist has a daily chart feature!

Today’s graph about age of population vs age of cabinet ministers is pretty fascinating:

It did leave me with a few questions though…..who did they count as cabinet ministers?  I don’t know enough about the governments in these countries to know what that equates to.  Also, why average vs median?  
I initially thought this chart might have been representing Congress, not the Cabinet.  I took a look at my old friend the Congressional Research Service Report and discovered that at the beginning of the 112th Congress in 2011, the average age was 57.7 years, which would make this chart about right.  I had to dig a bit further to get the ages of the Cabinet, but it turns out their average age is 59.75.  I was surprised the data points would be so close together actually….especially since that 57.7 was for Jan 2011, so it’s actually 59.2 or so now.

In case you’re curious, 7 members of the cabinet are under 60.  The youngest is Shaun Donovan (46), Department of Housing and Urban Development.  The oldest is Leon Panetta (74), Department of Defense.  Panetta is actually the only member over 70.  Half of them are in their 60s, 5 in their 50s, and 2 in their 40s.

I felt a little ashamed that I could only have matched name to position for 5 of them before looking them all up.  That’s not great, especially when you realize I’m counting Biden.  Still, I comforted myself with the fact that I bet that beats a very large percentage of Americans.

A quick look for other data suggests that the median age of populations is the more commonly reported value.  The median age of the cabinet was actually 61, in case you’re curious.
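
Since the mean vs median question keeps coming up, here’s a minimal sketch of why the two can diverge.  The ages below are made up for illustration (they are not the actual Cabinet members’ ages), but they show how a couple of younger members can pull the mean below the median:

```python
from statistics import mean, median

# Made-up ages (NOT the actual Cabinet), skewed by a couple of younger members.
ages = [46, 48, 55, 56, 57, 58, 59, 61, 62, 63, 64, 65, 66, 67, 74]

print(f"mean:   {mean(ages):.2f}")  # pulled toward the outliers
print(f"median: {median(ages)}")    # the middle value, much less sensitive to extremes
```

Which one a chart should report depends on whether you care about the “typical” member or want every value, outliers included, to count.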

Are law schools liable for misleading statistics?

An interesting snippet from over at the Volokh Conspiracy, where former students sued their law school for publishing misleading statistics.

The court ruled that the salary statistics published by the school were truly misleading, but in the end caveat emptor prevailed.  Apparently the school had published average salary data, but only for those students who actually got jobs.  The court wrote:

….even though Plaintiffs did not know the truth of how many graduates were used to calculate the average salary, at the very least, it is clear that the Employment Report has competing representations of truth. With red flags waiving and cautionary bells ringing, an ordinary prudent person would not have relied on the statistics to decide to spend $100,000 or more.

I do love legal language at times, and I was fairly amused by the phrase “competing representations of truth”. While in this case it was clear cut what information would have been most useful to the consumer, it’s often unclear what statistical breakdown represents “actual reality” and such.  I did think that perhaps the court was giving the public too much credit though, when it cited what an “ordinary prudent” person would do (or is it just that not many prudent people exist?).
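
Just to illustrate how much the “employed graduates only” framing can matter, here’s a quick sketch with entirely invented numbers (nothing here reflects the school’s actual data):

```python
from statistics import mean

# Invented numbers: 100 graduates, 60 with reported salaries, 40 with none.
employed_salaries = [160_000] * 10 + [80_000] * 20 + [55_000] * 30
unreported = [0] * 40  # treating "no salary reported" as zero, purely for illustration

brochure_average = mean(employed_salaries)                # employed grads only
all_grads_average = mean(employed_salaries + unreported)  # every graduate

print(f"Average salary, employed grads only: ${brochure_average:,.0f}")
print(f"Average across all graduates:        ${all_grads_average:,.0f}")
```

The brochure number isn’t exactly false, it just answers a different question than the one a prospective student spending $100,000 is asking.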

I’ve been reading Tom Naughton’s blog quite a bit lately, and he often quotes his college physics professor’s advice to all of his students.  It’s a good quote, one that I think should be taught to all students freshman year of high school.  In fact, it should have been used in this court decision:  “Learn math.  Math is how you know when they’re lying to you.”

Why do women quit science?

A week ago, I got forwarded this NPR article called “How Stereotypes Can Drive Women To Quit Science”.  It was sent to me by a friend from undergrad, female, successful, with both an undergrad and a grad degree in engineering.  She found it frustrating, and so do I.

Essentially, the article is about a study that tracked female science professors’ discussions at work (using some very cool/mildly creepy in-ear recording devices), and came to the conclusion that women left science fields not because they were being overtly discriminated against, but because they were scared that they might be.  This panic is apparently called “stereotype threat”, and is explained thusly:

When there’s a stereotype in the air and people are worried they might confirm the stereotype by performing poorly, their fears can inadvertently make the stereotype become self-fulfilling.

I figure this is why I only routinely make typos when someone is watching me type (interestingly, I made two just trying to get through that sentence).

Anyway, the smoking gun (NPR’s words, not mine) was that:

When male scientists talked to other scientists about their research, it energized them. But it was a different story for women. “For women, the pattern was just the opposite, specifically in their conversations with male colleagues,” Schmader said. “So the more women in their conversations with male colleagues were talking about research, the more disengaged they reported being in their work.” Disengagement predicts that someone is at risk of dropping out. There was another sign of trouble. When female scientists talked to other female scientists, they sounded perfectly competent. But when they talked to male colleagues, Mehl and Schmader found that they sounded less competent.

The interpretation of  this data was curious to me. I wasn’t sure that social identity threat was the first theory I’d jump to, but I figured I’d read the study first.

It took me a little while to find the full paper free online, but I did it.  I got a little hung up on the conversation recording device part (seriously, it’s a very cool way of doing things….they record for 50 seconds every 9 minutes for the work day to eliminate the bias of how people recall conversations….read more here).

Here are the basics:  The sample size was 19 female faculty from the same research university.  Each was then “matched” with a male faculty member for comparison.  I couldn’t find the ages for the men, but they were matched on rank and department.  It appears the 19 women were out of 32 possibilities.  I’m unclear whether the remainder were unavailable or whether they declined.  The two genders did not differ in their levels of disengagement at the beginning of the study.

Unfortunately, they didn’t elaborate much on one thing I had a lot of questions about: how they defined competence.  They only stated that two different research assistants ranked it.  Since all of the researchers in this study were social psychologists, presumably so were their assistants.  It concerned me a bit that science faculty were being rated by people who wouldn’t know actual competence, merely the appearance of it (the study authors admit this is a weakness).

Another interesting point is that job disengagement was only measured up front.  When I had initially read the report on the study, I had inferred that they were taking data post-conversation to see the change.  They weren’t.  They took it up front, then found that the more disengaged women had a higher percentage of total discussions about work with men than the other women did.  It occurred to me that this could merely be a sign of “female auto-pilot mode”.  Perhaps when women are at ease they share more about their personal life?  The researchers admit this as a possibility, but say it’s not likely given that they sound less competent….as assessed by people who didn’t know what competence sounded like.

One point not addressed at all in this study was the seniority of the people the participants were talking to.  In traditionally male-dominated fields, it is likely that at least some of the men they ran into were the chairs of the department, etc., meaning that these women were probably talking to a more intimidating group of men than of women.  Women who talk heavily about research and less about personal lives may have run into more senior faculty more often.  As the study took place over 3 days, it could conceivably be skewed by who people ran into.  Additionally, I was wondering about the presence of mentoring and/or women-in-science type groups.  Women in science frequently meet other women in science through these groups, and there could have been some skewing of the up-front data there.

It’s also important to note that for every measure of disengagement in the study, the results were between 1.5 and 2 (on a scale of 1 to 5).  While statistically significant, I do wonder about the practical significance of these numbers.  If asked whether you agree or disagree with the statement “I often feel I am going through the motions at work”, how accurately could you answer, on a scale of 1 to 5?

Overall this study seemed very chicken and egg to me.  I’m not convinced that it’s implausible that women simply share more of themselves at work, especially when they’re comfortable, as opposed to the sharing itself making women more comfortable at work (there’s nothing worse at work than an awkward overshare).   I’m still not sure I get where you’d extrapolate stereotype threat unless it was the explanation you’d already picked…..I did not see any data that would point to it independently.

I’d like to see a follow-up study in ten years to see if these women did actually drop out at higher rates than their male colleagues, and what their stated reason for leaving was.  Without that piece, any conclusions seem incredibly hypothetical to me.  One of the things that drives me a bit bonkers when discussing STEM careers is that very few people seem interested in what the women actually in these careers think about why they choose what they do or do not do.  I’ve never seen a study that walked into an English class and asked all the women why they weren’t engineers.  Likewise, if more of these women quit than the men, I’d like to see why they said they did it.  Then perhaps we can get into the weeds, but won’t somebody tell me why women actually think they’re quitting?

I looked through the references section and couldn’t find a paper that addressed this question.

Anyway, I think it’s important to remember that when reading a study like this, you have to agree with all the steps before you can agree with the conclusions.  Is measuring snippets of conversations and having them coded by research assistants a valid method of determining how women function in the workplace?  Is 19 people a big enough sample size?  Should level of disengagement at work be controlled for possible outside events that might be causing them to feel less engaged?  Should the women in question be asked if they felt stereotype threat, or is that irrelevant?

Most importantly, should NPR have clarified that when they said “stereotypes can drive women out of science” they meant “theoretical stereotypes that may or may not have been there and that women may or may not have been afraid of….and none of these women had actually quit, we just think they might”?  You know, just hypothetically speaking.

What is STEM anyway?

I’ve been trying to work on a post about some further research on women in STEM fields, and I keep getting bogged down in definitions.  I am currently headed down the rabbit hole of what a “STEM job” actually is.

I found out some interesting things.  According to this report, my job doesn’t count as a STEM job, despite the fact that I work with nothing but math and science (alright, and some psych).  It’s not the psych part that excludes me, however; it’s actually that I work in healthcare.  Healthcare, apparently, is excluded completely.

So if I were performing my same job, with the same qualifications, in a different field, I’d have a STEM job.  Since I report to a hospital, however, I don’t have one.

Your doctor does not have a STEM job.  Neither does your pharmacist, dentist, nurse, or anyone who teaches anything on any level.  Apparently if you run stats for the Red Sox, you’re in a STEM job, but do the same thing for sick people, and it doesn’t count.

Fascinating.

Deadliest weapons and causes of death

There’s an apocryphal story in the international public health sphere about the time someone tried to figure out total mortality in Africa in any given year.  Apparently they went through the newsletters/press releases of  charities dedicated to various diseases, and found that if you added all the “x number of people die every year” numbers up, everyone in Africa died every year.  Twice.

While there’s likely some data inflation there, the other explanation is that it’s really hard to classify causes of death (I’ve covered some of this before).  Even with infectious disease, this can be tricky.  If an HIV-positive person contracts tuberculosis and dies, do they go under HIV mortality, or tuberculosis?  If malnutrition leaves one susceptible to other infections, what’s the real cause of death?  How about a bad water supply that carries ringworm?

I bring this up because I saw a fascinating stat today over at the New Yorker (via Farnam St):

What Is The Most Effective Killing Machine Man Has Ever Seen? Mosquitoes.

There has never been a more effective killing machine. Researchers estimate that mosquitoes have been responsible for half the deaths in human history. Malaria accounts for much of the mortality, but mosquitoes also transmit scores of other potentially fatal infections, including yellow fever, dengue fever, chikungunya, lymphatic filariasis, Rift Valley fever, West Nile fever, and several types of encephalitis. Despite our technical sophistication, mosquitoes pose a greater risk to a larger number of people today than ever before. Like most other pathogens, the viruses and parasites borne by mosquitoes evolve rapidly to resist pesticides and drugs.

via “The Mosquito Solution,” ($$$) The New Yorker, July 9 & 16, 2012, p. 40

Definitely made me a bit nervous, especially since it seems malaria, etc would actually be some of the more accurately counted causes of death.  So, um, take care of yourselves this summer, okay?

Political Arithmetic – Voter ID laws

Update: Link fixed

Last week I put up a post slamming an infographic on fair market rent between states.  I was interested in AVI’s response, which ended with “These are advocacy numbers.  Not the same as actual reality.”

Advocacy and other political skewing of data is one of those things that shouldn’t bother me, but does.

I read headlines, knowing that I’m going to be driven nuts by the presumptions and projections, and yet I read things anyway.  It’s a bad habit.

All that being said, I truly enjoyed Nate Silver’s examination of the real effect voter ID laws might have on voter turnout in various states.

He attempts to cut through all the partisan hoopla and to do a one-person point-counterpoint.  An example:

But some implied that Democratic-leaning voting groups, especially African-Americans and Hispanics, were more likely to be affected. Others found that educational attainment was the key variable in predicting whom these laws might disenfranchise, with race being of secondary importance. If that’s true, some white voters without college degrees could also be affected, and they tend to vote Republican.

He also makes a fascinating point about the cult of statistical significance:

Statistical significance, however, is a funny concept. It has mostly to do with the volume of data that you have, and the sampling error that this introduces. Effects that may be of little practical significance can be statistically significant if you have tons and tons of data. Conversely, findings that have some substantive, real-world impact may not be deemed statistically significant, if the data is sparse or noisy.
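
Silver’s point is easy to demonstrate with a toy example: hold a tiny effect fixed and just crank up the sample size.  The numbers below are made up (a half-point turnout difference, nothing from his actual analysis), run through a standard two-proportion z-test:

```python
import math

def two_proportion_z(p1, p2, n1, n2):
    """Z statistic for the difference between two independent sample proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Hypothetical turnout rates: 60.0% vs 59.5% -- a half-point difference.
p1, p2 = 0.600, 0.595

for n in (1_000, 100_000, 10_000_000):
    z = two_proportion_z(p1, p2, n, n)
    verdict = "significant" if abs(z) > 1.96 else "not significant"
    print(f"n = {n:>10,} per group -> z = {z:6.2f} ({verdict} at the 5% level)")
```

Same half-point gap every time; the only thing that changes the verdict is how much data you have.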

On the whole, he concludes the effect will swing in the Republican direction for this election, but reminds everyone:

One last thing to consider: although I do think these laws will have some detrimental effect on Democratic turnout, it is unlikely to be as large as some Democrats fear or as some news media reports imply — and they can also serve as a rallying point for the party bases. So although the direct effects of these laws are likely negative for Democrats, it wouldn’t take that much in terms of increased base voter engagement — and increased voter conscientiousness about their registration status — to mitigate them. 

The whole article is long but a great read about how to assess policy changes if you’re trying to get to the truth, rather than just prove a political point.

Moral obligations and Lazy Truth

I was going to include this in a Friday link post, but I really felt it deserved its own spotlight.

There’s a new gmail gadget called “Lazy Truth” that promises to send you a fact check email every time you receive a (forwarded) email it deems to be of dubious content.

I haven’t tried it, so I’m not sure what it’s set up to flag, or how accurate the “fact check” email is, but I was immediately intrigued.  I’ve actually been working on a much longer post that covers just this topic, so it’s something I’ve been giving a lot of thought.

I’ve been mulling over the rise of Facebook/email/Twitter lately, and wondering…..for those of us who value our integrity and our truthfulness, and do not believe ends justify means, what exactly are the moral implications of hitting forward or share on information that we could have easily proven to be false if we’d checked?

I was wondering if I was the only one worried about this, when I came across a blog post from Dr. Michael Eades.  He’s a pro-low-carb physician who spends much of his time critiquing nutritional research.  In a post about the book “The China Study”, he describes finding what he considered a great critique of it on another person’s blog.  Then this:

…. I had fallen victim to the confirmation bias.  My bias was that Dr. Campbell was wrong, so I was more than happy to uncritically accept evidence confirming his error without lifting a finger to double check said evidence myself.  I knew that if a blogger somewhere had come out with a long post describing an analysis of the China study demonstrating the validity of all of Dr. Campbell’s notions of the superiority of the plant-based diet, I would’ve been all over it looking for analytical errors.  But since Ms. Minger’s work accorded with my own beliefs, my confirmation bias ensured that I accepted it at face value. 

Once the fact that I had succumbed to my confirmation bias settled in around me, I became suffused with angst.  I had tweeted and retweeted Ms. Minger’s analysis a number of times, giving the impression that I had at least minimally checked it out and had approved it.  I had emailed it to a number of people, many of whom, I’m sure, had forwarded it on.  I’m sure I played a fairly large role in the rapid dissemination of the anti Campbell/China study info.

In the end, he went back and realized that the post was good, but his panic attack was intriguing to me.  How many of us have had this same panic?  How many of us should have?  How many lousy graphs rip through Facebook like wildfire because no one bothers to double check if they’re even valid?  Is the liar the person who created the graph, or do those who share it share some blame?

I don’t pretend I have an answer for this.  I feel most of the people interested enough to read this blog probably do not fall in the category of those who would easily share skewed information without thinking about it, but I am hoping for some thoughts/feedback from you all.

Are we so used to hearing politicians of all stripes seamlessly repeat bad data that we’ve come to view it as acceptable?  Is this just a fact of life?  Is it possible that we will be saved by widgets like the one above?   Does religion matter, or is this an overall moral issue? Does confrontation work with this sort of thing?  Or is this something I just have to learn to live with?