Statistical Skepticism and Uncautious Reporting
This post isn’t about hockey statistics (which is likely how most of you reading this know about me) but I do think it’s relevant to that topic. I also think it’s relevant to reporting on statistics and academic work in general. As you probably know if you follow me on Twitter or read my posts at Pension Plan Puppets, I like data. I think it’s important to try to gain an understanding of the topics that we talk about that isn’t tainted by selective memory or other personal biases, and statistics can be a huge help in that regard. But in addition to being skeptical about our subjective memory, I also think it’s important to be skeptical of supposedly objective evidence like statistics. Data can be a useful tool in helping to understand our world, but it ought to always pass a simple test first, which is “Does the result make sense?” If the result doesn’t make sense, then we need to try to understand what’s going on beneath the numbers before we start discussing them as though they’re the truth.
A study on the prevalence of rape in some parts of Asia was published in one of the world’s most prestigious medical journals, The Lancet, this week. The study claims to show that 24% of men in the studies examined are rapists, a figure that climbs to a remarkable 59% in Papua New Guinea! This study has been widely reported in virtually all mainstream media outlets, such as The Washington Post, CBS, Bloomberg, the CBC, and Foreign Policy. No media outlet that I could find showed any skepticism about any of the things that I’m about to discuss.
[The statistics mentioned in the next section were all taken from Table 2 in the Lancet study, this NY Times infographic on world prison populations, or the CIA World Factbook or calculated myself from numbers provided by those sources.]
One stat jumped out at me in one piece that I read on this study: 23% of the men surveyed who claimed to have committed a rape also claimed to have served time in prison for it. If true, that would be astounding; if 1/4 of men committed a rape and 1/4 of them went to prison, that would mean one out of every 16 men in the countries surveyed had been convicted of rape in a court of law and sentenced to prison for it!
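The back-of-the-envelope arithmetic above can be checked in a few lines. This is only a sketch: it uses the 24% and 23% figures quoted in this post and assumes they can simply be multiplied across the pooled sample. The variable names are my own and do not come from the study.

```python
# Figures quoted in this post (pooled across the surveyed sites).
share_reporting_rape = 0.24       # men who reported having committed rape
share_of_those_imprisoned = 0.23  # of those men, share reporting prison time for it

# Implied share of all surveyed men who served prison time for rape
# (assumption: the two rates can be multiplied directly).
implied_share = share_reporting_rape * share_of_those_imprisoned
one_in_n = 1 / implied_share

print(f"Implied share of all men: {implied_share:.1%}")
print(f"Roughly one in {one_in_n:.0f} men")
```

With the unrounded percentages the result comes out closer to one in 18 than one in 16, which doesn’t change the argument.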
To show you just how preposterous that is, here’s a chart showing the male population and actual prison population in each country and the number of men who would have at some point served a prison sentence for rape if the rates of incarceration reported for each country in the Lancet study were correct:
One caveat here is that for simplicity I used the full male population in each country while the Lancet study only used males 18-49, but I don’t think that should change the overall picture tremendously.
This data presents a very unusual picture. While the figure for Sri Lanka may be plausible given that not all people who have ever served prison sentences are serving them right now, the other numbers are plainly absurd. Papua New Guinea has only 4,000 people currently in prison and yet 2 million men from that country – nearly one out of every three men! – have served time in prison for having committed rape if the Lancet’s numbers are to be believed. The numbers are implausible for other countries too, even though none of the others are as extreme.
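For anyone who wants to repeat this kind of sanity check, here is a rough sketch of the comparison. The function names are my own, and the one-million population in the first example is a made-up round number used only to show the calculation; it is not a figure from the study. The Papua New Guinea numbers are the ones quoted above.

```python
def implied_ever_imprisoned(male_population, rape_rate, imprisoned_rate):
    """Men who would have served prison time for rape if the survey rates held."""
    return male_population * rape_rate * imprisoned_rate


def implied_to_current_ratio(implied_count, current_prisoners):
    """How many times larger the implied count is than today's prison population."""
    return implied_count / current_prisoners


# Hypothetical illustration only: a made-up population of 1,000,000 men
# combined with the pooled 24% and 23% rates quoted in this post.
example_count = implied_ever_imprisoned(1_000_000, 0.24, 0.23)
print(f"Hypothetical example: {example_count:,.0f} men")

# Papua New Guinea, using the figures quoted in this post.
png_ratio = implied_to_current_ratio(2_000_000, 4_000)
print(f"PNG: implied count is {png_ratio:.0f} times the current prison population")
```

Keep in mind that this compares a lifetime count with a point-in-time count, so a ratio above one isn’t impossible on its own. It’s the size of the gap that matters.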
Now, if I had developed a survey that showed these kinds of results, I would be very skeptical about the reliability of any of my data. I can see only two possible scenarios here:
1. The survey’s respondents broadly cannot be trusted to have answered these questions accurately.
2. The survey’s respondents provided unreliable data only for the question on incarceration and thus the rest of the study is fine.
Which of these sounds more plausible? Can you think of an interpretation other than #2 that would leave the general reliability of the survey intact despite the plainly nonsense results of the question on incarceration?
One possible criticism of what I’ve done here is that I’ve not subjected the NY Times or the CIA’s data to the same degree of skepticism as the study from The Lancet. I think that the CIA’s data on basic statistics like population size are very reliable and they’re widely used by reputable sources, so I think those numbers are trustworthy. The NY Times does not list its data sources for its prison figures, so some degree of skepticism may be warranted on that end, and on top of that countries cannot necessarily be trusted to accurately report their prison populations. That being said, while the numbers may not be entirely accurate, the degree to which they would have to be off in order for my conclusion to change is so drastic that I think they’re acceptable for the purpose I’ve used them for.
I think I’ve demonstrated that this study is in general probably not very trustworthy. The responses to the question about incarceration are just so implausible that it seems to throw the entire thing into question. And yet the results were reported as fact by just about all of the major media organisations (in addition to a number of smaller ones).
Most of the articles on this topic link it to the tragic gang rape and murder of a young woman on a bus in India earlier this year, despite the fact that India was not among the countries surveyed. The statistic that 1/4 of all Asian men are rapists has now entered the collective knowledge and contributes to North Americans’ understanding of Asian culture in general (and will almost certainly be linked to India specifically) despite the fact that there’s very good reason to believe that it’s not accurate. Even the simplest of skepticism about the study, just a little bit of looking into its numbers, would easily have shown that it could not be trusted. And yet not one media organisation went to that effort, so now a lot of people believe something that probably isn’t true.