4 Ways Statistics Can Be Misused – and How to Avoid Them
In a quote often misattributed to George Bernard Shaw, Bertrand Russell once said that an ideal character might “go so far as to enable a man to be moved emotionally by statistics.”
It’s easy to see why this degree of sympathy might not be accessible to most of us. Another famous quotation runs, “a single death is a tragedy; a million deaths is a statistic”; it’s usually attributed to Stalin but is probably apocryphal. Wherever it comes from, it gets requoted endlessly because it is both shocking – a million deaths certainly ought to be viewed as a tragedy – and has a core of truth. Humans evolved in small groups; past a certain size, the difference between large numbers is very hard for us to conceptualise. A billion is a thousand times larger than a million, but as every newspaper subeditor knows, we get them confused all the time.
It’s easy enough to find shocking statistics – even heartbreaking ones. 80% of those dying of malaria in Africa are children. 10 million children die before their 5th birthday. More refugees died at sea trying to make it to Europe in January 2016 than in the two preceding Januaries put together. But would it alter your perception significantly if you were told that it was 8 million children who die before their 5th birthday, or 15 million? Many statistics, no matter how affecting, don’t engage us closely with the issue at hand. That’s why they’re so easily misused or just plain made up. Here’s our guide to using statistics more wisely, and the errors to avoid.
1. “He uses statistics as a drunken man uses a lamppost – for support, not illumination.”
Another quote of dubious origin demonstrates the biggest sin in use of statistics. You have a point to make but not much by way of reasons to back it up? No problem; use some statistics.
The TV series Yes, Prime Minister beautifully demonstrated how this can be done, in an episode where a poll is released showing that a majority of the public favour reintroducing conscription. Sir Humphrey (a senior civil servant) asks his colleague, Bernard, to commission a poll showing the opposite – and then shows how this is done. He asks Bernard a series of questions (“Are you worried about the rise in crime among teenagers?”; “Do you think young people welcome some authority and leadership in their lives?” and so on) to get Bernard to agree that conscription should be reintroduced, and then another series (“Do you think there is a danger in giving young people guns and teaching them how to kill?”; “Do you think it is wrong to force people to take up arms against their will?”) to get him to agree to the exact opposite. Bernard, he concludes, is “the perfect balanced sample.”
It can be tricky to know how to support a point otherwise. After all, you’ve probably heard that the plural of anecdote is not data (in other words, a collection of anecdotes doesn’t add up to reliable evidence). And while it’s fun to argue on the basis of thought experiments and deduction, at some point you will probably want some concrete evidence. And so you turn to statistics. Make sure the statistics you choose don’t break any of the other rules on this list and you’ll be off to a good start.
2. Sample sizes matter
100% of people writing this article are hungry.
There is, however, only one person writing this article, so this is not a useful statistical guide to the general hunger levels of Oxford Royale Summer Schools writers. This much seems obvious. But it doesn’t stop people from coming up with statistics based on all kinds of absurd sample sizes. For instance, it was gleefully pointed out on Twitter that of all the trolls imprisoned for the harassment of feminist campaigner Caroline Criado-Perez, 50% were female. Evidence, then, that Criado-Perez’s views were opposed by women just as much as by men; that the harassment she faced couldn’t really be down to sexism; that in fact, it was probably altogether justified. The only problem? A grand total of two people were imprisoned for harassing Criado-Perez. One of them was female.
The Criado-Perez example is an extreme one, but this crops up repeatedly. Think of the shampoo adverts where, of 50 women surveyed, 70% reported that their hair had increased volume and shine. That’s just 35 women – not very many at all!
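A quick back-of-the-envelope check shows just how fuzzy a sample of 50 really is. Here is a sketch in Python using the standard 95% margin-of-error formula for a proportion (the shampoo figures are the ones quoted above):

```python
import math

def margin_of_error(p, n, z=1.96):
    """95% margin of error for a proportion p observed in a sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# 70% of 50 women surveyed: the margin of error is nearly 13 points,
# so the "true" figure could plausibly sit anywhere from the high
# fifties to the low eighties.
moe = margin_of_error(0.70, 50)
print(f"70% +/- {moe:.1%}")
```

In other words, that headline “70%” is far less precise than it sounds – a detail the advert is unlikely to mention.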
Not just the size but the type of sample matters; polling companies failed utterly to predict a Conservative majority in the 2015 British General Election, which has been put down partly to samples that over-represented Labour voters. Conservative voters are likely to be busier (so not at home to answer the door for door-to-door polling) and less likely to be active online (so not responding to online polls). Not accounting for that difference when weighting scores can have pretty extreme consequences. Unweighted polling currently shows Labour and the Conservatives neck-and-neck. Allow for the need to weight the data, and the Conservatives end up with a lead of seven or eight percentage points.
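To see how weighting can swing a result, here is a minimal sketch in Python with entirely illustrative numbers – the response counts and weights below are invented for this example, not real polling data:

```python
# Hypothetical raw poll: an online sample that over-represents one
# party's voters. Unweighted, the two parties look neck-and-neck.
raw_responses = {"Labour": 500, "Conservative": 500}

# Illustrative weights: suppose turnout modelling says each
# under-sampled Conservative response should count a little more,
# and each over-sampled Labour response a little less.
weights = {"Labour": 0.93, "Conservative": 1.08}

weighted = {party: n * weights[party] for party, n in raw_responses.items()}
total = sum(weighted.values())
shares = {party: 100 * v / total for party, v in weighted.items()}
lead = shares["Conservative"] - shares["Labour"]
print({p: round(s, 1) for p, s in shares.items()}, f"lead: {lead:.1f} points")
```

With these made-up weights, an apparent dead heat turns into a lead of about seven and a half points – the same order of change described above.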
One of the worst offenders here is psychology research. Psychology students are required to carry out a lot of research for their university courses, and they also get credit for taking part in research. This can result in research that consists of psychology undergraduates studying other psychology undergraduates – an excellent way of learning about the psychology of psychology undergraduates, but terrible when generalised to the wider population. It’s worth noting that this is not the fault of the undergraduates themselves, but of anyone who comes across this data and decides to report it as if it applied to the world as a whole.
3. Bells and degenerates
These are types of distributions, not an album title (sadly). A bell curve usually refers to a normal distribution, which is the easiest kind for most non-statisticians to imagine. Think of a class of students of roughly similar ability. They take a test and the average score out of ten is five. Most of them get 5 out of 10, some get 4 or 6 out of 10, and a couple get 3 or 7 out of 10. If you were to plot their scores on a graph and draw a line through it, the line would form the shape of a bell. Hence, bell curve.
One of the problems that you might encounter is that when you say – for instance – “the class average was five out of ten”, the natural assumption is a bell curve with results like the ones described above. But that could also be the case if about half the class scored 0 or 1 out of 10, and the other half got 9 out of 10 or full marks – an inverted bell that would probably not be what you originally pictured.
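The point is easy to demonstrate: two sets of scores can share exactly the same average while looking nothing alike. A short Python sketch with invented scores:

```python
from statistics import mean, stdev

# Two classes of ten students, both averaging 5 out of 10...
bell = [3, 4, 4, 5, 5, 5, 5, 6, 6, 7]        # clustered around the middle
inverted = [0, 0, 1, 1, 1, 9, 9, 9, 10, 10]  # split between the extremes

assert mean(bell) == mean(inverted) == 5

# ...but the spread tells them apart: the standard deviation of the
# split class is roughly four times that of the bell-shaped one.
print(f"bell:     mean {mean(bell)}, sd {stdev(bell):.2f}")
print(f"inverted: mean {mean(inverted)}, sd {stdev(inverted):.2f}")
```

This is why a lone average is never the whole story; some measure of spread, like the standard deviation, is needed to distinguish the bell from its inverse.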
Similarly, if you were told that you were going to be giving a speech to a group of people with an average age of 60, you might assume you’d be addressing a group of older people and tailor your speech accordingly. You might then be surprised to see a group of people with ages ranging from toddlers to very elderly people, with, say, a large contingent of teenagers. The average age could still be 60, while the range could be much greater than you’d expect.
It’s also worth thinking about the different types of average. Normally, when we say “average”, we mean the mean – all the numbers added together, divided by the total number of numbers. So we might have a group of people at a knitting circle who are aged 55, 67, 82, 89, 13, 13, 13, 56 and 0.5. The mean of their ages would be 43 – an age in a decade that no one represents. The modal average age – the most common age – is 13, which doesn’t seem right either. The median average is worked out by putting all the numbers in order and finding the middle one, so here it would be 0.5, 13, 13, 13, 55, 56, 67, 82 and 89 – and the middle age is 55, which does feel more appropriately representative of the group as a whole. But ultimately, if you were going to describe the knitting group, you’d probably say something like, “it’s mostly older people, a few teenagers, and one person sometimes brings a baby” – an arrangement that refuses to fit into a bell curve and that statistics struggle to describe.
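Python’s standard library will happily compute all three averages for the knitting circle, using the ages given above:

```python
from statistics import mean, median, mode

ages = [55, 67, 82, 89, 13, 13, 13, 56, 0.5]

print(f"mean:   {mean(ages):.1f}")  # about 43 - an age no one in the group is near
print(f"mode:   {mode(ages)}")      # 13 - the most common age
print(f"median: {median(ages)}")    # 55 - the middle value when sorted
```

Three different “averages” – roughly 43, 13 and 55 – each telling a different story about the same nine people.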
4. Statistics are like a red, red rose
The stock market has been turbulent lately and you might well have seen headlines like “Dow Jones plunges 276 points”. If you’re a regular reader of the business news and you know how much variation there usually is in the number of points up or down that the Dow Jones closes, then this is a useful figure for you – at least, it enables you to know just how much you need to worry.
However, for many of us this kind of statistic is utterly useless because we have no baseline to compare it to. You end up second guessing what sounds like a big number and can end up in a considerable muddle. Think about a tabloid headline – unemployment at such-and-such a percentage, literacy rates at such-and-such a percentage, this number of violent crimes per year – all of which could sound very alarming and make you conclude that everything is going downhill and the apocalypse is nigh.
This is unless you know that unemployment is at a 10-year low, literacy is higher than ever and the rise in violent crime is not because more crimes are being committed but because far more violent crimes are being reported; in other words, there are some crimes that people previously hadn’t been reporting to the police – perhaps because they thought that no action would be taken or because they didn’t think it was serious enough to bother – that they now are telling the police about. That doesn’t seem like a country in chaos. But you have to have some context, some point of comparison, to be able to tell.
Worse than no comparison, however, is a bad comparison; at least having no comparison might encourage the reader to seek out their own. You can witness this any time comparisons between countries are made that are not on a per capita basis. So for instance, Germany has been lauded for its promise to take in 1.5 million refugees. Lebanon has taken in about 1 million refugees. On that basis, Germany’s commitment looks more impressive; however, Germany has a population of 80 million to Lebanon’s 5 million, and a GDP of $3.7 trillion to Lebanon’s $44 billion.
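Putting those figures on a per-capita basis takes one line of arithmetic, using the approximate population and refugee numbers quoted above:

```python
# Figures as quoted in the text (approximate).
countries = {
    # name: (refugees taken in, population)
    "Germany": (1_500_000, 80_000_000),
    "Lebanon": (1_000_000, 5_000_000),
}

per_thousand = {
    name: 1000 * refugees / population
    for name, (refugees, population) in countries.items()
}

for name, rate in per_thousand.items():
    print(f"{name}: {rate:.0f} refugees per 1,000 residents")
```

Lebanon has taken in roughly 200 refugees per 1,000 residents to Germany’s roughly 19 – more than ten times as many, relative to population.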
Or take Richard Dawkins’ controversial tweet that Trinity College, Cambridge has won more Nobel prizes than “all the world’s Muslims”. He defended himself by saying he was just pointing out a fact. But Trinity College, Cambridge graduates and academic staff have also won more Nobel prizes than Japan, Russia or the continent of Oceania – it is not that the world’s Muslims are punching significantly below their weight in terms of Nobel prizes, but that Trinity College is peerless in its success.
On the topic of Nobel prizes, there’s a strong positive correlation between the number of Nobel prizes a country’s people have earned and the quantity of chocolate they eat annually. It’s an obvious point, but one that cannot be made often enough: this does not show that eating more chocolate will earn you a Nobel prize, nor that being surrounded by a greater number of Nobel prize winners gives you more of a hankering for chocolate. Correlation does not imply causation. Here, the countries that eat the most chocolate are the ones near where the Nobel committee is based (Sweden, Denmark, Norway), wealthier countries where chocolate is inexpensive and easy to get hold of, or Western countries generally, where chocolate is a favourite treat – and wealthier countries have more money to invest in education and research, resulting in more Nobel prizes. Even this tenuous connection between chocolate-eating and Nobel prizes is stronger than the connection between some correlated values. This page has a selection of statistics that show a correlation entirely by chance.
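You can watch a hidden common cause manufacture a correlation in a tiny simulation. The data below is entirely synthetic – “wealth” stands in for whatever hidden factor drives both variables:

```python
import random
import statistics

random.seed(42)

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (statistics.pstdev(xs) * statistics.pstdev(ys) * len(xs))

# A hidden confounder (wealth) drives both variables; neither causes
# the other, yet they end up strongly correlated.
wealth = [random.gauss(0, 1) for _ in range(200)]
chocolate = [w + random.gauss(0, 0.5) for w in wealth]
nobels = [w + random.gauss(0, 0.5) for w in wealth]

print(f"corr(chocolate, nobels) = {pearson(chocolate, nobels):.2f}")
```

In this model chocolate never influences Nobel prizes, and vice versa – yet the two correlate strongly, simply because both track the hidden variable.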
Have you come across any hilarious example of the misuse of statistics? Let us know in the comments!