“How to Lie with Statistics”

Precise falsehoods exist for many reasons. Beware of the tricks.

Posted Nov 28, 2011

Here is one current example. Several weeks ago, Texas Governor Rick Perry made headlines when he pointed out that the affluent pay most of the Federal income taxes in this country while 47 percent of the earners pay none at all. He called this an "injustice". What's wrong with this statistic? First, it raises what should be the first question about any statistic: "Why would this be the case?" The answer is that our income taxes are progressive. Those who have more pay more. So the tax burden simply reflects the highly skewed income distribution in this country. In fact, the (famous) top 1 percent of income earners took home 24 percent of the total national income in 2010, while the top 10 percent took home 49 percent. In contrast, 47.3 percent of our workers earned less than $25,000, which is close to, or less than, the official poverty line of $22,343 for a family of four.

The other problem with Governor Perry's fine sense of justice is that he was engaging in a statistical charade. It was a prime example of statistical cherry picking. If you look only at Federal income taxes paid, you get a very different result than if you consider the total tax burden of our income earners, counting payroll taxes, sales taxes, excise taxes for gasoline, and the like. According to the Tax Policy Center at the Brookings Institution, in 2010 the top 1 percent paid a total of 30.8 percent of their income in various taxes, while the poorest 20 percent actually paid 16.3 percent, a lot more than "none at all."

Let's take a brief look at some of the many other kinds of statistical gamesmanship. One of the most common and well-known tricks could be called the magic of averages. Anyone with a basic knowledge of statistics knows that averages come in three different forms: mean, median, and mode. Thus, in the year 2000 the mean family income in the U.S. (total income divided by the number of families) was a respectable $45,000. But if we look at the median income (the point at which half of all families earn more and half earn less), it was a less impressive $33,000. And if we look at the mode, where the largest number of family incomes was concentrated, it was an anemic $22,000. Moreover, all of these averages mask the extreme distribution of wealth and poverty in the U.S. In 2006, the bottom quintile (20 percent of the households) earned less than $19,178. The top quintile earned more than $91,705. So anyone who uses the "average" family income in any form as a measure of our standard of living is gilding the nettle in order to make it look like a lily.
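The gap between the three averages is easy to reproduce. The short Python sketch below uses nine invented family incomes (hypothetical figures, chosen only so that the results match the three numbers quoted above; they are not real survey data) together with the standard library's statistics module.

```python
from statistics import mean, median, mode

# Hypothetical incomes for nine families, invented for illustration only.
# They were picked so the three averages match the figures in the text.
HYPOTHETICAL_INCOMES = [
    22_000, 22_000, 22_000, 28_000, 33_000,
    38_000, 45_000, 60_000, 135_000,
]


def three_averages(incomes):
    """Return the mean, median, and mode of a list of incomes."""
    return {
        "mean": mean(incomes),      # total income divided by number of families
        "median": median(incomes),  # half the families earn more, half earn less
        "mode": mode(incomes),      # the most frequently occurring income
    }


if __name__ == "__main__":
    for name, value in three_averages(HYPOTHETICAL_INCOMES).items():
        print(f"{name}: ${value:,.0f}")
```

In this made-up sample, one high earner pulls the mean well above the median, while the mode sits where most families cluster.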

Another common form of statistical subterfuge could be called creative graphing. Say you want to impress an audience with the rise in crime rates. In actuality, the overall rate may only have increased ten percent over the past 20 years. But if you compress the scale on the horizontal axis, so that all 20 years are squeezed into a short span, while stretching the scale on the vertical axis, so that it covers only ten increments of one percent, the result will be a line with a steep upward slope even though the average increase is still only one-half percent per year.
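The arithmetic behind this trick can be sketched in a few lines of Python. The crime index values (100 rising to 110) and the two axis ranges are assumptions made up for illustration, and the yearly figure is a simple, not compounded, average.

```python
def average_yearly_change(total_change_pct, years):
    """Simple (non-compounded) average change per year, in percent."""
    return total_change_pct / years


def share_of_chart_height(start, end, axis_min, axis_max):
    """Fraction of the vertical axis that the line climbs from start to end."""
    return (end - start) / (axis_max - axis_min)


if __name__ == "__main__":
    # Hypothetical crime index: 100 twenty years ago, 110 today (a 10% rise).
    start, end = 100.0, 110.0

    print(f"Average change: {average_yearly_change(10, 20)} percent per year")

    # Vertical axis starting at zero: the line barely rises.
    full_axis = share_of_chart_height(start, end, axis_min=0, axis_max=110)
    # Vertical axis stretched to show only 100 through 110: the line fills it.
    stretched_axis = share_of_chart_height(start, end, axis_min=100, axis_max=110)

    print(f"Axis from 0 to 110: line climbs {full_axis:.0%} of the chart height")
    print(f"Axis from 100 to 110: line climbs {stretched_axis:.0%} of the chart height")
```

The data are identical in both cases; only the choice of axis range changes how steep the line looks.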

Another kind of statistical legerdemain involves what I call crunching diversity. For decades, our public opinion polls have been asking a general question about the state of the nation, such as "Do you think the country is going in the right direction or the wrong direction, overall?" Or "Are you satisfied/dissatisfied with the way things are going in this country?" Currently, more than a dozen different polls show that the number of people who are satisfied versus dissatisfied is near the historic lows of 2008-2009. Here are some of the results: Rasmussen: 17-75%; CNN: 25-74%; Gallup: 12-86%; NBC/Wall Street Journal: 19-73%; CBS/New York Times: 21-74%; Time: 14-81%; Pew: 17-79%.

The country is obviously in a foul mood, but what does it mean? Partisans on both sides are prone to interpret the results in terms of their own political agendas, but a recent in-depth survey that asked people to name their gripes found great diversity. Somewhat surprisingly, fewer than one-quarter named unemployment and the economy as their main concern. Other complaints included the wars in Iraq and Afghanistan, taxes, government regulations, the deficit, climate warming, illegal immigration, the wealth gap, and so on. In other words, there is a long, bipartisan shopping list of dissatisfactions.

These are only a few examples of the many different kinds of statistical sleight of hand: small samples that are used to draw universal conclusions; estimates posing as facts; labels that shift their meaning over time (a recent example was a large company that claimed to have doubled its wages in a year when in fact it shifted its workers from part time to full time at the same wages); irrelevant averages (you don't build houses for families with 3.14 people); spin-doctored statistics; and, perhaps most nefarious, invented statistics (i.e., lies posing as hard facts). The social critic Ann Coulter, who is often loose with the facts, was caught in this famous example. In one of her books she claimed that President Ronald Reagan, despite various scandals during 1987, only saw a five-point drop in his approval rating, from 80% to 75%. Actually, it was a 16-point decline, from 63% to 47%, more than a trivial difference.

Finally, there is a class of bogus statistics that are simply a product of carelessness. One example, what could be called mindless extrapolation, was cited by sociologist Joel Best in his book, Damned Lies and Statistics. Best was startled to find this statistic in a reputable journal in 1995: "Every year since 1950, the number of children gunned down has doubled." How could this be? Best did a simple calculation. Even if there were only one child "gunned down" in 1950, 45 annual doublings would have pushed the number to about 35 trillion by 1995, truly an alarming statistic.
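The calculation is simple enough to check in a few lines of Python. The function name below is made up for illustration; the only inputs are the ones in the example above (one child in 1950, a doubling every year through 1995).

```python
def count_after_yearly_doubling(start_count, start_year, end_year):
    """Count reached if start_count doubles once every year until end_year."""
    doublings = end_year - start_year
    return start_count * 2 ** doublings


if __name__ == "__main__":
    total = count_after_yearly_doubling(start_count=1, start_year=1950, end_year=1995)
    print(f"Doublings: {1995 - 1950}")
    print(f"Children 'gunned down' in 1995 alone: {total:,}")
```

Forty-five doublings of a single case come to 35,184,372,088,832, or roughly 35 trillion.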

How can we defend ourselves against the statistical numbers game? Here are a few rules of thumb. Don't be too trusting. Always look at any gee-whiz statistic with skepticism. Ask who is the purveyor, and what are his or her (or its) motives? How were the numbers produced, and what do they really mean, or hide? If in doubt, try checking with one of the two internet lie-detector sites: PolitiFact.com or FactCheck.org. Or use a search engine like Google to check out the sources and methodologies, and (possibly) find other contrary evidence. It takes work, but there are no short-cuts in this mendacious age. The old saying "let the buyer beware" (caveat emptor) also applies to the daily onslaught of propaganda that may be masquerading as "information." It's important to be able to tell the difference.