I love me a good infographic. But today it seems like we’re awash in pretty-looking infographics that lack statistical literacy and/or don’t attempt to convey it. A recent blog post in Atlantic Monthly reminded me of this issue, since infographics are now being used by internet marketers to gain links to their sites.
Infographics are great because visual perception is the fastest way for people to absorb information. Infographics allow people to absorb large amounts of quantitative data much faster than reading text or browsing through data tables.
But there’s a downside to this: most of the time, conclusions from data aren’t black and white. Data interpretation is littered with caveats, estimates, and built-in assumptions. Some are significant, some less so. But to know which is which from an infographic, you must either have domain knowledge of the topic or trust that the infographic creator is statistically literate.
Graphics are Cool, but Don’t Forget the Data Data Data!
Matt Cutts, Google’s guru on search quality, complains about this effect, particularly on the web, where an attractive infographic can quickly go viral or be used to create links (important for bumping up a website’s search engine rankings):
“In principle, there’s nothing wrong with the concept of an infographic. What concerns me is the types of things that people are doing with them. They get far off topic, or the fact checking is really poor. The infographic may be neat, but if the information it’s based on is simply wrong, then it’s misleading people.”
Malcolm Gladwell-Edward Tufte Tug of War
One way to view this is as a Malcolm Gladwell-Edward Tufte tug of war. One side sees infographics as great ways to tell stories without needing to understand the numbers. The other side takes statistics as the starting point and focuses on making better-looking visualizations. A good infographic creator is pulled in both directions and has to find a good compromise between representing statistical nuance and creating an easy-to-follow narrative.
Malcolm Gladwell loves to make sweeping generalizations from obscure data, whether they’re statistically valid or not (see the Steven Pinker vs. Malcolm Gladwell intellectual smackfest from several years ago to read about this). He comes from the perspective that people don’t understand or care to understand statistics, and that his job is to be a storyteller who creates an intriguing narrative from whatever interesting numbers he finds, no matter how obscure.
Edward Tufte, on the other hand, comes from academia and views his job as teaching people how to create good graphics from statistical data. His built-in assumption is that whoever creates the graphics already understands statistics but wants to learn how to distill conclusions into better graphics.
If you’ve read any of Tufte’s books or attended any of his lectures, you’ll notice he doesn’t talk about statistics much if at all. He doesn’t want to be your math teacher; he thinks you already know math but need to learn how to convert numerical findings into beautiful graphics.
Unfortunately, in the real world this is rarely the case. People latch on to Gladwell’s “findings” because they’re fascinating. They make great stories, but not particularly good science.
How Much Math Knowledge Do You Need Before Creating Infographics?
Which raises the question: how much math and statistics background do you need before I think you should be allowed to create infographics?
Maybe surprisingly, I don’t think you need to have a very strong math background. Even actual statisticians and full-time data analysts don’t know every last detail behind every data set they use. For example, tons of people use Census data, but few understand the nuances of survey sampling (I tried to read the Census documents on this; not recommended!). But we trust that the Census has done it correctly, and it is transparent about which sources are mostly estimates and which ones are based almost entirely on actual responses.
All you really need for creating statistically informed infographics are two things:
- basic math knowledge (algebra and high school statistics count)
- the curiosity to want to understand where the numbers you use come from.
Curiosity is the key word here. Understanding any sort of data takes time. To absorb this data well enough to interpret it in an intellectually honest way takes even more time.
A Closing Example: Richard Florida On Gun Control
Take a look at this map of firearm deaths that Richard Florida uses to argue that firearm deaths are “significantly lower in states with stricter gun control legislation.”
At a glance, it looks pretty obvious that states with loose gun laws, like Texas, have a much higher number of firearm deaths relative to population.
But dig into his numbers and you find that the deaths include all deaths from guns: suicides, murders, and accidental shootings. Should we be including all of these when deciding whether to ban handguns in order to reduce handgun crime? Suicides can be committed by multiple methods, so these might not be reasonable to include. Accidental deaths might not be good to include either: by this logic, we might want to say that cars should be banned because of the number of self-inflicted crashes that occur, and private planes banned because the rate of pilot-induced crashes is so much higher than in commercial planes.
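To make the point concrete, here is a minimal Python sketch of the kind of check I mean: split the total into causes and look at each rate per 100,000 people. Everything in it is an assumption for illustration. "State A" and "State B" are made-up states, the counts are invented, and none of it comes from Richard Florida’s map or any real data source.

```python
# Illustration only: two MADE-UP states with INVENTED counts (not real data).
deaths = {
    "State A": {"population": 5_000_000, "homicide": 150, "suicide": 500, "accident": 25},
    "State B": {"population": 8_000_000, "homicide": 400, "suicide": 320, "accident": 24},
}


def rate_per_100k(count, population):
    """Convert a raw count into a rate per 100,000 residents."""
    return count * 100_000 / population


def rates_by_cause(record):
    """Return the per-100k rate for each cause, plus the combined rate."""
    population = record["population"]
    causes = {name: n for name, n in record.items() if name != "population"}
    rates = {name: rate_per_100k(n, population) for name, n in causes.items()}
    rates["all causes"] = rate_per_100k(sum(causes.values()), population)
    return rates


if __name__ == "__main__":
    for state, record in deaths.items():
        print(state)
        for cause, rate in rates_by_cause(record).items():
            print(f"  {cause}: {rate:.1f} per 100k")
```

With these invented numbers, State A has the higher overall firearm death rate but the lower homicide rate, so a map colored by the combined figure and a map colored by homicides alone would tell opposite stories. Real data may or may not behave this way. The point is only that you have to look before you draw the map.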
What do you think about the popularity of infographics today? Is it driving more people to become interested in math and statistics, or do you think it’s actually hurting our desire to understand these? I’d love to hear your comments.