And now, here’s something that would make Ed Tufte spin in his . . . ummm, Tufte’s still around, actually, so let’s just say I don’t think he’d like it!

Posted on December 13, 2013 11:16 AM by Andrew

We haven’t had one of these in awhile, having mostly switched to the “chess trivia” and “bad p-values” genres of blogging . . .

But I had to come back to the topic after receiving this note from Raghuveer Parthasarathy:

Here’s another bad graph you might like. It might (arguably) be even worse than the “worst graphs of the year” you’ve blogged about, since rather than being a poor representation of data, it is simply the plotting of a tautology that mistakenly gives the impression of being data. (And it’s in Nature.)

karlsson2013fig4a

Parthasarathy explains:

On the vertical axis we have the probability of being Type 2 Diabetic (T2D). On the horizontal axis we have the probability of being normal. There’s a clear, important trend evident, right? No! The probability of being normal is trivially one minus the probability of being T2D! The graph could not possibly be anything other than a straight line of slope -1. (For the students out there: the complete lack of scatter in the graph is a strong hint of something wrong.) What about the colors? They assign the data points for people with a > 50% probability of being T2D to be red, and the opposite to be green. The graph is simply plotting a tautology, that the probability of x is one minus the probability of not-x, together with a color scheme for labeling x. Paraphrasing Tufte, it has an information-to-ink ratio of approximately zero.

Not quite zero: what we seem to have here is a highly inefficient two-dimensional multicolor display of a one-dimensional set of 49 numbers, using dots that are so blurry that we can’t actually get much of a sense of their distribution. All joking aside, I’m guessing this graph would be much better if the x-axis were used for some relevant continuous variable (for example, people’s ages) and the colors used for some discrete variable (for example, some other indicator of health status).

Parthasarathy does add: “I’ll stress that the study itself is fascinating.” So, just to be clear, he’s criticizing the graph, not the underlying research.

5 thoughts on “And now, here’s something that would make Ed Tufte spin in his . . . ummm, Tufte’s still around, actually, so let’s just say I don’t think he’d like it!”

Rahul on December 13, 2013 12:08 PM at 12:08 pm said:

I’d be very wary of trusting the abilities of anyone associated with this paper (reviewers included) simply because how can you get such a straight line and not for a moment suspect something’s wrong? Beats me!
phonon on December 13, 2013 2:04 PM at 2:04 pm said:

To give the benefit of the doubt, I assume the point of the graph is the *distribution* of the dots along the line. That should be a histogram though…
Bob on December 14, 2013 6:22 AM at 6:22 am said:

Mutatis mutandis
Chris G on December 14, 2013 7:50 AM at 7:50 am said:

The information-to-ink ratio would’ve been higher if they’d captioned it “For those of you who don’t know what a plot of y=1-x looks like…” and drawn a line instead of those damn polka dots. Oh, but wait, maybe the dots are supposed to be uncertainty ellipses!
kevin denny on December 14, 2013 12:51 PM at 12:51 pm said:

Finally a conclusive proof that the way to increase the proportion of people who don’t have diabetes is to reduce the proportion who do.

Comments are closed.