My talk last night at the visualization meetup

It went pretty well, especially considering it was an entirely new talk (even though, paradoxically, all the images were old), and even though I had a tough act to follow: I came on immediately after an excellent short presentation by Jed Dougherty on some cool information and visualization software that he and his colleagues are building for social workers.

The only problems with my talk were:

(a) I planned to elicit more audience involvement but didn’t do it. It would’ve been easy: at any point I could’ve just paused and had the audience members work in pairs to come up with suggested improvements to any of my graphs. But I forgot to do it.

(b) I went on too long. The talk was going so well, I didn’t stop. In retrospect, it would’ve been better to stop earlier. Better for people to leave the table hungry than stuffed.

Also, next time I’ll drop the bit about the nuns-in-prison movies. People weren’t getting the connection to the point I was making about presetting the signs of variables before presenting an analysis.

And one new thing I tried: I know the slides will be read online, so I kept captions for many of the figures so that you online readers can get more out of them. I also added a couple of slides (pages 10 and 60) just for you, to clarify some of the points you otherwise wouldn’t get just by seeing the images.

Again, here are the slides.


  1. Jed says:

    Keep the Nun quip. I thought it slayed.

  2. Anonymous says:

    Ok. It works. Hi. I would like to know what you think, or what your community thinks, of Hans Rosling’s animated Gapminder?

    Can you reply directly to my mail cause I have no link to this site.


  3. Rahul says:

    Great advice. I agree and try to adopt these ideas about good graphs. But are there any studies showing the efficacy of the prescriptions? Even when targeting a non-casual audience, do people absorb / retain Gelmanian (or Tufteian) “good” graphs better than “bad” ones? Is the chance of misreading reduced? How big is the difference?

    I ask because there are so many contrasting points of view as to what makes a good graphic, some of them directly opposing. In all this subjective discussion some measure of empirical study would be highly welcome (IMHO). The only data point I know is that study which actually attributed higher recall to chart-junk, but that isn’t exactly representative.

    • Andrew says:


      I agree this is a concern and I wish I’d raised it during my talk. My quick answer is that the first audience for any of my graphs is . . . me! I am interested in doing research on graphical perception, and one thing I think will be necessary in such studies is to closely tie the measurements of perception to my ideas about presentation.

      For example, I prefer the graphs on page 14 of my presentation to the one on page 13 because they allow more comparisons to be made. But, in addition, I’ve cleaned up the presentation in various ways. For example, I reduced the number of labels on the y-axis. This would make it more difficult to answer a standard quiz question such as, “What is the level for ‘male, $25,000+, under 65’?” But I think it makes the important comparisons clearer. I imagine, though, that different people read these graphs in different ways. Perhaps for some readers the graphs on page 14 are too abstract; maybe for these people it is necessary to see all the numbers as an intermediate step. Similarly, I removed some clutter by labeling lines as “High income” and “Low income” rather than giving exact (actually, made-up, as this was a fake example) income categories. But perhaps, for some readers, my clean-up merely adds confusion because they would be hung up on wondering what exactly the income categories mean. This sort of thing is one reason I like multiple displays, and in fact I’d also be happy to have my graphical display link to a table with all the numbers and categories being displayed.

      My main point in this comment, though, is that we can’t really get to all these issues if we just run an experiment with different graphs and a quiz at the end, without thinking carefully about the goals of the graph-readers and the content of the quiz.

      • Rahul says:

        Thanks Andrew. I agree that the experiments may not be easy but I think they are sorely needed.

        We have too many smart and influential people right now devoting time to how to make good visuals yet too few (maybe none) doing any empirical analysis.

        One sector to look up to might be Google / Yahoo et al., who do a lot of statistical, empirical analysis of how minor tweaks to the Web UI and other design elements change their retention and click rates. Not identical, but I think we can learn a lot from those methods.

        Another field we ought to be borrowing from is cognitive science: does the eye register horizontally faster or vertically? Which color palettes have the most recall? What is the maximum number of elements (e.g. data series) that the average subject can mentally process?

        My fear is that if we don’t adopt a more data-driven approach soon, efficient visualization will turn into ideology (already, the Tufte cult etc.) whereas it ought to be a science.

  4. Rahul says:

    A plotting question: Is there a scatter plot routine in R or otherwise that automatically does label de-clashing?

    To illustrate what I mean, see the state labels clustered and on top of one another on Andrew’s slide labelled “1998: Estimating the probability of events that have never occurred: When is your vote decisive?”

  5. […] that I tend to be big on visual presentation. This debate is not new. In fact, Andrew Gelman has a great couple of posts and slides on the difference between scientific data presentation (i.e. graphs) and info […]