## Here’s how rumors get started: Lineplots, dotplots, and nonfunctional modernist architecture

1. I remarked that Sharad had a good research article with some ugly graphs.

2. Dan posted Sharad’s graph and some unpleasant alternatives, inadvertently associating me with one of the unpleasant alternatives. Dan was comparing barplots with dotplots.

3. I commented on Dan’s site that, in this case, I’d much prefer a well-designed lineplot. I wrote:

There’s a principle in decision analysis that the most important step is not the evaluation of the decision tree but the decision of what options to include in the tree in the first place.

I think that’s what’s happening here. You’re seriously limiting yourself by considering the above options, which really are all the same graph with just slight differences in format. What you need to do is break outside the box.

(Graph 2, which I think you think is the kind of thing that Gelman would like, indeed is the kind of thing that I think the R gurus like, but I don’t like it at all. It looks clean without actually being clean. Sort of like those modern architecture buildings from the 1930s-1960s that look all sleek and functional but really aren’t so functional at all.)

The big problem with your graphs above is that they place two logical dimensions (the model and the scenario) on the same physical dimension (the y-axis). I find this sort of ABCABCABCABC pattern hard to follow. Instead, you want to be able to compare AAAA, BBBB, CCCC, while still being able to make the four separate ABC comparisons.
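To make the ABCABCABCABC point concrete, here is a minimal sketch in Python that uses only the scenario and model names mentioned in this post, with no values. The variable names (`interleaved`, `by_model`) are made up for illustration. It shows the same twelve cells arranged two ways: stacked along one axis, and split so that each model gets its own series across the scenarios.

```python
# Names taken from the post; everything else here is an illustrative assumption.
scenarios = ["Music", "Games", "Movies", "Flu"]
models = ["Search", "Baseline", "Combined"]

# Barplot/dotplot layout: both logical dimensions share one physical axis,
# so the models repeat in an ABCABCABCABC pattern down the page.
interleaved = [(scenario, model) for scenario in scenarios for model in models]

# Lineplot layout: scenario goes on the x-axis and each model becomes its own
# series, so AAAA, BBBB, and CCCC can each be read left to right.
by_model = {model: [(scenario, model) for scenario in scenarios] for model in models}

print([model[0] for _, model in interleaved])  # first letters in axis order
for model, cells in by_model.items():
    print(model, [scenario for scenario, _ in cells])
```

In the second layout the four separate ABC comparisons are still there: at any one scenario you read down the three series.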

How to do this? I suggest a lineplot.

Here’s how my first try would go:

On the x-axis, put Music, Games, Movies, and Flu, in that order. (Ordering is important in allowing you to see patterns that otherwise might be obscured; see the cover of my book with Jennifer for an example.)

On the y-axis, put the scale. I’ll assume you know what you’re doing here, so keep with the .4 to 1 scale. But you only need labels at .4, .6, .8, 1.0. The intermediate labels are overkill and just make the graph hard to follow.

Now draw three lines, one for Search, one for Baseline, and one for Combined. Color the lines differently and label each one directly on the plot (not using a legend).

The resulting graph will be compact, and the next step is for you to replicate your study under different conditions, with a new graph for each. You can put these side by side and make some good comparisons.
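The recipe above can be sketched in a few lines of Python with matplotlib. This is only one way to do it, not the code behind any graph discussed here. The numbers are placeholders invented so the script runs; they are not Sharad's results. The function name `draw_lineplot` and the figure size are likewise assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the script runs without a display
import matplotlib.pyplot as plt

scenarios = ["Music", "Games", "Movies", "Flu"]  # x-axis order from the post

# PLACEHOLDER values, invented purely to make the sketch runnable.
# Replace them with the real estimates before drawing any conclusions.
placeholder_values = {
    "Search":   [0.50, 0.60, 0.70, 0.80],
    "Baseline": [0.55, 0.65, 0.75, 0.85],
    "Combined": [0.60, 0.70, 0.80, 0.90],
}

def draw_lineplot(values, categories):
    """Draw one line per model, labeled directly on the plot (no legend)."""
    fig, ax = plt.subplots(figsize=(4, 3))  # small, so panels can sit side by side
    x = list(range(len(categories)))
    for name, ys in values.items():
        (line,) = ax.plot(x, ys, marker="o")
        # Put the label next to the last point of its line, in the line's color.
        ax.annotate(name, xy=(x[-1], ys[-1]), xytext=(5, 0),
                    textcoords="offset points", va="center",
                    color=line.get_color())
    ax.set_xticks(x)
    ax.set_xticklabels(categories)
    ax.set_ylim(0.4, 1.0)
    ax.set_yticks([0.4, 0.6, 0.8, 1.0])  # only the four labels, no intermediates
    return fig, ax

fig, ax = draw_lineplot(placeholder_values, scenarios)
```

Calling `fig.savefig("lineplot_sketch.png", bbox_inches="tight")` writes the figure to disk. For a replication under different conditions, call `draw_lineplot` again with the new values and place the panels side by side.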

5. Kaiser agrees with me and presents an excellent visualization showing why the lineplot is better. (Kaiser’s picture is so great that I’ll save it for its own entry here, for those of you who don’t click through on all the links.)

6. David Smith posts that I prefer the dotplot. Nooooooooooooooooooooooo!!!!!!!!!!!

1. Manolo says:

There is an issue with the line plot. I believe APA format wouldn't "like" (allow?) a nominal variable on the x-axis of a line plot. The line gives the impression of continuity, so unless the conditions were given in that order (rendering the variable ordinal), it might not be appropriate (according to APA) to use a line to represent the conditions.

2. It's even more obvious how all of these graphics are related if you look at them in the context of a grammar of graphics.

3. Bob Carpenter says:

I have very mixed feelings about the line plots.

On the up side, they make the overall performance for an approach clearer (e.g., showing that combined is best).

On the down side, line plots make the horizontal axis look scalar, when it's really categorical (e.g., the lines have a defined crossing point).

4. David Smith says:

Oh dear, and now I'm the guy that misrepresented Prof Gelman :). Apologies — the post in question has been updated accordingly.

5. Corey says:

Here's video accompaniment for your recent spate of "Nooooooooooooooooooooooo!!!!!!!!!!!". (You could even link a different vid each time you have a particularly negative reaction to something.)