Zero is zero

Nathan Roseberry writes:

I thought I had read on your blog that bar charts should always include zero on the scale, but a search of your blog (or google) didn’t return what I was looking for. Is it considered a best practice to always include zero on the axis for bar charts? Has this been written in a book?

The idea is that the area of the bar represents “how many” or “how much.” The bar has to go down to 0 for that to work. You don’t have to have your y-axis go to zero, but if you want the axis to go anywhere else, don’t use a bar graph, use a line graph. Usually line graphs are better anyway.
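To put a number on why the bar has to go down to 0, here is a minimal sketch of the arithmetic. The function name `drawn_ratio` and the two values are made up for illustration; the only assumption is that a bar is drawn from the start of the axis up to its value.

```python
def drawn_ratio(a: float, b: float, axis_start: float = 0.0) -> float:
    """Ratio of the drawn bar lengths for values a and b
    when both bars start at axis_start instead of at the data's zero."""
    if b <= axis_start:
        raise ValueError("b must lie above the axis start")
    return (a - axis_start) / (b - axis_start)


# Hypothetical values, chosen only for illustration.
a, b = 100.0, 95.0

# Axis starts at 0: the bars compare exactly as the values do (about 5% apart).
print(drawn_ratio(a, b))

# Axis starts at 90: the first bar is drawn twice as long as the second.
print(drawn_ratio(a, b, axis_start=90.0))
```

With the axis cut at 90, a 5% difference is drawn as a 2-to-1 difference in bar length, which is the distortion the rule is meant to prevent. A line graph does not invite that length comparison, so its axis can start elsewhere.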

I’m sure this is all in a book somewhere.

1. I like the clarity of this post's title. But can a tautology have value?

2. Nick Cox says:

I agree with Andrew's main point.

But it can be reasonable to have bars starting at some base other than 0. The point is simply that bars should start at a natural reference level, which is not always 0.

I sat through a talk in which sex ratios for various states of India were shown starting at 0. This rather weakened the speaker's main point that sex ratios are typically quite different from unity. Using 1 (or 100%, etc.) as base would have improved that graph. (Using a dot chart would have worked as well or better too.)

Similarly, bar graphs are sometimes useful for time series in which it is important to distinguish periods above or below average, or periods above and below freezing in the US, where many people still use 32 deg F as the freezing point. In the last case 0 deg F as base would be silly, although I could quote published examples.
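As a rough sketch of what "starting at a natural reference level" means for drawing, the snippet below computes where each bar would sit when it is measured from a reference value rather than from 0. The helper `bar_from_reference` and the temperatures are hypothetical, made up for illustration; the (bottom, height) pair is simply one common way plotting libraries describe a bar.

```python
from typing import Tuple


def bar_from_reference(value: float, reference: float) -> Tuple[float, float]:
    """Return (bottom, height) for a bar drawn between reference and value.

    Bars for values above the reference rise from it; bars for values
    below the reference hang down from it.
    """
    bottom = min(value, reference)
    height = abs(value - reference)
    return bottom, height


FREEZING_F = 32.0  # reference level for Fahrenheit temperatures

# Hypothetical temperatures in deg F, not real data.
temps_f = [20.0, 32.0, 41.0]

for t in temps_f:
    bottom, height = bar_from_reference(t, FREEZING_F)
    side = "above" if t > FREEZING_F else "below" if t < FREEZING_F else "at"
    print(f"{t:5.1f} F: bar from {bottom:.1f} spanning {height:.1f} ({side} freezing)")
```

The same helper would apply to sex ratios with a reference of 1: the bar length then shows the departure from unity, which was the point of the talk.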

In terms of books, Darrell Huff's "How to lie with statistics" is dogmatic about starting at zero. William Cleveland's and Leland Wilkinson's books give more nuanced advice.

3. Tian says:

"Stats: Data and Models" has a general comment related to this, if I am not mistaken.

4. Jorge Camoes says:

For example:

Stephen Few's Now You See It, page 60
Naomi Robbins's Creating More Effective Graphs

5. It is in many books. See, for example, pages 239–240 of Creating More Effective Graphs by Naomi B. Robbins (Wiley, 2005).

6. Dave says:

"You don't have to have your y-axis go to zero, but if you want the axis to go anywhere else, don't use a bar graph, use a line graph."

…as long as your x-axis is a continuous variable? Or is there no caveat on that?

7. Kyle says:

It's in Tufte ("The visual display…"). A more nuanced view is in Cleveland (probably "Elements of graphing data", but maybe "Data visualization").

8. Andrew Gelman says:

Nick (and others):

Yes. The zero-is-zero rule is a guideline. Other starting points can work too, but I think it's a good idea to say that you have to justify such alternative choices.

9. Phil says:

Nick, you say "I sat through a talk in which sex ratios for various states of India were shown starting at 0." If this was a bar plot, then it violates the principle that bar plots should be used (only) for showing "how much" or "how many". They should not be used for ratios.

10. Nick Cox says:

Phil: Says who? This is to me a distinction without a difference, and a dogma without a rationale (pun intended).

How many females per male is, to me, a matter of "how much" or "how many" too, so the example is consistent with your own principle.

Many quantities can be construed as ratios anyway, even if it is only in the sense of (amount of stuff)/(unit used to measure stuff). In an old and still much used classification of measurement scales (S.S. Stevens), "ratio scale" is the highest scale of measurement, one in which zeros are not arbitrary and ratios make sense. Sex ratios make sense.

11. Fernando DePaolis says:

…line charts are better, only when it's appropriate…distributions (i.e. counts) of categorical data should not be represented with line charts…