What’s wrong with a kernel density? Too opaque a connection with the data? I [Anne] have had some unpleasant surprises using histograms lately, so I’ve been trying to get a feel for the alternatives.
My reply: Here are my problems with kernel densities (in this example, and more generally):
1. Annoying artifacts, such as all-positive quantities whose kernel density estimates go into the negative zone. This can be fixed, but (a) it typically isn’t, and (b) when there isn’t an obvious bound, you still have the issue of the kernel density putting mass in places that it shouldn’t.
2. It’s hard to see where the data are. As I wrote in my blog linked above, I think it’s better to just see the data directly, especially for something like vote proportions, which I can understand pretty well as raw numbers. For example, when I see the little peak at 3% in the density in Figure 2 of Chen and Rodden, or the falloff after 80%, I’d rather just see what’s happening there than try to guess by taking the density estimate and mentally un-convolving the kernel.
3. The other thing I like about a histogram is that it contains the seeds of its own destruction: that is, an internal estimate of uncertainty, based on variation in the heights of the histogram bars. See here for more discussion of this point, in particular the idea that the goal of a histogram or density estimate is not to most accurately estimate the true superpopulation density (whatever that means in this example) but rather to get an understanding of what the data are telling you.
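To make points 1 and 3 concrete, here is a minimal sketch in Python. Everything in it is my own illustration, not anything from Chen and Rodden: the data are simulated from an exponential distribution (so they are all positive), the kernel is Gaussian, the bandwidth comes from the usual normal-reference rule of thumb, and the helper names `kde_mass_below` and `histogram_counts` are hypothetical. The first part computes how much of the kernel density estimate ends up below zero. The second prints a text histogram with sqrt(count) next to each bar, which is one rough, Poisson-style way to judge how much of the bar-to-bar variation is noise.

```python
import math
import random
import statistics


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kde_mass_below(data, bandwidth, bound=0.0):
    """Share of a Gaussian kernel density estimate that lies below `bound`.

    Each point contributes a normal bump centered at itself, so the mass
    below the bound is the average of the bumps' lower tails.
    """
    return sum(normal_cdf((bound - x) / bandwidth) for x in data) / len(data)


def histogram_counts(data, lo, hi, n_bins):
    """Counts in n_bins equal-width bins on [lo, hi]; hi goes in the last bin."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in data:
        if lo <= x <= hi:
            counts[min(int((x - lo) / width), n_bins - 1)] += 1
    return counts


random.seed(1)
data = [random.expovariate(1.0) for _ in range(200)]  # simulated, never negative

# Normal-reference rule of thumb for the bandwidth (an assumption of this sketch).
bandwidth = 1.06 * statistics.stdev(data) * len(data) ** (-1 / 5)
leaked = kde_mass_below(data, bandwidth, bound=0.0)
print(f"bandwidth = {bandwidth:.3f}, estimated mass below zero = {leaked:.1%}")

counts = histogram_counts(data, 0.0, max(data), n_bins=15)
for k, c in enumerate(counts):
    # sqrt(count) is a rough Poisson-style standard error for each bar.
    print(f"bin {k:2d}  {'#' * c}  ({c} +/- {math.sqrt(c):.1f})")
```

None of the simulated values is negative, so any mass the estimate puts below zero is purely an artifact of the kernel. In the histogram, neighboring bars that differ by about their sqrt(count) or less are probably not telling you anything real.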