How to make a certain interactive graph in R (or other convenient software)?

Dana Kelly writes, “Here’s a link to a NYT article on trends in commercial aviation accident rates. I particularly liked the interactive graphic in the article. Do you know how such graphics can be constructed in R?” The short answer is no, I don’t know how to do it. But maybe someone who is reading this knows?

11 thoughts on “How to make a certain interactive graph in R (or other convenient software)?”

  1. Did Ms. Kelly want the look of the graphic, or the interactivity? The look of the graphic – i.e. the boxes scaled according to such-and-such, etc. – should be doable in R, though I would have to put some work into it. (A variation on a barplot() with matrix input, perhaps?) I doubt there's anything off the shelf that would replicate that exactly.

    But if it's the interactivity (i.e. the rollover text) she's after, then my first thought would be that she's out of luck – a look at the page source indicates that the graphic was put together in Flash, and I can't think of a way to do anything similar in R.

  2. It's not easy, as R isn't really set up for interactive graphics. You can use rggobi + GGobi or iplots to get an interactive histogram, but neither supports directly querying the individual cases that make up a bar. If you want to learn more about creating interactive graphics for the web, Processing or prefuse might be a good place to start. Also see Many Eyes for a ready-made alternative (although without the colouring).

    If you just want the static graphic, you can create it with ggplot2 along the following lines:

    <pre>library(ggplot2)
    qplot(wt, data=mtcars, geom="histogram", fill=factor(cyl), group=cyl)</pre>

    Hadley

  3. That particular graphic appears to be some fancy JavaScript thing that the NYT licensed. If you were going to try to do it in R, your best bet would probably be something like one of the SVG graphics devices. For example, gridSVG may allow you to build a grid plot and attach the mouseover events.

  4. Actually, the graphic was made in Flash as Winawer said, and I'm pretty sure The Times made that themselves (i.e. Amanda C, credited on bottom).

    I think the way I'd do it is first in R for the bins and then finish it off in Flash for interactivity.

  5. The graphic is not the best that could be done. It emphasises the rate of accidents with fatalities per year not the rate of fatalities. I haven't done the maths yet, but I think 1994 and 1996 were the worst years, though it is hard to see that. The ordering by date through the year doesn't really contribute much and could maybe be used more effectively.

    What crashes do the data cover? I presume just American airlines, wherever they fly. If so, that should be stated somewhere. Interestingly, accidents not associated with flights are included. Is that appropriate?

    As for the interaction, it could be done in Mondrian, but it has not yet been implemented in iplots.

  6. Antony, I don't think that the interaction could be done in Mondrian, as bars in the histogram aggregate all applicable cases; i.e. when you query a bar in a histogram, you get aggregate info about all cases, not individual cases.

    I agree that weighting by number of fatalities would substantially improve the graphic.

  7. Hadley,

    The data wouldn't be drawn as a histogram in Mondrian; it would have to be some variation of a weighted mosaic plot. Don't forget the extra issue that the boxes are not the same size, as they are affected by the number of flights per year. Instead of talking about how to draw the NY Times plot, we should be suggesting something different and better.

    Ken,

    I agree with the importance of linking, but why use all that R code for a scatterplot, when the iplots package is available?

  8. Antony, regardless of what plot you use, I don't see how you can get individual-level querying in an aggregated plot. Perhaps you are seeing something that I'm missing.
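
For anyone who wants to try the ideas in comments 1 and 3, here is a rough sketch rather than a recipe. It draws one box per accident, stacked by year, using grid, and then (only if the gridSVG package happens to be installed) attaches a mouseover handler to each box and exports an SVG. The data frame, its column names, the box names and the output file name are all invented for illustration; none of them come from the NYT graphic. The boxes here are all the same size, so this does not reproduce the scaling by flights per year mentioned in comment 7.

```r
library(grid)

# Hypothetical data: one row per accident (values invented for illustration).
accidents <- data.frame(
  year  = c(1994, 1994, 1995, 1996, 1996, 1996),
  label = c("Accident A", "Accident B", "Accident C",
            "Accident D", "Accident E", "Accident F"),
  stringsAsFactors = FALSE
)

# Give each accident a column (its year) and a slot in that year's stack.
years          <- sort(unique(accidents$year))
accidents$col  <- match(accidents$year, years)
accidents$slot <- ave(seq_along(accidents$year), accidents$year, FUN = seq_along)
accidents$name <- paste0("box", seq_len(nrow(accidents)))

# Draw one named rectangle per accident so it can be addressed individually.
draw_boxes <- function(d) {
  grid.newpage()
  pushViewport(viewport(width = 0.9, height = 0.9,
                        xscale = c(0, max(d$col)),
                        yscale = c(0, max(d$slot))))
  for (i in seq_len(nrow(d))) {
    grid.rect(x = unit(d$col[i] - 0.5, "native"),
              y = unit(d$slot[i] - 0.5, "native"),
              width = unit(0.9, "native"), height = unit(0.9, "native"),
              gp = gpar(fill = "steelblue", col = "white"),
              name = d$name[i])
  }
  popViewport()
}

draw_boxes(accidents)  # the static version

# Optional interactive version: assumes gridSVG is installed.
if (requireNamespace("gridSVG", quietly = TRUE)) {
  for (i in seq_len(nrow(accidents))) {
    # Attach a JavaScript mouseover handler to the SVG element for this box.
    gridSVG::grid.garnish(
      accidents$name[i],
      onmouseover = sprintf("alert('%s (%s)')",
                            accidents$label[i], accidents$year[i])
    )
  }
  svg_file <- file.path(tempdir(), "accidents.svg")  # hypothetical file name
  gridSVG::grid.export(svg_file)
}
```

Opening the exported SVG in a browser should show a message when the pointer moves over a box. Because every accident is its own named rectangle, each one can be queried individually, which is the point debated in comments 6 to 8 about aggregated histogram bars.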
