Jouni pointed me to a forthcoming book on statistical graphics in R, written by Paul Murrell at the University of Auckland (New Zealand). R is the open-source version of S and by far the best all-around computer package for statistical research and practice.

Based on the webpage, the book looks like it’s going to be great. I was hoping to use it as one of the texts for my new course on statistical graphics, but now I’m thinking I’ll also include it as a recommended text in all my classes. I particularly like Figure 1.8 (the “graphical table”) which reminds me of my own work on turning tables into graphs.

I feel like there's been an uptick lately in new books that focus primarily on R. From my rather limited point of view, it also seems like non-academic statisticians are moving to R from SAS. Just a few trends I've been observing:

I feel like SAS has become so focused on the enterprise that they're not doing the cutting-edge statistical computing work anymore, whereas just about everything interesting can be found in some R package.

I think that statisticians have become more savvy as programmers with every generation of graduate students. I remember working with a senior statistician who had used SAS all his life, and when I introduced him to R, he said, "I don't understand how you can do anything without macros!" Statisticians who have ever done work in C, Java, or scripting languages can appreciate the flexibility in data structures that R offers.

SAS has really become like an enterprise database with advanced mathematical/statistical functionality. Most people simply don't need all of the enterprise functionality (or the pricing) it offers and are better off with the combination of R and flat files or an open source DB like MySQL. I expect we'll see more books and more use of R over the next several years.

If the book is anything like the graphics, it's a keeper!

SAS vs. R? As long as there are enterprises doing statistics, there will be SAS. Many of our local employers (USAA, UT Health Science, US Army) run SAS shops, and a UTexas student license for SAS is only $90/year, so we teach it to make our grads more employable. BUT! R is a favorite among our biostatisticians, it's becoming better documented, and it's FREE–so guess what we're happily adding in our labs and coursework?

AG: "…my new course on statistical graphics"

Tell us more. Much, much more. I'm trying to convince my department to do something like this, and my colleagues respond as though I had proposed learning Latin, or ballet.

Hey Mike- I graduated from UT with my master's in operations research in May '04- I lamented the lack of a real statistics PhD program while I was there. I'm starting up at UNC biostats in the fall, but I really miss Austin.

RE: SAS, UNC is a heavy SAS shop, and my wife works at a company that is also all SAS. But I'm seeing roughly the same trends you are- I know a group of cancer researchers that switched from SAS to R, and Duke has been an R/S-Plus oriented place for some time now.

Part of the reason that enterprises prefer SAS is the support: the really fabulous documentation and tech support. With R, you have the R mailing list and the risk of getting chewed out by Brian Ripley for asking about something that was covered in a footnote on page 591 of 840 of the R FAQ. Don't get me wrong, I learned a quarter of what I know from MASS, but I think Dr. Ripley has been answering stupid questions w/o getting paid for too long.

Re: SAS vs R. Not to mention the fact that, in some segments at least, SAS seems to have garnered some sort of "trusted" status–the Honored And Venerable SAS and all that.

I think that if R truly wanted to compete in that space, it would need to improve its database support and reduce its memory requirements (which have been improving, or at least changing, recently). It would also help to have some sort of Red Hat-like company to provide documentation and support and, if the licensing could be worked out, an "Enterprise-level" user interface, which likely means point-n-click access to certain modules. To address SAS's "trusted" status, you could go with something like the OpenBSD auditing projects: since the code is totally open, you can actually do things like peer-review audits. SAS also has pretty extensive reporting facilities, but I think that R's Sweave-like tools will be better in the long run. For those who don't know about it, Sweave is very handy: it lets you include the commands for generating figures in the LaTeX document itself, so that processing the document automagically generates the figures, tables, etc. and inserts the appropriate LaTeX code, which then gets passed on to the LaTeX toolchain to generate the DVI or PDF or whatever.

I've just been using the xtable package in R to generate LaTeX tables; they come out great.

I wonder if universities should teach their students multiple packages for employability purposes.

If you want a really powerful combination, try the Sweave() function in R with LaTeX. All output, code, and figures can be directly incorporated into your LaTeX file. It is amazing once you get the hang of it.
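
For anyone who hasn't seen one, here is a minimal sketch of what a Sweave source file might look like. The file name report.Rnw, the chunk names, and the simulated data are all made up for illustration; the built-in cars data set and the xtable package mentioned above are only there to show a table chunk, so treat this as a rough starting point rather than a recommended layout.

```latex
\documentclass{article}
\begin{document}

% R code chunk: both the code and its printed output appear in the document.
<<summary, echo=TRUE>>=
x <- rnorm(100)   # simulated data, purely illustrative
mean(x)
@

% \Sexpr{} drops the value of an R expression into running text.
The sample mean is \Sexpr{round(mean(x), 2)}.

% fig=TRUE tells Sweave to save the plot and insert \includegraphics.
<<histogram, fig=TRUE, echo=FALSE>>=
hist(x)
@

% results=tex passes the chunk's output through as raw LaTeX,
% which is what xtable produces.
<<regression-table, results=tex, echo=FALSE>>=
library(xtable)
print(xtable(lm(dist ~ speed, data = cars)))
@

\end{document}
```

Assuming the file is saved as report.Rnw, calling Sweave("report.Rnw") from R should write report.tex along with the figure files, and running pdflatex on report.tex then produces the finished PDF.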