Archive of posts filed under the Statistical graphics category.

Somebody’s reading our research.

See footnote 10 on page 5 of this GAO report. (The above graphs are just for age 45-54, which demonstrates an important thing about statistical graphics: They should be as self-contained as possible. Otherwise when the graph is separated from its caption, it requires additional words of explanation, as you are seeing here.)

Numbers too good to be true? Or: Thanks, Obama!?

This post is by Phil. The “Affordable Care Act” a.k.a. “Obamacare” was passed in 2010, with its various pieces coming into play over the following few years. One of those pieces is penalties for hospitals that see high readmission rates. The theory here, or at least one of the theories here, was that hospitals could […]

Thinking about this beautiful text sentiment visualizer yields a surprising insight about statistical graphics

Lucas Estevem set up this website in d3 as his final project in our statistical communication and graphics class this spring. Copy any text into the window, push the button, and you get this clean and attractive display showing the estimated positivity or negativity of each sentence. The length of each bar is some continuously-scaled […]

Graphical Data Analysis with R

Graphical Data Analysis with R: that’s the title of Antony Unwin’s new book. Here are the chapter titles: Ch01 Setting the Scene; Ch03 Examining continuous variables; Ch04 Displaying Categorial Data; Ch05 Looking for Structure; Ch06 Investigating Multivariate Continuous Data; Ch07 Studying Multivariate Categorical Data; Ch08 Getting an Overview; Ch09 Graphics and Data Quality; Ch10 Comparisons […]

Job opening . . . for a data graphics editor!

Larry Wheeler writes: I’m the managing editor at Health Affairs, a monthly peer-reviewed journal about health policy. We publish a lot of statistical graphics submitted with manuscripts from academic, industry, and government researchers. We have a job opening for a new position we’re calling “data graphics editor.” I’ve been having trouble attracting the right kind […]

I wish Napoleon Bonaparte had never been born

Not just for all the usual good reasons why the world needs fewer mass murderers, but also for the very specific reason that, had there been no Napoleon, there’d be no Napoleon-in-Russia graph, then no shining example for Ed Tufte to illustrate how people should make their graphs, then maybe graph-makers wouldn’t all feel that […]

Where the fat people at?

Pearly Dhingra points me to this article, “The Geographic Distribution of Obesity in the US and the Potential Regional Differences in Misreporting of Obesity,” by Anh Le, Suzanne Judd, David Allison, Reena Oza-Frank, Olivia Affuso, Monika Safford, Virginia Howard, and George Howard, who write: Data from BRFSS [the Behavioral Risk Factor Surveillance System] suggest that […]

This graph is so ugly—and you’ll never guess where it appeared

Raghu Parthasarathy writes: I know you’re sick of seeing / being pointed to awful figures, but this one is an abomination of a sort I’ve never seen before: It’s a pie chart *and* a word cloud. In an actual research paper! Messy, illegible, and generally pointless. It’s Figure 1 of this paper (in Cell — […]

One quick tip for building trust in missing-data imputations?

Peter Liberman writes: I’m working on a paper that, in the absence of a single survey that measured the required combination of variables, analyzes data collected by separate, uncoordinated Knowledge Networks surveys in 2003. My co-author (a social psychologist who commissioned one of the surveys) and I obtained from KN unique id numbers for all […]

If you’re using Stata and you want to do Bayes, you should be using StataStan

Robert Grant, Daniel Furr, Bob Carpenter, and I write: Stata users have access to two easy-to-use implementations of Bayesian inference: Stata’s native bayesmh function and StataStan, which calls the general Bayesian engine Stan. We compare these on two models that are important for education research: the Rasch model and the hierarchical Rasch model. Stan (as […]

“Earlier you had waxed nostalgic for the days when people sent you bad graphs . . .”

Nadia Hassan writes: Earlier you had waxed nostalgic for the days when people sent you bad graphs. This [from Javier Zarracina] is not a stand-out on that front, but it is far from ideal: A lot of buzz in recent years about data journalism or quantitative journalism. There is a lot of issues to be […]

Citation shocker: “The lifecycle of scholarly articles across fields of economic research”

David Backus writes: Check esp fig 2 here. He was pointing me to a post by Sebastian Galiani, Ramiro Galvez, and Maria Victoria Anauati called The lifecycle of scholarly articles across fields of economic research. And here’s fig 2: And, as usual, I duck all the interesting questions and move toward triviality: This should be […]

First, second, and third order bias corrections (also, my ugly R code for the mortality-rate graphs!)

As an applied statistician, I don’t do a lot of heavy math. I did prove a true theorem once (with the help of some collaborators), but that was nearly twenty years ago. Most of the time I walk along pretty familiar paths, just hoping that other people will do the mathematical work necessary for me […]

Just Filling in the Bubbles

Collin Hitt writes: I study wrong answers, per your blog post today. My research focuses mostly on surveys of schoolchildren. I study the kids who appear to be just filling in the bubbles, who by accident actually reveal something of use for education researchers. Here’s his most recent paper, “Just Filling in the Bubbles: Using […]

The Rachel Tanur Memorial Prize for Visual Sociology

Judy Tanur writes: The Rachel Tanur Memorial Prize for Visual Sociology recognizes students in the social sciences who incorporate visual analysis in their work. The contest is open worldwide to undergraduate and graduate students (majoring in any social science). It is named for Rachel Dorothy Tanur (1958–2002), an urban planner and lawyer who cared deeply […]

Hi-tech hoops: Characterizing the spatial structure of defensive skill in professional basketball

Joshua Vogelstein points me to this article by Alexander Franks, Andrew Miller, Luke Bornn, and Kirk Goldsberry and writes: For some reason, I feel like you’d care about this article, and the resulting discussion on your blog would be fun. Hey—label your lines directly! Cool! Ummm . . . no. No. Really, really, really, really […]

3 postdoc opportunities you can’t miss—here in our group at Columbia! Apply NOW, don’t miss out!

Hey, just once, the Buzzfeed-style hype is appropriate. We have 3 amazing postdoc opportunities here, and you need to apply NOW. Here’s the deal: we’re working on some amazing projects. You know about Stan and associated exciting projects in computational statistics. There’s the virtual database query, which is the way I like to describe our […]

Syllabus for my course on Communicating Data and Statistics

Actually the course is called Statistical Communication and Graphics, but I was griping about how few students were taking the class, and someone suggested the title Communicating Data and Statistics as being a bit more appealing. So I’ll go with that for now. I love love love this class and everything that’s come from it […]

Jason Chaffetz is the Garo Yepremian of the U.S. House of Representatives, and I don’t mean that in a good way.

Mike Spagat and Paul Alper point us to this truly immoral bit of graphical manipulation, courtesy of U.S. Representative Jason Chaffetz. Here’s the evil graph: Here’s the correction: From the news article by Zachary Roth: As part of a contentious back-and-forth in which Chaffetz repeatedly cut off [Planned Parenthood president Cecile] Richards, the congressman displayed […]

An unconvincing analysis claiming to debunk the health benefits of moderate drinking

Daniel Lakeland writes: This study on alcohol consumption (by Craig Knott, Ngaire Coombs, Emmanuel Stamatakis, and Jane Biddulph) was written up in the BMJ editorials as “Alcohol’s Evaporating health benefits.” They conveniently show their data in a table, so that they can avoid graphing a “J” shape that they constantly allude to being wrong… But […]