Archive of posts filed under the Statistical graphics category.

2 on chess

Is it really “often easier to win a rematch than to defend a championship”? The quoted bit above comes from Tyler Cowen, writing about the Anand/Carlsen world championship rematch. I’m still not used to the idea of a new world championship match every year but I guess why not? Anyway, here’s my question. Tyler Cowen […]

Try a spaghetti plot

Joe Simmons writes: I asked MTurk NFL fans to consider an NFL game in which the favorite was expected to beat the underdog by 7 points in a full-length game. I elicited their beliefs about sample size in a few different ways (materials .pdf; data .xls). Some were asked to give the probability that the better […]

Statistical Communication and Graphics Manifesto

Statistical communication includes graphing data and fitted models, programming, writing for specialized and general audiences, lecturing, working with students, and combining words and pictures in different ways. The common theme of all these interactions is that we need to consider our statistical tools in the context of our goals. Communication is not just about conveying […]

My course on Statistical Communication and Graphics

We will study and practice many different aspects of statistical communication, including graphing data and fitted models, programming in Rrrrrrrr, writing for specialized and general audiences, lecturing, working with students and colleagues, and combining words and pictures in different ways. You learn by writing an entry in your statistics diary every day. You learn by […]

What does CNN have in common with Carmen Reinhart, Kenneth Rogoff, and Richard Tol: They all made foolish, embarrassing errors that would never have happened had they been using R Markdown

Rachel Cunliffe shares this delight: Had the CNN team used an integrated statistical analysis and display system such as R Markdown, nobody would’ve needed to type in the numbers by hand, and the above embarrassment never would’ve occurred. And CNN should be embarrassed about this: it’s much worse than a simple typo, as it indicates […]

What do you do to visualize uncertainty?

Howard Wainer writes: What do you do to visualize uncertainty? Do you only use static methods (e.g. error bounds)? Or do you also make use of dynamic means (e.g. have the display vary over time proportional to the error, so you don’t know exactly where the top of the bar is, since it moves while […]

They know my email but they don’t know me

This came (unsolicited) in the inbox today (actually, two months ago; we’re on a delay, as you’re probably aware), subject line “From PWC – animations of CEO opinions for 2014”: Good afternoon, I wanted to see if the data my colleague David sent to you was of any interest. I have attached here additional animated […]

mysterious shiny things

(Disclaimer: I’m new to Shiny, and blog posts, but I know something about geography.) In the Shiny gallery, take a look at 2001 versus 2002. Something funny happens to Switzerland (and other European countries): in terms of the legend, it moves from Europe to the Middle East. Also, the legend color scheme switches. […]

One of the worst infographics ever, but people don’t care?

This post is by Phil Price. Perhaps prompted by the ALS Ice Bucket Challenge, this infographic has been making the rounds: I think this is one of the worst I have ever seen. I don’t know where it came from, so I can’t give credit/blame where it’s due. Let’s put aside the numbers themselves – […]

My courses this fall at Columbia

Stat 6103, Bayesian Data Analysis, TuTh 1-2:30 in room 428 Pupin Hall: We’ll be going through the book, section by section. Follow the link to see slides and lecture notes from when I taught this course a couple years ago. This course has a serious workload: each week we have three homework problems, one theoretical, […]

NFL players keep getting bigger and bigger

Aleks points us to this beautiful dynamic graph by Noah Veltman showing the heights and weights of NFL players over time. The color is pretty but I think I’d prefer something simpler, just one dot per player (with some jittering to handle the discrete reporting of heights and weights). In any case, it’s a great […]

Stan World Cup update

The other day I fit a simple model to estimate team abilities from World Cup outcomes. I fit the model to the signed square roots of the score differentials, using the square root on the theory that when the game is less close, it becomes more variable. 0. Background As you might recall, the estimated […]

Stan goes to the World Cup

I thought it would be fun to fit a simple model in Stan to estimate the abilities of the teams in the World Cup, then I could post everything here on the blog, the whole story of the analysis from beginning to end, showing the results of spending a couple hours on a data analysis. […]

Visualizing sampling error and dynamic graphics

Robert Grant writes: What do you think of this visualisation from the NYT [in an article by Neil Irwin and Kevin Quealy but I’m not sure if they’re the designers of the visualization]? I’m pretty impressed as a method of showing sampling error to a general audience! I agree. P.S. In related news, Antony Unwin […]

Avoiding false parallelism in a graph

“False parallelism”—feel free to come up with a better term here—is when a graph has repeating elements that do not correspond to repeating structure in the underlying topic being graphed. An example appears in the above graphs from Dan Kahan. The content of the graphs is fine (and, more generally, I think he’s making an […]

What’s the algorithm, Kenneth?

I can’t figure out what’s the deal with the bars for Corners. The bar labeled “7” is much less than 7 times the bar labeled “1.” At first I was guessing that maybe they’re not counting the numbered part in the bar width (which would be a pretty weird choice) but that wouldn’t work for […]

Average predictive comparisons in R: David Chudzicki writes a package!

Here it is: An R Package for Understanding Arbitrary Complex Models As complex models become widely used, it’s more important than ever to have ways of understanding them. Even when a model is built primarily for prediction (rather than primarily as an aid to understanding), we still need to know what it’s telling us. For […]

Can we make better graphs of global temperature history?

Chris Gittins sends along this post by Gavin Schmidt, who writes: Some editors at Wikipedia have made an attempt to produce a complete record for the Phanerozoic: But these collations are imperfect in many ways. On the last figure the time axis is a rather confusing mix of linear segments and logarithmic scaling, there is […]

Small multiples of lineplots > maps (ok, not always, but yes in this case)

Kaiser Fung shares this graph from Ritchie King: Kaiser writes: What they did right: – Did not put the data on a map – Ordered the countries by the most recent data point rather than alphabetically – Scale labels are found only on outer edge of the chart area, rather than one set per panel […]

Understanding Simpson’s paradox using a graph

Joshua Vogelstein pointed me to this post by Michael Nielsen on how to teach Simpson’s paradox. I don’t know if Nielsen (and others) are aware that people have developed some snappy graphical methods for displaying Simpson’s paradox (and, more generally, aggregation issues). We do some of this in our Red State Blue State book, but before […]