Big news out of Europe, everyone’s talking about soccer. Leo Egidi updated his model and now has predictions for the Round of 16: Here’s Leo’s report, and here’s his zipfile with data and Stan code. The report contains some ugly histograms showing the predictive distributions of goals to be scored in each game. The R […]

**Statistical graphics** category.

## The NYT inadvertently demonstrates how not to make a graph

Andrew Hacker writes: I have the class prepare a report on how many households in the United States have telephones, land and cell. After studying census data, they focus on two: Connecticut and Arkansas, with respective ownerships of 98.9 percent and 94.6 percent. They are told they have to choose one of the following charts […]

## “Smaller Share of Women Ages 65 and Older Are Living Alone,” before and after age adjustment

After noticing this from a recent Pew Research report: Ben Hanowell wrote: This made me [Hanowell] think of your critique of Case and Deaton’s finding about non-Hispanic mortality. I wonder how much these results are driven by the fact that the population of adults aged 65 and older has gotten older with increasing lifespans, etc […]

## A Primer on Bayesian Multilevel Modeling using PyStan

Chris Fonnesbeck contributed our first PyStan case study (I wrote the abstract), in the form of a very nice Jupyter notebook. Daniel Lee and I had the pleasure of seeing him present it live as part of a course we were doing at Vanderbilt last week. A Primer on Bayesian Multilevel Modeling using PyStan This […]

## Who marries whom?

Elizabeth Heyman points us to this display by Adam Pearce and Dorothy Gambrell who write, “We scanned data from the U.S. Census Bureau’s 2014 American Community Survey—which covers 3.5 million households—to find out how people are pairing up.” They continue: For any selected occupation, the chart highlights the five most common occupation/relationship matchups. (For example, […]

## Ramanujan notes

A new movie on Ramanujan is coming out; mathematician Peter Woit gives it a very positive review, while film critic Anthony Lane is not so impressed. Both these reactions make sense, I guess (or so I say without having actually seen the movie myself). I’ll take this as an occasion to plug my article on […]

## All that really important statistics stuff that isn’t in the statistics textbooks

Kaiser writes: More on that work on age adjustment. I keep asking myself where is it in the Stats curriculum do we teach students this stuff? A class session focused on that analysis teaches students so much more about statistical thinking than anything we have in the textbooks. I’m not sure. This sort of analysis […]

## Birthday analysis—Friday the 13th update, and some model checking

Carl Bialik and Andrew Flowers at fivethirtyeight.com (Nate Silver’s site) ran a story following up on our birthdays example—that time series decomposition of births by day, which is on the cover of the third edition of Bayesian Data Analysis using data from 1968-1988, and which then Aki redid using a new dataset from 2000-2014. Friday […]

## Beautiful Graphs for Baseball Strike-Count Performance

This post is by Bob. I have no idea what Andrew will make of these graphs; I’ve been hoping to gather enough comments from him to code up a ggplot theme. Shravan, you can move along, there’s nothing here but baseball. Jim Albert created some great graphs for strike-count performance in a series of two […]

## Integrating graphs into your workflow

Discussion of statistical graphics typically focuses on individual graphs (for example here). But the real gain in your research comes from integrating graphs into your workflow. You want to be able to make the graphs you want, when you want them. At the same time, the graphs have to be good enough that you can […]

## Somebody’s reading our research.

See footnote 10 on page 5 of this GAO report. (The above graphs are just for age 45-54, which demonstrates an important thing about statistical graphics: They should be as self-contained as possible. Otherwise when the graph is separated from its caption, it requires additional words of explanation, as you are seeing here.)

## Numbers too good to be true? Or: Thanks, Obama!?

This post is by Phil. The “Affordable Care Act” a.k.a. “Obamacare” was passed in 2010, with its various pieces coming into play over the following few years. One of those pieces is penalties for hospitals that see high readmission rates. The theory here, or at least one of the theories here, was that hospitals could […]

## Thinking about this beautiful text sentiment visualizer yields a surprising insight about statistical graphics

Lucas Estevem set up this website in d3 as his final project in our statistical communication and graphics class this spring. Copy any text into the window, push the button, and you get this clean and attractive display showing the estimated positivity or negativity of each sentence. The length of each bar is some continuously-scaled […]

## Graphical Data Analysis with R

Graphical Data Analysis with R: that’s the title of Antony Unwin’s new book. Here are the chapter titles: Ch01 Setting the Scene Ch03 Examining continuous variables Ch04 Displaying Categorical Data Ch05 Looking for Structure Ch06 Investigating Multivariate Continuous Data Ch07 Studying Multivariate Categorical Data Ch08 Getting an Overview Ch09 Graphics and Data Quality Ch10 Comparisons […]

## Job opening . . . for a data graphics editor!

Larry Wheeler writes: I’m the managing editor at Health Affairs, a monthly peer-reviewed journal about health policy. We publish a lot of statistical graphics submitted with manuscripts from academic, industry, and government researchers. We have a job opening for a new position we’re calling “data graphics editor.” I’ve been having trouble attracting the right kind […]

## I wish Napoleon Bonaparte had never been born

Not just for all the usual good reasons why the world needs fewer mass murderers, but also for the very specific reason that, had there been no Napoleon, there’d be no Napoleon-in-Russia graph, then no shining example for Ed Tufte to illustrate how people should make their graphs, then maybe graph-makers wouldn’t all feel that […]

## Where the fat people at?

Pearly Dhingra points me to this article, “The Geographic Distribution of Obesity in the US and the Potential Regional Differences in Misreporting of Obesity,” by Anh Le, Suzanne Judd, David Allison, Reena Oza-Frank, Olivia Affuso, Monika Safford, Virginia Howard, and George Howard, who write: Data from BRFSS [the behavioral risk factor surveillance system] suggest that […]

## This graph is so ugly—and you’ll never guess where it appeared

Raghu Parthasarathy writes: I know you’re sick of seeing / being pointed to awful figures, but this one is an abomination of a sort I’ve never seen before: It’s a pie chart *and* a word cloud. In an actual research paper! Messy, illegible, and generally pointless. It’s Figure 1 of this paper (in Cell — […]

## One quick tip for building trust in missing-data imputations?

Peter Liberman writes: I’m working on a paper that, in the absence of a single survey that measured the required combination of variables, analyzes data collected by separate, uncoordinated Knowledge Networks surveys in 2003. My co-author (a social psychologist who commissioned one of the surveys) and I obtained from KN unique id numbers for all […]

## If you’re using Stata and you want to do Bayes, you should be using StataStan

Robert Grant, Daniel Furr, Bob Carpenter, and I write: Stata users have access to two easy-to-use implementations of Bayesian inference: Stata’s native bayesmh function and StataStan, which calls the general Bayesian engine Stan. We compare these on two models that are important for education research: the Rasch model and the hierarchical Rasch model. Stan (as […]