Archive of posts filed under the Statistical graphics category.

## Perhaps you could try a big scatterplot with one dot per dataset?

Joe Nadeau writes: We are studying variation in both means and variances in metabolic conditions. We have access to nearly 200 datasets that involve a range of metabolic traits and vary in sample size, mean effects, and variance. Some traits differ in mean but not variance, others in variance but not mean, still others in […]

## “Fudged statistics on the Iraq War death toll are still circulating today”

Mike Spagat shares this story entitled, “Fudged statistics on the Iraq War death toll are still circulating today,” which discusses problems with a paper published in a scientific journal in 2006, and errors that a reporter inadvertently included in a recent news article. Spagat writes: The Lancet could argue that if [Washington Post reporter Philip] […]

## How to graph a function of 4 variables using a grid

This came up in response to a student’s question. I wrote that, in general, you can plot a function y(x) on a simple graph. You can plot y(x,x2) by plotting y vs x and then having several lines showing different values of x2 (for example, x2=0, x2=0.5, x2=1, x2=1.5, x2=2, etc). You can plot y(x,x2,x3,x4) […]
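As a rough sketch of the grid idea in the title: give each combination of (x3, x4) its own panel, and within each panel draw y against x with one line per value of x2. The excerpt is cut off before the details, so the layout below is an assumption, and so are the toy function `y`, the helper `build_grid`, and the particular values of x3 and x4. The code only prepares the data for each panel, so any plotting library can draw it.

```python
from itertools import product


def y(x, x2, x3, x4):
    # Hypothetical function of 4 variables, chosen only for illustration.
    return x3 * x + x2 + x4 * x * x


def build_grid(f, xs, x2_values, x3_values, x4_values):
    """Return {(x3, x4): {x2: [(x, y), ...]}}.

    Each (x3, x4) key is one panel of the grid; each x2 key inside a
    panel is one line of y plotted against x.
    """
    grid = {}
    for x3, x4 in product(x3_values, x4_values):
        grid[(x3, x4)] = {
            x2: [(x, f(x, x2, x3, x4)) for x in xs] for x2 in x2_values
        }
    return grid


xs = [i / 10 for i in range(21)]   # x from 0.0 to 2.0
x2_values = [0, 0.5, 1, 1.5, 2]    # one line per x2 value, as in the post
x3_values = [-1, 0, 1]             # assumed: one row of panels per x3
x4_values = [0, 1]                 # assumed: one column of panels per x4

grid = build_grid(y, xs, x2_values, x3_values, x4_values)
```

Here `grid[(1, 0)][0.5]` holds the points for the x2=0.5 line in the panel where x3=1 and x4=0.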

## Don’t get fooled by observational correlations

Gabriel Power writes: Here’s something a little different: clever classrooms, according to which physical characteristics of classrooms cause greater learning. And the effects are large! Moving from the worst to the best design implies a gain of 67% of one year’s worth of learning! Aside from the dubiously large effect size, it looks like the […]

## Against Arianism 2: Arianism Grande

“There’s the part you’ve braced yourself against, and then there’s the other part” – The Mountain Goats My favourite genre of movie is Nicole Kidman in a questionable wig. (Part of the sub-genre founded by Sarah Paulson, who is the patron saint of obvious wigs.) And last night I was in the same room* as […]

## Who spends how much, and on what?

Nathan Yau (link from Dan Hirschman) constructed the above excellent visualization of data from the Consumer Expenditure Survey. Lots of interesting things here. The one thing that surprises me is that people (or maybe it’s households) making more than $200,000 only spent an average of $160,000. I guess the difference is taxes, savings (but not […]

## What’s gonna happen in the 2018 midterm elections?

Following up on yesterday’s post on party balancing, here’s a new article from Joe Bafumi, Bob Erikson, and Chris Wlezien giving their predictions for November: We forecast party control of the US House of Representatives after the 2018 midterm election. First, we model the expected national vote relying on available generic Congressional polls and the […]

## Awesome MCMC animation site by Chi Feng! On Github!

Sean Talts and Bob Carpenter pointed us to this awesome MCMC animation site by Chi Feng. For instance, here’s NUTS on a banana-shaped density. This is indeed super-cool, and maybe there’s a way to connect these with Stan/ShinyStan/Bayesplot so as to automatically make movies of Stan model fits. This would be great, both to help […]

## Should the points in this scatterplot be binned?

Someone writes: Care to comment on this paper’s Figure 4? I found it a bit misleading to do scatter plots after averaging over multiple individuals. Most scatter plots could be “improved” this way to make things look much cleaner than they are. People are already advertising the paper using this figure. The article, Genetic analysis […]
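To see why averaging can make a scatterplot look cleaner than the underlying data, here is a small simulation sketch. Everything in it is assumed for illustration: the simulated data, the noise level, the bin size of 100, and the helpers `pearson` and `bin_means`. It has nothing to do with the data in the paper under discussion.

```python
import random


def pearson(xs, ys):
    # Plain Pearson correlation, written out to avoid extra dependencies.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / (sxx * syy) ** 0.5


def bin_means(xs, ys, bin_size):
    # Sort individuals by x, then average x and y within consecutive groups.
    pairs = sorted(zip(xs, ys))
    bx, by = [], []
    for i in range(0, len(pairs), bin_size):
        chunk = pairs[i:i + bin_size]
        bx.append(sum(p[0] for p in chunk) / len(chunk))
        by.append(sum(p[1] for p in chunk) / len(chunk))
    return bx, by


random.seed(1)
x = [random.random() for _ in range(2000)]
y = [xi + random.gauss(0, 1) for xi in x]  # weak signal, lots of noise

raw_r = pearson(x, y)
bx, by = bin_means(x, y, bin_size=100)     # 20 averaged points
binned_r = pearson(bx, by)
print(f"raw r = {raw_r:.2f}, binned r = {binned_r:.2f}")
```

The averaged points line up far more tightly than the individual points do, even though the individual-level relationship is unchanged.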

## Opportunity for Comment!

(This is Dan) Last September, Jonah, Aki, Michael, Andrew and I wrote a paper on the role of visualization in the Bayesian workflow. This paper is going to be published as a discussion paper in the Journal of the Royal Statistical Society Series A and the associated read paper meeting (where we present the paper and […]

## “Choose the data visualization that best serves your audience.”

Tian Zheng prepared the above slide which very clearly displays an important point about statistical communication. The maps are squished to be too narrow, and the scatterplot has too many numbers on the axes (better to have income in thousands and percentages in tens), also given the numbers it seems that the data must be […]

## Awesome data visualization tool for brain research

When I was visiting the University of Washington the other day, Ariel Rokem showed me this cool data visualization and exploration tool produced by Jason Yeatman, Adam Richie-Halford, Josh Smith, and himself. The above image gives a sense of the dashboard but the real thing is much more impressive because it’s interactive. You can rotate […]

## The current state of the Stan ecosystem in R

(This post is by Jonah) Last week I posted here about the release of version 2.0.0 of the loo R package, but there have been a few other recent releases and updates worth mentioning. At the end of the post I also include some general thoughts on R package development with Stan and the growing number of […]

## Taking perspective on perspective taking

Gabor Simonovits writes: I thought you might be interested in this paper with Gabor Kezdi of U Michigan and Peter Kardos of Bloomfield College, about an online intervention reducing anti-Roma prejudice and far-right voting in Hungary through a role-playing game. The paper is similar to some existing social psychology studies on perspective taking but we […]

## “The problem of infra-marginality in outcome tests for discrimination”

Camelia Simoiu, Sam Corbett-Davies, and Sharad Goel write: Outcome tests are a popular method for detecting bias in lending, hiring, and policing decisions. These tests operate by comparing the success rate of decisions across groups. For example, if loans made to minority applicants are observed to be repaid more often than loans made to whites, […]
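The comparison described in the excerpt can be sketched in a few lines: among the decisions that were actually made (here, approved loans), compute the success rate within each group. The records, the group labels "A" and "B", and the helper `success_rates` are made up for illustration. This is only the basic outcome-test comparison, not the authors' method.

```python
from collections import defaultdict


def success_rates(records):
    """records: iterable of (group, approved, repaid) tuples.

    Returns {group: repayment rate among approved loans}.
    """
    approved = defaultdict(int)
    repaid = defaultdict(int)
    for group, was_approved, was_repaid in records:
        if was_approved:
            approved[group] += 1
            repaid[group] += int(was_repaid)
    return {g: repaid[g] / approved[g] for g in approved}


# Toy records, invented for illustration only.
records = [
    ("A", True, True), ("A", True, True), ("A", True, True),
    ("A", True, False), ("A", False, False),
    ("B", True, True), ("B", True, True), ("B", True, True),
    ("B", True, False), ("B", True, False),
]

rates = success_rates(records)
print(rates)
```

An outcome test then compares these rates across groups. Denied applications are excluded because their repayment outcome is never observed.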

## Wanna know what happened in 2016? We got a ton of graphs for you.

The paper’s called Voting patterns in 2016: Exploration using multilevel regression and poststratification (MRP) on pre-election polls, it’s by Rob Trangucci, Imad Ali, Doug Rivers, and myself, and here’s the abstract: We analyzed 2012 and 2016 YouGov pre-election polls in order to understand how different population groups voted in the 2012 and 2016 elections. We […]

## Here’s the title of my talk at the New York R conference, 20 Apr 2018:

The intersection of Graphics and Bayes, a slice of the Venn diagram that’s a lot more crowded than you might realize And here are some relevant papers: [2003] A Bayesian formulation of exploratory data analysis and goodness-of-fit testing. *International Statistical Review* **71**, 369–382. (Andrew Gelman) [2004] Exploratory data analysis for complex models (with […]

## Hey—here’s the title of my talk for this year’s New York R conference

Toward a Fuller Integration of Graphics in Statistical Analysis The talk will be 20 Apr 2018 at 1:25pm. And here are some things to read ahead of time, if you’re interested: [2003] A Bayesian formulation of exploratory data analysis and goodness-of-fit testing. *International Statistical Review* **71**, 369–382. [2004] Exploratory data analysis for complex […]

## Interactive visualizations of sampling and GP regression

You really don’t want to miss Chi Feng’s absolutely wonderful interactive demos. (1) Markov chain Monte Carlo sampling I believe this is exactly what Andrew was asking for a few Stan meetings ago: Chi Feng’s Interactive MCMC Sampling Visualizer This tool lets you explore a range of sampling algorithms including random-walk Metropolis, Hamiltonian Monte Carlo, […]

## How to improve this visualization of voting in the U.S. Congress?

Richie Lionell points us to this interactive visualization of votes of U.S. Senators. It’s attractive. My big problem is that nothing is conveyed by the positions of the points along the circles. Thus, that cute image of the points moving around is a bit misleading. Maybe someone has a suggestion of how to do this […]