Jonah Sinick posted a few things on the famous speed-dating dataset and writes: The main element that I seem to have been missing is principal component analysis of the different rating types. The basic situation is that the first PC is something that people are roughly equally responsive to, while people vary a lot with […]

## Social networks spread disease—but they also spread practices that reduce disease

I recently posted on the sister blog regarding a paper by Jon Zelner, James Trostle, Jason Goldstick, William Cevallos, James House, and Joseph Eisenberg, “Social Connectedness and Disease Transmission: Social Organization, Cohesion, Village Context, and Infection Risk in Rural Ecuador.” Zelner follows up: This made me think of my favorite figure from this paper, which […]

## Statistical analysis on a dataset that consists of a population

This is an oldie but a goodie. Donna Towns writes: I am wondering if you could help me solve an ongoing debate? My colleagues and I are discussing (disagreeing) on the ability of a researcher to analyze information on a population. My colleagues are sure that a researcher is unable to perform statistical analysis on […]

## Instead of worrying about multiple hypothesis correction, just fit a hierarchical model.

Pejman Mohammadi writes: I’m concerned with a problem in multiple hypothesis correction and, despite having read your article [with Jennifer and Masanao] on not being concerned about it, I was hoping I could seek your advice. Specifically, I’m interested in multiple hypothesis testing problem in cases when the test is done with a discrete finite […]

## Why I don’t use the terms “fixed” and “random” (again)

A couple months ago we discussed this question from Sean de Hoon: In many cross-national comparative studies, mixed effects models are being used in which a number of slopes are fixed and the slopes of one or two variables of interest are allowed to vary across countries. The aim is often then to explain the […]

## They don’t fit into their categories

I was reading the newspaper today and came across an article entitled, “Black Holes Inch Ahead to Violent Cosmic Union.” No big deal, except that it was in the National section. Not the International. This reminded me of other examples where items didn’t fit into their categories, for example when Slate magazine published an article, […]

## Using y.bar to predict y: What’s that all about??

Toon Kuppens writes: After a discussion on a multilevel modeling mailing list, I came across this one-year-old blog post written by you. You might be interested to know that in social psychology, taking the aggregate outcome variable to predict the outcome variable has been used as a test of ‘convergence’, the phenomenon that people’s responses […]
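To see what goes on mechanically when the aggregate outcome is used to predict the outcome, here is a toy simulation. It is only a sketch under stated assumptions: the data are simulated noise with no group effect at all, the helper `ols_slope` and every variable name are hypothetical, and each group mean includes the person's own response. Under those assumptions the least-squares slope of y on y.bar works out to 1 by construction, because the covariance of y with its group mean equals the variance of the group means.

```python
import random


def ols_slope(x, y):
    """Least-squares slope from regressing y on x (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(1)  # fixed seed so the run is reproducible
y, ybar = [], []
for _ in range(8):  # 8 hypothetical groups of unequal size
    size = rng.randint(3, 10)
    group = [rng.gauss(0.0, 1.0) for _ in range(size)]  # pure noise, no group effect
    group_mean = sum(group) / size
    y.extend(group)
    ybar.extend([group_mean] * size)  # each person gets their own group's mean

slope = ols_slope(ybar, y)
print(slope)  # equals 1 up to floating-point rounding
```

The slope stays at 1 whatever the seed or group sizes, so in this setup the coefficient by itself carries no information about whether responses within a group are converging.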

## My talk tomorrow (Thurs) at MIT political science: Recent challenges and developments in Bayesian modeling and computation (from a political and social science perspective)

It’s 1pm in room E53-482. I’ll talk about the usual stuff (and some of this too, I guess).

## VB-Stan: Black-box black-box variational Bayes

Alp Kucukelbir, Rajesh Ranganath, Dave Blei, and I write: We describe an automatic variational inference method for approximating the posterior of differentiable probability models. Automatic means that the statistician only needs to define a model; the method forms a variational approximation, computes gradients using automatic differentiation and approximates expectations via Monte Carlo integration. Stochastic gradient […]
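A rough illustration of the ingredients named in that abstract (a variational approximation, gradients, Monte Carlo expectations, stochastic gradient steps) on a one-parameter toy problem. This is a sketch under loud assumptions and not the paper's algorithm or Stan's implementation: the target density is a made-up normal(3, 0.5), its gradient is coded by hand where the real method would use automatic differentiation, and the function names, step-size schedule, and number of draws are all hypothetical choices.

```python
import math
import random


def grad_log_p(theta, mean=3.0, sd=0.5):
    """Hand-coded gradient of the toy log density (stand-in for autodiff)."""
    return -(theta - mean) / sd ** 2


def fit_gaussian_vi(grad_log_density, steps=3000, n_draws=20, seed=0):
    """Fit q(theta) = normal(mu, exp(omega)) by stochastic gradient ascent on the ELBO."""
    rng = random.Random(seed)
    mu, omega = 0.0, 0.0  # omega = log(sigma), so sigma stays positive
    for t in range(steps):
        sigma = math.exp(omega)
        g_mu = 0.0
        g_omega = 0.0
        for _ in range(n_draws):
            eps = rng.gauss(0.0, 1.0)
            theta = mu + sigma * eps  # draw from q, written as a function of (mu, omega)
            g = grad_log_density(theta)
            g_mu += g  # chain rule: d theta / d mu = 1
            g_omega += g * eps * sigma  # chain rule: d theta / d omega = sigma * eps
        g_mu /= n_draws  # Monte Carlo average over the draws
        g_omega = g_omega / n_draws + 1.0  # + 1 is the gradient of the normal entropy in omega
        step = 0.02 / (1.0 + t / 500.0)  # decreasing step size (arbitrary choice)
        mu += step * g_mu
        omega += step * g_omega
    return mu, math.exp(omega)


mu_hat, sigma_hat = fit_gaussian_vi(grad_log_p)
print(mu_hat, sigma_hat)  # should land near the toy target's mean 3 and sd 0.5
```

Because the toy target is itself normal, the best approximation in this family matches it exactly, which makes it easy to check that the stochastic updates are heading to the right place.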

## Stan Down Under

I (Bob, not Andrew) am in Australia until April 30. I’ll be giving some Stan-related and some data annotation talks, several of which have yet to be concretely scheduled. I’ll keep this page updated with what I’ll be up to. All of the talks other than summer school will be open to the public (the […]

## Six quick tips to improve your regression modeling

It’s Appendix A of ARM: A.1. Fit many models. Think of a series of models, starting with the too-simple and continuing through to the hopelessly messy. Generally it’s a good idea to start simple. Or start complex if you’d like, but prepare to quickly drop things out and move to the simpler model to help […]

## Crowdsourcing data analysis: Do soccer referees give more red cards to dark skin toned players?

Raphael Silberzahn, Eric Luis Uhlmann, Dan Martin, Pasquale Anselmi, Frederik Aust, Eli Christopher Awtrey, Štěpán Bahník, Feng Bai, Colin Bannard, Evelina Bonnier, Rickard Carlsson, Felix Cheung, Garret Christensen, Russ Clay, Maureen A. Craig, Anna Dalla Rosa, Lammertjan Dam, Mathew H. Evans, Ismael Flores Cervantes, Nathan Fong, Monica Gamez-Djokic, Andreas Glenz, Shauna Gordon-McKeon, Tim Heaton, Karin […]

## Planning my class for this semester: Thinking aloud about how to move toward active learning?

I’m teaching two classes this semester: – Design and Analysis of Sample Surveys (in the political science department, but the course has lots of statistics content); – Statistical Communication and Graphics (in the statistics department, but last time I taught it, many of the students were from other fields). I’ve taught both classes before. I […]

## Trajectories of Achievement Within Race/Ethnicity: “Catching Up” in Achievement Across Time

Just in time for Christmas, here’s some good news for kids, from Pamela Davis-Kean and Justin Jager: The achievement gap has long been the focus of educational research, policy, and intervention. The authors took a new approach to examining the achievement gap by examining achievement trajectories within each racial group. To identify these trajectories they […]

## The Use of Sampling Weights in Bayesian Hierarchical Models for Small Area Estimation

All this discussion of plagiarism is leaving a bad taste in my mouth (or, I guess I should say, a bad feeling in my fingers, given that I’m expressing all this on the keyboard) so I wanted to close off the workweek with something more interesting. I happened to come across the above-titled paper by […]

## Designing a study to see if “the 10x programmer” is a real thing

Lorin H. writes: One big question in the world of software engineering is: how much variation is there in productivity across programmers? (If you google for “10x programmer” you’ll see lots of hits). Let’s say I wanted to explore this research question with a simple study. Choose a set of participants at random from a […]

## A question about varying-intercept, varying-slope multilevel models for cross-national analysis

Sean de Hoon writes: In many cross-national comparative studies, mixed effects models are being used in which a number of slopes are fixed and the slopes of one or two variables of interest are allowed to vary across countries. The aim is often then to explain the varying slopes by referring to some country-level characteristic. […]

## Soil Scientists Seeking Super Model

I (Bob) spent last weekend at Biosphere 2, collaborating with soil carbon biogeochemists on a “super model.” Model combination and expansion: The biogeochemists (three sciences in one!) have developed hundreds of competing models and the goal of the workshop was to kick off some projects on putting some of them together into wholes that are […]

## In one of life’s horrible ironies, I wrote a paper “Why we (usually) don’t have to worry about multiple comparisons” but now I spend lots of time worrying about multiple comparisons

Exhibit A: [2012] Why we (usually) don’t have to worry about multiple comparisons. Journal of Research on Educational Effectiveness 5, 189-211. (Andrew Gelman, Jennifer Hill, and Masanao Yajima) Exhibit B: The garden of forking paths: Why multiple comparisons can be a problem, even when there is no “fishing expedition” or “p-hacking” and the research hypothesis […]

## Anova is great—if you interpret it as a way of structuring a model, not if you focus on F tests

Shravan Vasishth writes: I saw on your blog post that you listed aggregation as one of the desirable things to do. Do you agree with the following argument? I want to point out a problem with repeated measures ANOVA in talk: In a planned experiment, say a 2×2 design, when we do a repeated measures […]