Brendan Nyhan writes: I’d love to see you put some data in here that you know well and evaluate how the site handles it. The webpage in question says: Upload a data set, and the automatic statistician will attempt to describe the final column of your data in terms of the rest of the data. […]

**Miscellaneous Statistics** category.

## My talk today at the University of Michigan, 4pm at the Institute for Social Research

Generalizing from sample to population. Andrew Gelman, Department of Statistics, Columbia University. We’ve been hearing a lot about “data” recently, but data are generally a means to an end, with the goal being to learn about some population of interest. How do we generalize from sample to population? The process seems a bit mysterious, especially […]

## Was it really necessary to do a voting experiment on 300,000 people? Maybe 299,999 would’ve been enough? Or 299,998? Or maybe 2000?

There’s been some discussion recently about an experiment done in Montana, New Hampshire, and California, conducted by three young political science professors, in which letters were sent to 300,000 people, in order to (possibly) affect their voting behavior. It appears that the plan was to follow up after the elections and track voter turnout. (Some […]

## Solution to the sample-allocation problem

See this recent post for background. Here’s the question: You are designing an experiment where you are estimating a linear dose-response pattern with a dose x that can take on the values 1, 2, 3, and the response is continuous. Suppose that there is no systematic error and that the measurement variance is proportional to x. You […]

## Solution to the problem on the distribution of p-values

See this recent post for background. Here’s the question: It is sometimes said that the p-value is uniformly distributed if the null hypothesis is true. Give two different reasons why this statement is not in general true. The problem is with real examples, not just toy examples, so your reasons should not involve degenerate situations such as […]
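One commonly cited reason, which may or may not be among those the full post gives, is discreteness: with discrete data the p-value can take only a few values, so it cannot be exactly uniform. Here is a minimal sketch of that point, assuming a one-sided exact binomial test of p0 = 0.5 with n = 10 trials. The function names and the choice of test are illustrative assumptions, not taken from the post.

```python
from math import comb


def null_pvalue_distribution(n, p0=0.5):
    """Exact null distribution of the one-sided binomial p-value.

    Returns a list of (p_value, probability) pairs, one per possible
    count k = 0..n, where p_value = P(X >= k) and X ~ Binomial(n, p0).
    This setup is an illustrative assumption, not the post's example.
    """
    pmf = [comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(n + 1)]
    # Upper-tail p-value for each possible observed count k.
    pvals = [sum(pmf[k:]) for k in range(n + 1)]
    return list(zip(pvals, pmf))


def prob_pvalue_at_most(alpha, n, p0=0.5):
    """P(p-value <= alpha) when the null hypothesis is true."""
    return sum(prob for pval, prob in null_pvalue_distribution(n, p0)
               if pval <= alpha)


if __name__ == "__main__":
    # Under an exactly uniform p-value, each printed probability
    # would equal alpha itself.
    for alpha in (0.01, 0.05, 0.10, 0.50):
        print(alpha, prob_pvalue_at_most(alpha, 10))
```

With n = 10, the only outcomes with p-value at or below 0.05 are 9 or 10 successes, so the probability of p ≤ 0.05 under the null is 11/1024, about 0.011 rather than 0.05.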

## Solution to the helicopter design problem

See yesterday’s post for background. Here’s the question: In the helicopter activity, pairs of students design paper “helicopters” and compete to create the copter that takes longest to reach the ground when dropped from a fixed height. The two parameters of the helicopter, a and b, correspond to the length of certain cuts in the […]

## Some questions from our Ph.D. statistics qualifying exam

In the in-class applied statistics qualifying exam, students had 4 hours to do 6 problems. Here were the 3 problems I submitted: In the helicopter activity, pairs of students design paper “helicopters” and compete to create the copter that takes longest to reach the ground when dropped from a fixed height. The two parameters of the […]

## Hoe noem je?

Haynes Goddard writes: Reviewing my notes and books on categorical data analysis, the term “nominal” is widely employed to refer to variables without any natural ordering. I was a language major in UG school and knew that the etymology of nominal is the Latin word nomen (from the Online Etymological Dictionary: early 15c., “pertaining to […]

## The Fault in Our Stars: It’s even worse than they say

In our recent discussion of publication bias, a commenter linked to a recent paper, “Star Wars: The Empirics Strike Back,” by Abel Brodeur, Mathias Le, Marc Sangnier, Yanos Zylberberg, who point to the notorious overrepresentation in scientific publications of p-values that are just below 0.05 (that is, just barely statistically significant at the conventional level) […]

## I didn’t say that! Part 2

Uh oh, this is getting kinda embarrassing. The Garden of Forking Paths paper, by Eric Loken and myself, just appeared in American Scientist. Here’s our manuscript version (“The garden of forking paths: Why multiple comparisons can be a problem, even when there is no ‘fishing expedition’ or ‘p-hacking’ and the research hypothesis was posited ahead […]