Archive of posts filed under the Teaching category.

There’s nothing embarrassing about self-citation

Someone sent me an email writing that one of my papers “has an embarrassing amount of self-citation.” I’m sorry that this person is embarrassed on my behalf. I’m not embarrassed at all. If I wrote something in the past that’s relevant, it makes sense to cite it rather than repeating myself, no? A citation is […]

Learn by experimenting!

A student wrote in one of his homework assignments: Sidenote: I know some people say you’re not supposed to use loops in R, but I’ve never been totally sure why this is (a speed thing?). My first computer language was Java, so my inclination is to think in loops before using some of the other […]

Mitzi’s talk on spatial models in Ann Arbor, Thursday 5 April 2018

Mitzi returns to her alma mater to give a talk at a joint meeting of the Ann Arbor useR and ASA Meetups: Spatial models in Stan Abstract This case study shows how to efficiently encode and compute an intrinsic conditional autoregressive (ICAR) model in Stan. When data has a neighborhood structure, ICAR models provide spatial smoothing […]

Classical hypothesis testing is really really hard

This one surprised me. I included the following question in an exam: In causal inference, it is often important to study varying treatment effects: for example, a treatment could be more effective for men than for women, or for healthy than for unhealthy patients. Suppose a study is designed to have 80% power to detect […]

What to teach in a statistics course for journalists?

Pascal Biber writes: I am a science journalist for Swiss public television and have previously regularly covered the “crisis in science” on Swiss public radio, including things like p-hacking, relative risks, confidence intervals, reproducibility etc. I have been giving courses in basic statistics and how to read scientific studies for Swiss journalists without science backgrounds. […]

“revision-female-named-hurricanes-are-most-likely-not-deadlier-than-male-hurricanes”

Gary Smith sends along this news article from Jason Samenow, weather editor of the Washington Post, who writes: Three years ago, a scientific study claimed that storms named Debby are predisposed to kill more people than storms named Don. The study alleged that people don’t take female-named storms as seriously. Numerous analyses have since found […]

New Stan case studies: NNGP and Lotka-Volterra

It’s only January and we already have two new case studies up on the Stan site. Two new case studies Lu Zhang of UCLA contributed a case study on nearest neighbor Gaussian processes. Bob Carpenter (that’s me!) of Columbia Uni contributed one on Lotka-Volterra population dynamics. Mitzi Morris of Columbia Uni has been updating her […]

“The following needs to be an immutable law of journalism: when someone with no track record comes into a field claiming to be able to do a job many times better for a fraction of the cost, the burden of proof needs to shift quickly and decisively onto the one making the claim. The reporter simply has to assume the claim is false until substantial evidence is presented to the contrary.”

Mark Palko writes: The following needs to be an immutable law of journalism: when someone with no track record comes into a field claiming to be able to do a job many times better for a fraction of the cost, the burden of proof needs to shift quickly and decisively onto the one making the […]

Alzheimer’s Mouse research on the Orient Express

Paul Alper sends along an article from Joy Victory at Health News Review, shooting down a bunch of newspaper headlines (“Extra virgin olive oil staves off Alzheimer’s, preserves memory, new study shows” from USA Today, the only marginally better “Can extra-virgin olive oil preserve memory and prevent Alzheimer’s?” from the Atlanta Journal-Constitution, and the better […]

Forking paths plus lack of theory = No reason to believe any of this.

[image of a cat with a fork] Kevin Lewis points us to this paper which begins: We use a regression discontinuity design to estimate the causal effect of election to political office on natural lifespan. In contrast to previous findings of shortened lifespan among US presidents and other heads of state, we find that US […]

Stupid-ass statisticians don’t know what a goddam confidence interval is

From page 20 in a well-known applied statistics textbook: The hypothesis of whether a parameter is positive is directly assessed via its confidence interval. If both ends of the 95% confidence interval exceed zero, then we are at least 95% sure (under the assumptions of the model) that the parameter is positive. Huh? Who says […]

Interactive visualizations of sampling and GP regression

You really don’t want to miss Chi Feng’s absolutely wonderful interactive demos. (1) Markov chain Monte Carlo sampling I believe this is exactly what Andrew was asking for a few Stan meetings ago: Chi Feng’s Interactive MCMC Sampling Visualizer This tool lets you explore a range of sampling algorithms including random-walk Metropolis, Hamiltonian Monte Carlo, […]

Stan is a probabilistic programming language

See here: Stan: A Probabilistic Programming Language. Journal of Statistical Software. (Bob Carpenter, Andrew Gelman, Matthew D. Hoffman, Daniel Lee, Ben Goodrich, Michael Betancourt, Marcus Brubaker, Jiqiang Guo, Peter Li, Allen Riddell) And here: Stan is Turing Complete. So what? (Bob Carpenter) And, the pre-Stan version: Fully Bayesian computing. (Jouni Kerman and Andrew Gelman) Apparently […]

Tips when conveying your research to policymakers and the news media

Following up on a conversation regarding publicizing scientific research, Jim Savage wrote: Here’s a report that we produced a few years ago on prioritising potential policy levers to address the structural budget deficit in Australia. In the report we hid all the statistical analysis, aiming at an audience that would feel comfortable reading a broadsheet […]

My talk tomorrow (Fri) 10am at Columbia

I’m speaking for the statistics undergraduates tomorrow (Fri 17 Nov) 10am in room 312 Mathematics Bldg. I’m not quite sure what I’ll talk about: maybe I’ll do again my talk on statistics and sports, maybe I’ll speak on the statistical crisis in science. Anyone can come; especially we’d like to attract undergraduates—not just statistics majors—to […]

Looking for data on speed and traffic accidents—and other examples of data that can be fit by nonlinear models

[cat picture] For the chapter in Regression and Other Stories that includes nonlinear regression, I’d like a couple homework problems where the kids have to construct and fit models to real data. So I need some examples. We already have the success of golf putts as a function of distance from the hole, and I’d […]

Advice for science writers!

I spoke today at a meeting of science journalists, in a session organized by Betsy Mason, also featuring Kristin Sainani, Christie Aschwanden, and Tom Siegfried. My talk was on statistical paradoxes of science and science journalism, and I mentioned the Ted Talk paradox, Who watches the watchmen, the Eureka bias, the “What does not kill […]

My favorite definition of statistical significance

From my 2009 paper with Weakliem: Throughout, we use the term statistically significant in the conventional way, to mean that an estimate is at least two standard errors away from some “null hypothesis” or prespecified value that would indicate no effect present. An estimate is statistically insignificant if the observed value could reasonably be explained […]

Why I think the top batting average will be higher than .311: Over-pooling of point predictions in Bayesian inference

In a post from 22 May 2017 entitled, “Who is Going to Win the Batting Crown?”, Jim Albert writes: At this point in the season, folks are interested in extreme stats and want to predict final season measures. On the morning of Saturday May 20, here are the leading batting averages: Justin Turner .379 Ryan […]

Stan case studies

Following up on recent posts here and here, I thought I’d post a list of all the Stan case studies we have so far. 2017: Modeling Loss Curves in Insurance with RStan, by Mick Cooney Splines in Stan, by Milad Kharratzadeh Spatial Models in Stan: Intrinsic Auto-Regressive Models for Areal Data, by Mitzi Morris The […]