In response to my remarks on his online book, Think Bayes, Allen Downey wrote:
I [Downey] have a question about one of your comments:
My [Gelman's] main criticism with both books is that they talk a lot about inference but not so much about model building or model checking (recall the three steps of Bayesian data analysis). I think it’s ok for an introductory book to focus on inference, which of course is central to the data-analytic process—but I’d like them to at least mention that Bayesian ideas arise in model building and model checking as well.
This sounds like something I agree with, and one of the things I tried to do in the book is to put modeling decisions front and center. But the word “modeling” is used in lots of ways, so I want to see if we are talking about the same thing.
For example, in many chapters, I start with a simple model of the scenario, do some analysis, then check whether the model is good enough, and iterate. Here’s the discussion of modeling in the preface:
Most chapters in this book are motivated by a real-world problem, so most chapters involve some degree of modeling. Before we can apply Bayesian methods (or any other analysis), we have to make decisions about which parts of the real-world system we have to include in the model, and which details we can abstract away.
For example, in Chapter 7, the motivating problem is to predict the winner of a hockey game. I model goal-scoring as a Poisson process, which implies that a goal is equally likely at any point in the game. That is not exactly true, but it is probably a good enough model for most purposes.
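To make the Poisson assumption concrete, here is a minimal sketch (not Downey's code; the goals-per-game rates are made-up numbers) of how such a model can be used to estimate the probability that one team wins:

```python
import math
import random

def poisson_sample(rng, lam):
    """Draw a Poisson variate via Knuth's algorithm (fine for small lam)."""
    L = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def simulate_game(lam_home, lam_away, n_sims=10_000, seed=1):
    """Estimate the probability that the home team wins, assuming each
    team's goals follow an independent Poisson process, so the goal
    count in a full game is Poisson with mean equal to the team's rate.
    The rates here are hypothetical inputs, not fitted values.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sims):
        home = poisson_sample(rng, lam_home)
        away = poisson_sample(rng, lam_away)
        if home > away:
            wins += 1
    return wins / n_sims
```

For instance, `simulate_game(3.0, 2.0)` gives the (simulated) chance that a team scoring 3 goals per game on average beats one scoring 2; ties would be handled separately (overtime, shootout) in a fuller model.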
And here’s an example from Chapter 12:
To answer that question, we need to make some modeling decisions. I’ll start with a simplification I know is wrong; then we’ll come back and improve the model. I pretend, temporarily, that all SAT questions are equally difficult. Actually, the designers of the SAT choose questions with a range of difficulty, because that improves the ability to measure statistical differences between test-takers.
But if we choose a model where all questions are equally difficult, we can define a characteristic, p_correct, for each test-taker, which is the probability of answering any question correctly. This simplification makes it easy to compute the likelihood of a given score.
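Under that simplification the likelihood really is easy to compute: if every question has the same probability of being answered correctly, the number of correct answers is binomial. A short sketch (illustrative only; the function name is mine, not the book's):

```python
from math import comb

def likelihood(score, num_questions, p_correct):
    """Probability of a given raw score under the equal-difficulty
    assumption: the score is binomial(num_questions, p_correct)."""
    return (comb(num_questions, score)
            * p_correct**score
            * (1 - p_correct)**(num_questions - score))
```

With this likelihood in hand, the Bayesian update for `p_correct` is a one-liner over a grid of candidate values.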
Is this the kind of model building and model checking you are talking about?
Yes, this is the sort of model building I was talking about. But when I was talking about model checking, I was going a step further. It seems to me that what you are proposing (and I agree with this 100%) is that when you’re planning on fitting a model, you build up to it by fitting simpler models first. Even though you know the earlier models are oversimplifications, they help you understand what you’re doing. I think this is an important point that I have underemphasized in my books.
But what I’m saying is that, once you get to the serious model that you like, you then test it by using the model to make lots of predictions (within-sample as well as out-of-sample) and seeing if the predictions look like the data. I’m not talking here about error rates but rather about graphical checks to see if the model can reproduce the look of the data. See chapter 6 of Bayesian Data Analysis for many examples.
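The logic of such a check can be sketched in a few lines: draw parameter values from the posterior, simulate replicated datasets of the same size as the observed data, and see where the observed data fall among the replicates. This toy coin-flip version (with made-up posterior draws, not an example from either book) shows the shape of the computation; in practice you would plot the replicates rather than just return them:

```python
import random

def posterior_predictive_check(data, posterior_draws, simulate, stat, seed=0):
    """For each posterior draw, simulate a replicated dataset the same
    size as the observed data and compute a summary statistic.
    Returns the observed statistic and the list of replicated ones."""
    rng = random.Random(seed)
    rep_stats = [stat(simulate(theta, len(data), rng))
                 for theta in posterior_draws]
    return stat(data), rep_stats

def simulate_flips(p, n, rng):
    """Simulate n coin flips with heads probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

data = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]   # 7 heads in 10 flips
draws = [0.6, 0.65, 0.7, 0.75, 0.8]     # stand-in posterior draws
obs, reps = posterior_predictive_check(data, draws, simulate_flips, sum)
```

If the observed statistic sits comfortably inside the spread of the replicated statistics, the model can reproduce that aspect of the data; if it sits in the tails, the graph shows you what the model is missing.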
He then wrote:
I reviewed Chapter 6 of your book, and I now have a good idea of what you mean by model checking. For some of these methods, I have similar examples in my book, but I didn’t pull the methods into a single chapter, as you did.
One example: in Chapter 12 I had to make some guesses about the distribution of difficulty for SAT questions. I don’t have any direct measurements of difficulty, so I use a model based on item response theory to generate simulated test scores, then compare to the actual distribution of scores.
The figure is here:
The data and the simulated data agree pretty well, but the residuals are not independent. I suspect there is a better model that would capture a functional form I am missing, but I concluded that the simple model is good enough for the intended purpose.
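For readers who want to see what such a simulation looks like, here is a hedged sketch of a Rasch-style (one-parameter) item response model. The ability and difficulty distributions (both standard normal) are placeholder assumptions for illustration, not Downey's actual choices:

```python
import math
import random

def prob_correct(ability, difficulty):
    """Rasch (1PL) model: the chance of a correct answer depends on the
    difference between test-taker ability and item difficulty."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def simulate_scores(n_takers, difficulties, seed=0):
    """Simulate raw scores for n_takers, with abilities drawn from a
    standard normal (an assumed distribution, for illustration)."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_takers):
        ability = rng.gauss(0, 1)
        score = sum(1 for d in difficulties
                    if rng.random() < prob_correct(ability, d))
        scores.append(score)
    return scores

# Guessed item difficulties, also standard normal (an assumption).
diff_rng = random.Random(42)
difficulties = [diff_rng.gauss(0, 1) for _ in range(20)]
scores = simulate_scores(1000, difficulties)
```

Comparing the histogram of `scores` against the actual score distribution is exactly the kind of graphical check described above: close agreement suggests the guessed difficulty distribution is adequate, while structured residuals (as Downey notes) point to a functional form the model is missing.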