In response to my remarks on his online book, Think Bayes, Allen Downey wrote:
I [Downey] have a question about one of your comments:
My [Gelman's] main criticism with both books is that they talk a lot about inference but not so much about model building or model checking (recall the three steps of Bayesian data analysis). I think it’s ok for an introductory book to focus on inference, which of course is central to the data-analytic process—but I’d like them to at least mention that Bayesian ideas arise in model building and model checking as well.
This sounds like something I agree with, and one of the things I tried to do in the book is to put modeling decisions front and center. But the word “modeling” is used in lots of ways, so I want to see if we are talking about the same thing.
For example, in many chapters, I start with a simple model of the scenario, do some analysis, then check whether the model is good enough, and iterate. Here’s the discussion of modeling in the preface:
Most chapters in this book are motivated by a real-world problem, so most chapters involve some degree of modeling. Before we can apply Bayesian methods (or any other analysis), we have to make decisions about which parts of the real-world system we have to include in the model, and which details we can abstract away.
For example, in Chapter 7, the motivating problem is to predict the winner of a hockey game. I model goal-scoring as a Poisson process, which implies that a goal is equally likely at any point in the game. That is not exactly true, but it is probably a good enough model for most purposes.
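To make the Poisson assumption concrete, here is a minimal sketch (not Downey's code; the goals-per-game rates are made-up numbers) of how such a model can be used to estimate the probability that one team wins:

```python
import math
import random

def poisson_sample(rng, lam):
    """Draw a Poisson variate via Knuth's algorithm (fine for small lam)."""
    L = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def simulate_game(lam_home, lam_away, n_sims=10_000, seed=1):
    """Estimate the probability that the home team wins, assuming each
    team's goals follow an independent Poisson process, so the goal
    count in a full game is Poisson with mean equal to the team's rate.
    The rates here are hypothetical inputs, not fitted values.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sims):
        home = poisson_sample(rng, lam_home)
        away = poisson_sample(rng, lam_away)
        if home > away:
            wins += 1
    return wins / n_sims
```

For instance, `simulate_game(3.0, 2.0)` gives the (simulated) chance that a team scoring 3 goals per game on average beats one scoring 2; ties would be handled separately (overtime, shootout) in a fuller model.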
And here’s an example from Chapter 12:
To answer that question, we need to make some modeling decisions. I’ll start with a simplification I know is wrong; then we’ll come back and improve the model. I pretend, temporarily, that all SAT questions are equally difficult. Actually, the designers of the SAT choose questions with a range of difficulty, because that improves the ability to measure statistical differences between test-takers.
But if we choose a model where all questions are equally difficult, we can define a characteristic, p_correct, for each test-taker, which is the probability of answering any question correctly. This simplification makes it easy to compute the likelihood of a given score.
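Under that simplification the likelihood really is easy to compute: if every question has the same probability of being answered correctly, the number of correct answers is binomial. A short sketch (illustrative only; the function name is mine, not the book's):

```python
from math import comb

def likelihood(score, num_questions, p_correct):
    """Probability of a given raw score under the equal-difficulty
    assumption: the score is binomial(num_questions, p_correct)."""
    return (comb(num_questions, score)
            * p_correct**score
            * (1 - p_correct)**(num_questions - score))
```

With this likelihood in hand, the Bayesian update for `p_correct` is a one-liner over a grid of candidate values.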
Is this the kind of model building and model checking you are talking about?
Yes, this is the sort of model building I was talking about. But when I was talking about model checking, I was going a step further. It seems to me that what you are proposing (and I agree with this 100%) is that when you’re planning on fitting a model, you build up to it by fitting simpler models first. Even though you know the earlier models are oversimplifications, they help you understand what you’re doing. I think this is an important point that I have underemphasized in my books.
But what I’m saying is that, once you get to the serious model that you like, you then test it by using the model to make lots of predictions (within-sample as well as out-of-sample) and seeing if the predictions look like the data. I’m not talking here about error rates but rather about graphical checks to see if the model can reproduce the look of the data. See chapter 6 of Bayesian Data Analysis for many examples.
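The logic of such a check can be sketched in a few lines: draw parameter values from the posterior, simulate replicated datasets of the same size as the observed data, and see where the observed data fall among the replicates. This toy coin-flip version (with made-up posterior draws, not an example from either book) shows the shape of the computation; in practice you would plot the replicates rather than just return them:

```python
import random

def posterior_predictive_check(data, posterior_draws, simulate, stat, seed=0):
    """For each posterior draw, simulate a replicated dataset the same
    size as the observed data and compute a summary statistic.
    Returns the observed statistic and the list of replicated ones."""
    rng = random.Random(seed)
    rep_stats = [stat(simulate(theta, len(data), rng))
                 for theta in posterior_draws]
    return stat(data), rep_stats

def simulate_flips(p, n, rng):
    """Simulate n coin flips with heads probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

data = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]   # 7 heads in 10 flips
draws = [0.6, 0.65, 0.7, 0.75, 0.8]     # stand-in posterior draws
obs, reps = posterior_predictive_check(data, draws, simulate_flips, sum)
```

If the observed statistic sits comfortably inside the spread of the replicated statistics, the model can reproduce that aspect of the data; if it sits in the tails, the graph shows you what the model is missing.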
He then wrote:
I reviewed Chapter 6 of your book, and I now have a good idea of what you mean by model checking. For some of these methods, I have similar examples in my book, but I didn’t pull the methods into a single chapter, as you did.
One example: in Chapter 12 I had to make some guesses about the distribution of difficulty for SAT questions. I don’t have any direct measurements of difficulty, so I use a model based on item response theory to generate simulated test scores, then compare to the actual distribution of scores.
The figure is here:
The data and the simulated data agree pretty well, but the residuals are not independent. I suspect there is a better model that would capture a functional form I am missing, but I concluded that the simple model is good enough for the intended purpose.
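For readers who want to see what such a simulation looks like, here is a hedged sketch of a Rasch-style (one-parameter) item response model. The ability and difficulty distributions (both standard normal) are placeholder assumptions for illustration, not Downey's actual choices:

```python
import math
import random

def prob_correct(ability, difficulty):
    """Rasch (1PL) model: the chance of a correct answer depends on the
    difference between test-taker ability and item difficulty."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def simulate_scores(n_takers, difficulties, seed=0):
    """Simulate raw scores for n_takers, with abilities drawn from a
    standard normal (an assumed distribution, for illustration)."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_takers):
        ability = rng.gauss(0, 1)
        score = sum(1 for d in difficulties
                    if rng.random() < prob_correct(ability, d))
        scores.append(score)
    return scores

# Guessed item difficulties, also standard normal (an assumption).
diff_rng = random.Random(42)
difficulties = [diff_rng.gauss(0, 1) for _ in range(20)]
scores = simulate_scores(1000, difficulties)
```

Comparing the histogram of `scores` against the actual score distribution is exactly the kind of graphical check described above: close agreement suggests the guessed difficulty distribution is adequate, while structured residuals (as Downey notes) point to a functional form the model is missing.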