Kaiser Fung on how not to critique models

In the context of a debate between economists Brad DeLong and Tyler Cowen on the “IS-LM model” [no, I don’t know what it is, either!], Kaiser writes:

Since a model is an abstraction, a simplification of reality, no model is above critique.

I [Kaiser] consider the following types of critique not deserving:

1) The critique that the modeler makes an assumption
2) The critique that the modeler makes an assumption for mathematical convenience
3) The critique that the model omits some feature
4) The critique that the model doesn’t fit one’s intuition
5) The critique that the model fails to make a specific prediction

Above all, a serious critique must include an alternative model that is provably better than the one it criticises. It is not enough to show that the alternative solves the problems being pointed out; the alternative must do so while preserving the useful aspects of the model being criticized.

I have mixed feelings about Kaiser’s rules. On one hand, I agree with his point that a model is a practical tool and that an imperfection is no reason to abandon a useful model. On the other hand, I think that much can be learned from rejection of a model, even without reference to any alternative.

Let me put it this way: That a model makes assumptions, even that a model makes wrong assumptions, is not news. If “wrong” is enough to kill, then all our models are dead on arrival anyway. But it’s good to understand the ways in which a model disagrees with the data at hand, or with other aspects of reality. As Kuhn and Lakatos knew, highlighting, isolating, and exploring anomalies are crucial steps in moving toward improvement—even if no alternative model is currently in the picture.

22 thoughts on “Kaiser Fung on how not to critique models”

  1. On “assumptions”: do you have a definition of what constitutes an assumption? While this may seem obvious at first, I tried to formalize my own thinking on the subject recently and found it difficult. Is everything in a model an assumption? What about implications that derive from the assumptions? Where is the line between a model and a hypothesis (in other words, if we reject a hypothesis as unlikely given our data, are we also rejecting the model)?

  2. Agreed. It always bothers me when someone says criticism of a model/hypothesis must be accompanied by an alternative model/hypothesis. One of the most famous model-rejecting experiments ever done—Michelson-Morley—couldn’t suggest an alternative because they just didn’t know enough to propose one. No one did. But I think it’s fair to say that the publication was a worthwhile one.

  3. What do you think about a requirement that the modeler at least check the assumptions of the model in some way? It seems obvious to me that a model should at least fit or predict aspects of the data that we care about, and that it is the responsibility of the modeler to show that this is the case. Yet, in some disciplines, this is a controversial statement. I have heard more than once (including from some reviewers and reviewees) that model assessment (e.g., posterior predictive checks, holdout validation, etc) is not necessary, as long as the assumptions are grounded in (economic) theory. Do you have suggestions for an effective argument against that position?
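For readers unfamiliar with the kind of model assessment mentioned in the comment above, here is a minimal sketch of a predictive check. It is a simplified plug-in version, not a full posterior predictive check: it simulates replicated datasets from a normal model fitted by point estimates, with no prior and no posterior draws. The toy data, the test statistic, and the function names are all assumptions made for illustration.

```python
import math
import random
import statistics


def max_abs_z(values):
    """Test statistic: the largest absolute deviation from the mean, in sample SDs."""
    center = statistics.mean(values)
    spread = statistics.stdev(values)
    return max(abs(v - center) for v in values) / spread


def plugin_predictive_pvalue(data, n_reps=1000, seed=0):
    """Fraction of replicated datasets whose statistic is at least the observed one.

    Replicates are drawn from Normal(mean, sd) using plug-in point estimates,
    which is a simplification of a full posterior predictive check.
    """
    rng = random.Random(seed)
    mu = statistics.mean(data)
    sigma = statistics.stdev(data)
    observed = max_abs_z(data)
    exceed = 0
    for _ in range(n_reps):
        replicate = [rng.gauss(mu, sigma) for _ in data]
        if max_abs_z(replicate) >= observed:
            exceed += 1
    return exceed / n_reps


# Hypothetical data: 49 well-behaved values plus one gross outlier.
toy_data = [math.sin(i) for i in range(1, 50)] + [10.0]
p_value = plugin_predictive_pvalue(toy_data)
print(f"Predictive p-value for the max |z| statistic: {p_value:.3f}")
```

A p-value near zero here says the normal model almost never reproduces an extreme value like the one in the data. That is a statement about fit to an aspect of the data we care about, and it stands whether or not the normality assumption was grounded in theory.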

  4. I certainly have some concerns about these “rules”. I don’t see why it should be necessary to present an alternative model in order to reject a model’s validity. The hard truth is that there are some things in this world that just cannot be modelled accurately, at least with the mathematical tools currently at our disposal.

    I’ve just finished reviewing a paper involving several instances of assumptions clearly made for no other reason than mathematical convenience. I don’t see how that isn’t a valid critique. This researcher used a (legitimate) lack of empirical research to run wild with his assumptions, even making ones that fly in the face of what little quantitative literature exists on the subject. A model is a simplification of reality, but a model is only useful insofar as it is still generally representative of reality. I believe that if the tools exist to avoid making a simplifying assumption, then it should be avoided.

    • It’s perfectly justifiable to make a simplifying assumption even if the mathematical or computational tools exist to avoid it. The real issue is whether that simplifying assumption breaks the validity of the model. A simplified zero dimensional model of the temperature of the earth which predicts something that is observed shouldn’t be ignored just because it doesn’t have a 3D fluid flow structure and take into account the circulation of the winds and the moon tides.
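To make the zero-dimensional example in the reply above concrete, here is a minimal sketch of the textbook energy-balance calculation, in which absorbed sunlight is set equal to emitted thermal radiation. The solar constant and albedo are commonly quoted round values, used here as assumptions, and the function name is hypothetical.

```python
# Zero-dimensional energy balance: absorbed sunlight = emitted thermal radiation.
# S * (1 - albedo) / 4 = sigma * T**4, solved for T.

STEFAN_BOLTZMANN = 5.670374419e-8  # W m^-2 K^-4


def equilibrium_temperature(solar_constant=1361.0, albedo=0.3):
    """Effective emission temperature in kelvin for a planet with no spatial structure.

    solar_constant is in W m^-2; albedo is the fraction of sunlight reflected.
    Both defaults are assumed, commonly quoted values for Earth.
    """
    # Divide by 4 because sunlight is intercepted by a disc but emitted from a sphere.
    absorbed = solar_constant * (1.0 - albedo) / 4.0
    return (absorbed / STEFAN_BOLTZMANN) ** 0.25


effective_temp_k = equilibrium_temperature()
print(f"Effective emission temperature: {effective_temp_k:.1f} K")
```

With these inputs the result is roughly 255 K. The model has no winds, tides, or fluid flow, and its single output can still be compared against observation, which is the commenter's point.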

    • For example, I can complain that a lot of Bayesian models are invalid because choosing conjugate priors is an assumption based on mathematical convenience. Does that invalidate the entire class of models?

      As for your first point, most modelers I think would not accept the assertion that there are things that cannot be modeled accurately. We’d say there are better models and worse models. If my model has inadequacies but mine is still the best among bad models out there, I should think my model is still valid.
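As an illustration of what "mathematical convenience" buys in the conjugate-prior example above, here is a minimal sketch of the Beta-Binomial case, where the posterior is obtained by simple addition. The prior parameters and data are hypothetical, and the function names are made up for this sketch.

```python
def beta_binomial_update(prior_a, prior_b, successes, trials):
    """Posterior Beta parameters after observing binomial data under a Beta prior.

    Conjugacy means the posterior is again a Beta distribution:
    Beta(prior_a + successes, prior_b + failures). No integration is needed.
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    failures = trials - successes
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical example: uniform Beta(1, 1) prior, 7 successes in 10 trials.
post_a, post_b = beta_binomial_update(1, 1, 7, 10)
print(f"Posterior: Beta({post_a}, {post_b}), mean {beta_mean(post_a, post_b):.3f}")
```

Whether the Beta prior is a reasonable description of prior knowledge is a separate question from whether it is convenient, and that is the question a useful critique would address.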

  5. You can certainly make simplifications until your model tells you little of real interest. I find it more helpful to ask if the simplifications illuminate more than they hide.

  6. I absolutely agree that the modeler should check their own assumptions and I consider that to be a minimum requirement for even proposing a model (although I also agree with the above commentators who think the reality is different).

    Just to clarify, my post is in reaction to criticism of the IS-LM model (and things like the consensus climate models), i.e. models that have passed basic scrutiny.

    The larger point I want to make is to look in your own mirror. Any of the critiques listed can be used against any model that is proposed, including your own, because all models make assumptions, make assumptions for mathematical simplicity, omit some features, etc.

    Or a different way to look at this: How does one defend one’s model against these types of critique?

    • If the criticisms are _purposeful_ and have a tendency to lead to less wrong models, they should be encouraged rather than discouraged. (I prefer less wrong to better, and least wrong to best, but few seem comfortable with that wording.)

      A couple of quotes from David Cox come to mind:
      “Be wary of mathematical convenience”

      “When you present empirical research, you cannot prevent someone from making a comment that will totally change your understanding of the research you are presenting.”

  7. I’m not a modeler, but I wonder about #2. I always thought modeling was using mathematics as a means to make a finding. #2 seems like turning mathematics from a means into an end.

    • If you write down some equations for a model, and find that these equations are not solvable in any way that is useful, then you haven’t “used mathematics as a means to make a finding”. So making a mathematical simplification is a means to going beyond some point where you can’t make useful progress.

      • On the other hand, sometimes people study a model just because it entails interesting mathematics, and then it isn’t telling you anything about the thing it purports to model. So that’s a different situation.

  8. Actually, neither Kuhn nor Lakatos held the (right-headed) view that “highlighting, isolating, and exploring anomalies are crucial steps in moving toward improvement”, if these terms are meant at all seriously. While their views are rather different, since Gelman puts them together, one might note that both viewed testing and replacing models and claims due to anomalies as reconstruction games, based on convention (and, for Lakatos, likely to be driven by groups rich enough to make the changes accepted as “progressive”). For neither philosopher was improvement and progress more than something done with a wink of an eye! That’s what drove Popper mad!

  9. The test of a model is whether it gives answers that are faithful representations of reality for the questions asked of it. So a modeler ought to be free to “assume away” modeling inconveniences when there is reason to believe they won’t reduce faithfulness where it counts. Nobody complains that a toy car makes false assumptions about a real car’s size and mass; faithfulness to these attributes is sacrificed, happily, to get a toy that looks like a real car and yet can be held in the hand.

  10. I think the problem here is that “model” means two things in Economics. One is a simplified description of reality, i.e., its general scientific meaning. The second is as a framework for making a specific policy recommendation. Kaiser’s criticism is aimed at the latter function.

    The IS-LM model has been around for 50+ years. Its deficiencies are well known. Three branches of Economics (Monetarism, New Keynesianism & Real Business Cycle Theory) all exist to either improve it or try to replace it with something more sensible. Criticizing that model has led to a lot of progress in Economics (although the Balkanization of Economics into those groups has slowed things down).

    The back story to Kaiser’s criticism is that various economists (Krugman, DeLong, & company) have used the IS-LM model to come to the conclusion that the economy needs more stimulus. Kaiser’s point is that merely attacking the IS-LM model is not useful policy debate. A useful policy debate would provide an alternative policy and the merits of each would be discussed. Ad hominem attacks on the IS-LM model are advocating for the policy of doing nothing without coming out and saying that, or evaluating what the consequences of doing nothing might be. (I’ll give Ron Paul some props here. At least he comes out and says that his policy is for the government to do nothing and let the economy work out its problems on its own. It is not great as a policy idea, since this was essentially what Andrew Mellon and Herbert Hoover tried at the beginning of the Great Depression, but at least he puts it out there where it can be honestly debated.)

    At any rate, I think that Kaiser’s rules make sense in terms of a policy debate. If you don’t like the model used to evaluate the policy options, you need to provide an alternative.

  11. “a serious critique must include an alternative model that is provably better than the one it criticises…”

    I don’t agree with this statement. I don’t think that a serious critique must include the proposition of an alternative model. A criticism should be sufficient if it highlights some important areas where the model fails. I feel comfortable rejecting astrology even though I don’t have an alternative (functionally equivalent) model. I also feel comfortable rejecting financial risk models which are based on the assumption of the normality of asset returns even though I may not have an alternative way to model (extreme) financial risk. All modelers make assumptions but some of these assumptions are more crucial than others.

    • “I also feel comfortable rejecting financial risk models which are based on the assumption of the normality of asset returns even though I may not have an alternative way to model (extreme) financial risk.”

      This is exactly what I’m arguing against. Do you also reject the Economics “Nobel” since the foundations of that work won the prize? Here, I like Andrew’s formulation: if “wrong” is enough to kill, then all of our models are dead on arrival anyway. In the specific case of these risk models, I think it is hard to argue that Black/Scholes/Merton didn’t make fundamental, seminal contributions that influenced everything in that field.

      “Some assumptions are more crucial than others”

      In the short blog post, I didn’t elaborate on this point. I’m saying models shouldn’t be rejected for the act of making an assumption because every model makes assumptions. This is especially true of large-scale social-science models for which we don’t know the ground truth.
      For example, Cowen said “It fudges the distinction between short-term interest rates (for the money market curve) and long-term interest rates (a determinant of investment). They’re not the same! Don’t assume they are the same, just to squash the two curves onto the same graph.” If he adds to this and tells us why this distinction is “more crucial than others”, that would be better. But then, if there is a model that doesn’t “fudge” this distinction, it is highly likely that that model then goes on to fudge other issues not fudged by the IS-LM model. So where does that leave us?

      • Black/Scholes/Merton is not a risk management tool. It is an option pricing model. It can be useful when our aim is to estimate option prices but not useful at all when our aim is to model extreme financial risk. That is because option prices depend on the whole distribution of returns, while market risk depends only on its lower quantiles. And it is known that the tails of financial returns are too thick to be modeled by Gaussians. This fact is too important to be ignored. We cannot adopt an assumption that is both hugely unrealistic and central to our model (i.e. the main driving force behind our model’s reliability). Hence I don’t see how *the use* of Gaussian risk models can be epistemically justified. I can’t imagine any testing procedure that would not advocate against the use of such models.

        “I’m saying models shouldn’t be rejected for the act of making an assumption because every model makes assumptions.”

        I totally agree. A model shouldn’t be rejected for the act of making an assumption. It can however be rejected for making terrible assumptions. Assumptions that are both very influential and unrealistic.

        “If he adds to this and tells us why this distinction is “more crucial than others”, that would be better.”

        Surely.

        “But then, if there is a model that doesn’t “fudge” this distinction, it is highly likely that that model then goes on to fudge other issues not fudged by the IS-LM model.”

        I think that the main problem with IS-LM is that it assumes that people’s expectations (with regard to the future inflation rate, future tax rate, etc.) won’t be changed by an aggressive expansionary policy. That seems unrealistic but I don’t know whether it is extremely important. See more in the following article:

        http://american.com/archive/2011/october/the-end-of-comfortable-keynesianism
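To put a rough number on the thick-tails point made earlier in this comment, here is a minimal sketch comparing the lower-tail probability of a loss k standard deviations below the mean under a Gaussian and under a Student-t distribution with 3 degrees of freedom rescaled to unit variance. The choice of the t distribution and of 3 degrees of freedom is an assumption made for illustration, not an estimate from any return series; the function names are hypothetical.

```python
import math


def normal_lower_tail(k):
    """P(Z < -k) for a standard normal Z."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))


def student_t3_unit_variance_lower_tail(k):
    """P(X < -k) where X is a Student-t with 3 degrees of freedom, scaled to variance 1.

    A t(3) variable has variance 3, so X = T / sqrt(3). Substituting into the
    closed-form t(3) distribution function gives the expression below.
    """
    return 0.5 - (math.atan(k) + k / (1.0 + k * k)) / math.pi


for k in (2, 3, 5):
    gaussian = normal_lower_tail(k)
    heavy = student_t3_unit_variance_lower_tail(k)
    print(f"k={k}: Gaussian {gaussian:.2e}, t(3) {heavy:.2e}, ratio {heavy / gaussian:.0f}")
```

Under these assumptions a five-standard-deviation loss comes out thousands of times more likely under the heavy-tailed distribution than under the Gaussian, even though both have the same mean and variance. A quantile-based risk measure is sensitive to exactly this region of the distribution.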

  12. “It is not enough to show that the alternative solves the problems being pointed out; the alternative must do so while preserving the useful aspects of the model being criticized.”

    This condition is necessary if and only if our sole objective is to prove that the alternative dominates the model being criticized *universally*. Is global dominance the only line of justification somebody could use when proposing an alternative? I don’t think so…

  13. “A useful policy debate would provide an alternative policy and the merits of each would be discussed.”

    But if a policy is indeed horrible, it’s probably a good idea to know that it IS horrible first, regardless of knowing any alternative. If you know for a fact that the policy can do more harm than the status quo, then…don’t do the policy. Doing nothing is in fact a policy, on the same footing as doing something stupid, so the status quo is not at all “lethargy”.

    If Person A says to jump off a cliff, and you know it’s a bad idea, Person A should not be able to respond by saying “Why don’t you come up with something better?” and then force you off that cliff anyway.

    • But I argue that we don’t, and shouldn’t, live in this kind of absolute world you’re espousing.

      If Person A says to jump out the window, I know it’s a bad idea. I refuse to jump. Why shouldn’t Person A be able to say: “there’s a fire right outside our door. What else would you do?” and if he/she is a friend, he/she would push me out the window subito.

      I’d have thought this type of thinking comes naturally to economists. It’s the opportunity cost concept. You don’t just reject an investment opportunity absolutely, you only reject it relative to an alternative.
