I had an interesting discussion with Peter Dorman (whose work on assessing the value of a life we discussed in this space a few years ago).

The conversation started when Peter wrote me about his recent success using hierarchical modeling for risk analysis. He wrote, “Where have they [hierarchical models] been all my life? In decades of reading and periodically doing econometrics, I’ve never come across this method.”

I replied that it’s my impression that economists are trained to focus on estimating a single quantity of interest, whereas multilevel modeling is appropriate for estimating many parameters. Economists *should* care about variation, of course; indeed, variation could well be said to be at the core of economics, as without variation of some sort there would be no economic exchanges. There are good reasons for focusing on point estimation of single parameters—in particular, if it’s hard to estimate a main effect, it is typically even more difficult to estimate interactions—but if variations are important, I think it’s important to model and estimate them.
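
To make "estimating many parameters" concrete, here is a minimal sketch of partial pooling, the core idea of a hierarchical model. Everything in it is an assumption for illustration: the group estimates and standard errors are invented, and the between-group standard deviation `tau` is simply fixed, whereas a full multilevel model would estimate it from the data.

```python
def partial_pool(group_means, group_ses, tau):
    """Shrink each group's raw estimate toward a precision-weighted overall mean.

    group_means: raw estimate for each group
    group_ses:   standard error of each raw estimate (must be positive)
    tau:         assumed between-group standard deviation (hypothetical, fixed here)
    """
    # Each raw estimate has total variance se^2 + tau^2 around the overall mean.
    weights = [1.0 / (se**2 + tau**2) for se in group_ses]
    mu = sum(w * y for w, y in zip(weights, group_means)) / sum(weights)

    pooled = []
    for y, se in zip(group_means, group_ses):
        shrink = tau**2 / (tau**2 + se**2)  # weight placed on the group's own data
        pooled.append(mu + shrink * (y - mu))
    return mu, pooled


# Made-up numbers: four groups with noisy raw estimates.
raw_means = [28.0, 8.0, -3.0, 7.0]
raw_ses = [15.0, 10.0, 16.0, 11.0]
mu, pooled = partial_pool(raw_means, raw_ses, tau=10.0)
print(round(mu, 2), [round(p, 2) for p in pooled])
```

Each group keeps its own parameter, but noisy groups are pulled more strongly toward the overall mean. Setting `tau=0` recovers complete pooling (a single common effect), and a very large `tau` recovers the separate raw estimates.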

A while later, Peter sent me this note:

I’ve been mulling the question about economists’ obsession with average effects and posted this on EconoSpeak. I could have said much more but decided to save it for another day. In particular, while the issue of representative agents has come up in the context of macroeconomic models, I wonder how many noneconomists — and even how many economists — are aware that the same approach is used more or less universally in applied micro. The “model” portion of a typical micro paper has an optimization model for a single agent or perhaps a very small number of interacting agents, and the properties of the model are used to justify the empirical specification. This predisposes economists to look for a single effect that variations in one factor have on variations in another. But the deeper question is why these models are so appealing to economists but less attractive (yes?) to researchers in other disciplines.

I responded:

There is the so-called folk theorem, which I think is typically used as a justification for modeling variation using a common model. But more generally economists seem to like their models and then give after-the-fact justification. My favorite example is modeling uncertainty aversion using a nonlinear utility function for money; in fact, in many places risk aversion is _defined_ as a nonlinear utility function for money. This makes no sense on any reasonable scale (see, for example, section 5 of this little paper from 1998, but the general principle has been well-known forever, I’m sure). Indeed, the very concept of a utility function for money becomes, like a rainbow, impossible to see if you try to get too close to it, but economists continue to use it as their default model. This bothers me. I don’t think it’s like physicists starting by teaching mechanics with a no-friction model and then adding friction. I think it’s more like, ummm, I dunno, doing astronomy with Ptolemy’s model and epicycles. The fundamentals of the model are not approximations to something real, they’re just fictions.
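
The "no reasonable scale" point can be illustrated with a small calculation. This is only a sketch and does not reproduce the example in the 1998 paper: it assumes one particular curved utility function (constant absolute risk aversion, u(x) = 1 - exp(-a*x)), and the dollar amounts are hypothetical. It calibrates the curvature so that the agent is exactly indifferent to a 50/50 bet of losing $10 or gaining $11, then asks what the same curvature implies for a larger bet.

```python
import math


def expected_disutility(a, loss, gain):
    """E[exp(-a*x)] for a 50/50 gamble between -loss and +gain.

    Under the assumed utility u(x) = 1 - exp(-a*x), the gamble beats the
    status quo exactly when this quantity is below 1.
    """
    return 0.5 * math.exp(a * loss) + 0.5 * math.exp(-a * gain)


def indifference_coefficient(loss, gain, lo=1e-6, hi=0.1, iters=200):
    """Bisection for the curvature a > 0 at which the agent is indifferent."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if expected_disutility(mid, loss, gain) < 1.0:
            lo = mid  # gamble still accepted: indifference needs more curvature
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Calibrate on a small bet: lose $10 or gain $11 with equal probability.
a = indifference_coefficient(loss=10, gain=11)

# Same curvature, bigger bet: lose $100 or gain $1 billion with equal probability.
rejects_big_gamble = expected_disutility(a, loss=100, gain=1e9) > 1.0
print(a, rejects_big_gamble)
```

Under these assumptions the calibrated agent turns down a coin flip between losing $100 and winning a billion dollars, which is the sense in which mild caution about small bets cannot be explained by curvature in the utility of money.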

Peter answered:

So my deep theory goes like this: the vision behind all of neoclassical economics post 1870 is a unified normative-positive theory. The theory of choice (positive) is at the same time a theory of social optimality. This is extremely convenient, of course. The problem, which has only grown over time, is that the assumptions needed for this convergence, the central role assigned to utility (which is where positive and normative meet) and its maximization, either devolve into tautology or are vulnerable to disconfirmation. I suspect that this is unavoidable in a theory that attempts to be logically deductive, but isn’t blessed, as physics is, by the highly ordered nature of the object of study. (Physics really does seem to obey the laws of physics, mostly.)

I’ve come to feel that utility is the original sin, so to speak. I really had to do some soul-searching when I wrote my econ textbooks, since if I said hostile things about utility no one would use them. I decided to self-censor: it’s simply not a battle that can be won on the textbook front. Rather, I’ve come to think that the way to go at it is to demonstrate that it is still possible to do normatively meaningful work without utility — to show there’s an alternative. I’m convinced that economists will not be willing to give this up as long as they think that doing so means they can’t use economics to argue for what other people should or shouldn’t do. (This also has connections to the way economists see their work in relation to other approaches to policy, but that’s still another topic.)

And I’ve been thinking more about your risk/uncertainty example. Your approach is to look for regularity in the data (observed choices) which best explains and predicts. I’m with you. But economists want a model of choice behavior based on subjective judgments of whether one is “better off”, since without this they lose the normative dimension. This is a costly constraint.

There is an interesting study to be written — maybe someone has already written it — on the response by economists to the flood of evidence for hyperbolic discounting. This has not affected the use of observed interest rates for present value calculation in applied work, and choice-theoretic (positive) arguments are still enlisted to justify the practice. Yet, to a reasonable observer, the normative model has diverged dramatically from its positive twin. This looks like an interesting case of anomaly management.
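
For readers who have not seen why hyperbolic discounting is an anomaly for present-value calculations, here is a small sketch. The discount functions are the standard textbook forms, but the parameter values (`delta=0.9`, `k=1.0`) and the dollar amounts are assumptions chosen only to make the effect visible.

```python
def exponential_discount(t, delta=0.9):
    """Constant-rate discounting, as used with a fixed interest rate."""
    return delta ** t


def hyperbolic_discount(t, k=1.0):
    """Hyperbolic discounting: steep near the present, flat far away."""
    return 1.0 / (1.0 + k * t)


def prefers_later(discount, t, sooner=100.0, later=110.0):
    """True if `later` dollars at time t+1 are valued above `sooner` dollars at time t."""
    return later * discount(t + 1) > sooner * discount(t)


for t in (0, 10):
    print(t, prefers_later(exponential_discount, t), prefers_later(hyperbolic_discount, t))
```

With exponential discounting the choice between the two payments is the same no matter how far away they are. With hyperbolic discounting the same person takes the smaller payment when it is immediate but the larger one when both are far off, a preference reversal that a single interest rate cannot represent.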

Lots to think about here (also related to this earlier discussion).

Macroeconomics in particular seems like a case study in the hazards of knowing a little statistics, just not enough. The “big idea” in macro during the last half century was the Lucas critique, which said that prevailing macroeconomic models would not generalize well to alternative policy environments, essentially because they were almost entirely extrapolated from data rather than derived from theory. To a statistician, this just sounds like the bias/variance tradeoff in action. The reaction was to swing to the other extreme, where theory is used to impose tremendous structure on models, and the data is used only to estimate a couple of parameters (even here, calibration is sometimes preferred to estimation). It is as if a macroeconomist pointed out that prevailing models were sacrificing variance completely at the altar of bias, and proposed instead to sacrifice bias completely at the altar of variance. A bit of moderation could go a long way in macro. More widespread use of nonparametric methods as well could help economists use the data to figure out how much model structure is compatible with the richness of the available data, so that as they acquire more and better data they can begin to relax the strong assumptions and let the data do more of the talking.

The Lucas critique is actually pretty different from the bias/variance tradeoff. Its point is that economic agents (banks, firms, households, etc.) don’t just react to policy outcomes, but they act in anticipation of future policy. So the effect of a 1pp decrease in inflation is going to be different under different policy regimes, and looking at historical data will not necessarily say anything about how the economy would respond to new policy choices.

This is different from the bias/variance tradeoff because it holds even without any estimation error. It’s a statement about the population quantities measured in different policy regimes. The problem also remains if we move from linear models to nonlinear or nonparametric models: it’s not a story about approximation error either, but about the change in the variables’ true relationships that’s induced by a change in anticipated future policy.

So models that come from economic theory have the advantage that, if the theory is correct (a big if), then the model should be stable across different policy regimes. No one really believes that these models are correct, but the hope is that, if they’re close to correct, then the estimated parameters should be close to stable across policy changes.

Of course, it’s possible to take the critique too far, and tools like vector autoregressions are pretty popular and use comparatively little economic theory.
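
The claim above that the problem "holds even without any estimation error" can be shown with a toy calculation that uses population quantities only, with no sampling at all. The structural model here is an assumption made for illustration: output responds only to inflation surprises, and agents expect the current regime's mean inflation. The parameter values are hypothetical.

```python
def reduced_form(b, regime_mean):
    """Population regression line of output on inflation within one regime.

    Assumed structural model: output = b * (inflation - expected_inflation),
    where expected inflation equals the regime's mean inflation.
    """
    intercept = -b * regime_mean
    slope = b
    return intercept, slope


def true_mean_output(b, regime_mean):
    """Average output in a regime: on average there is no inflation surprise."""
    return b * (regime_mean - regime_mean)


b = 0.5                          # hypothetical structural parameter
old_mean, new_mean = 2.0, 5.0    # mean inflation under old and new policy

intercept, slope = reduced_form(b, old_mean)
naive_forecast = intercept + slope * new_mean  # extrapolate the old fitted line
actual = true_mean_output(b, new_mean)
print(naive_forecast, actual)
```

The old-regime line is known exactly, yet extrapolating it to the new policy predicts a permanent output gain of 1.5 where the assumed structural model says the gain is zero, because the intercept itself moves when expectations move. The structural parameter `b` is the only thing that stays fixed across regimes.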

All good points. I won’t debate what the Lucas critique was “really” about, but capturing expectations is only part of what motivated the shift from unstructured “Keynesian” models to highly structured microfounded models. The more important part, in my view, is the view that unstructured models generalize to new settings worse than structured models. This is true if the structure is well-specified (a big if), but macroeconomists have clearly swung too far in the other direction.

On utility functions, it’s always puzzled me that Kenneth Arrow won a Nobel Prize for proving rigorously that utility functions cannot exist (a direct consequence of his choice theorem), but economists have blithely continued to use them nonetheless.

Can you lay this out? My understanding was that a social choice function cannot exist (that fulfills some really reasonable criteria), if we are only working with the ranked preferences of the those participating in the decision. If this is the choice theorem you are referring to, I am not sure that it says too much about an individual’s utility function.

Without being exactly sure what the other Jonathan meant, the Social Choice Impossibility theorem has a strict preference ordering of states as one of its axioms. Maybe what he meant was that relaxing that axiom (thereby undoing utility theory) would undo the contradiction. I don’t think that’s where most of the action has been (I’d go with the nondictatorial axiom instead).

Nick Menzies:

As I understand it, Arrow’s theorem says nothing about individual utilities, but it proves that if you attempt to produce a social utility function that combines individual utility functions, it will suffer from Condorcet cycles, and thus cannot be single-valued.
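
A Condorcet cycle of the kind mentioned here is easy to exhibit. The sketch below uses the classic three-voter, three-alternative profile; the function names are made up for this example.

```python
# Each ballot ranks the alternatives from most to least preferred.
ballots = [("A", "B", "C"), ("B", "C", "A"), ("C", "A", "B")]


def majority_prefers(x, y, ballots):
    """True if a strict majority of ballots rank x above y."""
    wins = sum(1 for b in ballots if b.index(x) < b.index(y))
    return wins > len(ballots) / 2


def condorcet_winner(ballots):
    """Return the alternative that beats every other head to head, or None."""
    alternatives = ballots[0]
    for x in alternatives:
        if all(majority_prefers(x, y, ballots) for y in alternatives if y != x):
            return x
    return None


print(majority_prefers("A", "B", ballots),
      majority_prefers("B", "C", ballots),
      majority_prefers("C", "A", ballots),
      condorcet_winner(ballots))
```

Majorities prefer A to B, B to C, and C to A, so pairwise majority voting gives no consistent group ranking even though every individual ranking is perfectly transitive.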

This seems analogous to the (unrelated) empirical observation by Kahneman and Tversky that individual preferences also violate transitivity, so regarding individual utility functions, prospect theory is more relevant than Arrow.

Arrow’s theory is nonsensical in the sense that since the preferences must be reported to an outside authority for aggregation, the environment is no longer individual choice but rather participation in a group environment. In a group environment, there is a (probabilistic) equilibrium (that is, a set of choices and associated probabilities of making that choice that is a weak Nash equilibrium). This is essentially the same mistake made in Austen-Smith and Banks (some economist fixed this–forget the name–have to look on web of science to get the reference and don’t have time).

Okay, that lines up with my understanding. I would think the economists would quickly give up utility theory if it had some deep theoretical defect (as opposed to poor empirical fit, as seems to be the point in this post).

Numeric: Arrow’s theorem doesn’t require reporting to an outside authority. It applies perfectly well to three friends trying to decide amongst themselves what movie to see.

The friends are making a joint decision–therefore, any procedure where one person does not decide unilaterally on the movie will require (for the sake of making an optimal decision) that each friend consider (and potentially modify his/her behavior from how he/she would behave were he/she acting as an individual actor) how the other two friends will interact with that procedure. This leads to a probabilistic equilibrium–no different than a Nash equilibrium, which is also probabilistic. It’s easy to obtain sub-optimal results (such as Arrow’s theorem) when you require the actors to behave sub-optimally (though public goods is a case where optimal behavior on the part of the actors leads to what most would consider sub-optimal results–others, such as libertarians, would not).

I look at this differently: There is no algorithm that will allow the friends to stably select the best movie to attend. As long as the friends are willing to select a movie in an informal way (as opposed to using formal rules), Arrow does not apply. Arrow’s theorem is very useful because it tells us that there are limits to what real-world problems you can solve using rules.

You’re mixing up behaviour and preferences. Nash equilibrium talks about behaviour (probabilistic or not) of individuals when interacting with others, taking their preferences as given (most of the time). Regarding preferences and individuals making choices, it is no different from “Arrow’s theory”.

Arrow did prove that the aggregation of preferences in a utility function with some desirable properties is impossible, but he did not prove the impossibility of individual utility functions.

Nick: I was inspired by your question to look back at Arrow’s theorem.

I discovered that I had misunderstood one very important aspect of the theorem. This is probably of little interest to anyone but me, but I am following up here so my comments earlier don’t mislead anyone reading this thread.

Arrow’s theorem only applies to ordinal preferences. If we use cardinality (magnitude) of preferences, then you can avoid Condorcet cycles. Thus, adding each person’s utility to produce a total social utility (welfare) might let us solve the choice problem under utilitarian ethics.

However, going back to Pareto, making different people’s utility functions commensurable (how do I measure how much happier you are than me?) is a very difficult problem, which is why Pareto and many others fall back on ordinal measures of utility instead of cardinal ones.

So Arrow’s theorem does not rule out a social welfare utility function in principle. The difficulty is a practical one: we can only measure ordinal utilities, and we would need to measure individual cardinal utilities to construct a social welfare utility function.
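
To see why cardinal information removes the cycle, here is a sketch that takes the cyclic profile (A>B>C, B>C>A, C>A>B) and attaches numbers to it. The utility values are invented, and the sketch simply assumes they are measured on a scale that is comparable across people, which is exactly the hard part described above.

```python
# Hypothetical cardinal utilities, assumed comparable across people.
# The implied ordinal rankings form a Condorcet cycle.
utilities = {
    "voter1": {"A": 10, "B": 6, "C": 0},   # A > B > C
    "voter2": {"A": 0, "B": 10, "C": 7},   # B > C > A
    "voter3": {"A": 4, "B": 0, "C": 10},   # C > A > B
}


def social_welfare(utilities):
    """Sum each alternative's utility over all individuals."""
    totals = {}
    for person in utilities.values():
        for alternative, u in person.items():
            totals[alternative] = totals.get(alternative, 0) + u
    return totals


totals = social_welfare(utilities)
ranking = sorted(totals, key=totals.get, reverse=True)
print(totals, ranking)
```

Because the totals are ordinary numbers, the resulting social ranking is automatically transitive. Different made-up intensities would give a different ranking, which is why the measurement problem matters so much.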

I don’t imagine that this comment will interest many people, but I worried that someone reading my first two comments might be misled and make the same mistake about Arrow that I had made (applying to cardinal preferences a result that only applies to ordinal ones).

You have to use cardinal utilities for decision-making under uncertainty (the probability calculus demands it). Any game (which is what 3 people making a decision on a movie is, if, say they use majority rule) has to be modeled as decision-making under uncertainty. Arrow’s theorem is inapplicable in any real-world situation because reporting (or voting) your true preference is sub-optimal in many cases (that is, with three alternatives you would report your second-best alternative if in doing so you avoided your worst alternative, rather than report your preferred alternative). Somehow people have gotten it through their heads that people are optimizing, utility-maximizing individuals until they are presented with a situation where they need to report their preferences, at which point they become naive unoptimizing individuals. This incoherence is massively worse than what Bayesians are always claiming frequentists do. Well, what can you say about a profession which led us into the worst financial crisis in 80 years?
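
The strategic-reporting scenario described in this comment (report your second-best alternative to avoid your worst) can be sketched under plurality rule. The electorate below is hypothetical, and plurality is an assumed voting rule chosen for simplicity.

```python
from collections import Counter


def plurality_winner(votes):
    """Alternative with the most votes; ties are broken alphabetically."""
    counts = Counter(votes)
    return min(counts, key=lambda alt: (-counts[alt], alt))


# Hypothetical electorate of nine voters: true rankings, most preferred first.
rankings = [("A", "B", "C")] * 4 + [("B", "C", "A")] * 3 + [("C", "B", "A")] * 2

# Everyone reports their true first choice.
sincere = [r[0] for r in rankings]

# The two C supporters report their second choice, B, instead.
strategic = [r[1] if r[0] == "C" else r[0] for r in rankings]

print(plurality_winner(sincere), plurality_winner(strategic))
```

Sincere voting elects A, the worst outcome for the C supporters. By reporting B they get B elected instead, which they prefer to A, so truthful reporting is not their best response in this example.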

>Any game (which is what 3 people making a decision on a movie is, if, say they use majority rule) has to be modeled as decision-making under uncertainty.

This is incorrect, numeric. In the setup for Arrow, there is zero uncertainty whatsoever–everyone is rational (and everyone knows that), so complete backwards induction is possible and thus you can derive every player’s optimal strategy. This removes all uncertainty.

Furthermore, cardinal utility is only identifiable when players can express their preferences over gambles, not just because uncertainty exists. In the class of voting systems Arrow’s theorem applies to, the voters cannot express their preference over probabilistic outcomes (e.g. i prefer a 60% chance of candidate A and 40% chance of candidate B over 50-50), they can only express their preferences over pure outcomes.

Of course, when you say such things as…

>Well, what can you say about a profession which led us into the worst financial crisis in 80 years?

…it becomes instantly clear that you don’t much care about whether the things you’re saying are right or not.

I am not Jonathan, but I have worked through this claim. Suppose an individual mind is modular, in some sense. And at least three of these modules choose. Then Arrow’s theorem applies to an individual.

Or suppose some class of commodities have at least three non-commensurable aspects, at least from one individual’s perspective. Then Arrow’s theorem applies, once again, to the individual.

If I recall correctly, literature drawing conclusions to individuals from Arrow’s theorem goes back to the 1950s.

(Hasn’t the Independence of Irrelevant Alternatives axiom been empirically falsified?)

You might be thinking of the Sonnenschein–Mantel–Debreu theorem, which basically shows modern macro is a game of “pick the right utility function.”

The Wikipedia article provides a good summary.

The critical assumption is

There is no “dictator”: no single voter possesses the power to always determine the group’s preference.

This is already well recognized. In the U.S. the Supreme Court imposes a preference, or public decision making would be indeterminate.

I could see you being right about the overall theme (though ignoring within-group variation), but I don’t get the problem with risk aversion. The section in the linked article calls attention to the fact that the definition within welfare economics (i.e. marginal utility of money decreases as you have more money) departs from the lay definition. I am not sure if this is so problematic, but if so, sure.

In your post it seems you just don’t buy that it exists or is a useful concept. As far as I can tell, for it to be true we require a situation where (a) an agent has a bunch of different things on which they can spend their money (with these things differing in how much utility they produce per dollar spent), (b) they spend money on things to maximize utility, and (c) their utility is only a function of the final basket of things they have. This just defines an integer programming problem, so I think your problem is with items (b) and (c) — i.e. whether people can be usefully thought of as acting according to a utility function defined over final outcomes. In other words, do they act rationally, where rational is another technical term but probably closer to the lay definition than ‘risk aversion’.
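
Conditions (a) through (c) can be written down as a tiny integer program. The sketch below solves it by brute force; the goods, prices, and utility numbers are all hypothetical, and utility is assumed to be additive across goods for simplicity.

```python
from itertools import product

# Hypothetical goods: (name, price in dollars, utility from owning one unit).
goods = [("bread", 2, 10.0), ("book", 5, 18.0), ("concert", 9, 25.0)]


def best_basket(budget, goods, max_units=1):
    """Search every basket (0..max_units of each good) that fits the budget.

    Returns (utility, basket), where basket[i] is the quantity of goods[i].
    Utility depends only on the final basket, as in condition (c).
    """
    best = (0.0, ())
    for basket in product(range(max_units + 1), repeat=len(goods)):
        cost = sum(q * price for q, (_, price, _) in zip(basket, goods))
        if cost <= budget:
            utility = sum(q * u for q, (_, _, u) in zip(basket, goods))
            best = max(best, (utility, basket))
    return best


for budget in (0, 7, 9, 16):
    print(budget, best_basket(budget, goods))
```

The utility of money in this setup is just the best achievable basket utility as a function of the budget. It is derived from the goods, which is one way to see why its shape at small stakes is hard to pin down.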

Lots of recent research has shown where utility theory falls down as a predictive model of human behavior. But I still think it is useful to study the implications of rational behavior. For one thing, I think people generally want decision-making bodies to try and act rationally, even though they might not do so in their own lives (I realise this point could be challenged). It also provides a reference point for examining real behavior.

I would guess the major problem is the naive application of utility theory to predict actual behavior, but you seem to be going further than this.

I think I mentioned way back when these topics were mentioned, but the absolutely critical piece of scholarship here is Milton Friedman’s The Methodology of Positive Economics http://en.wikipedia.org/wiki/Essays_in_Positive_Economics which every economist in my era of training was taught, and it was written 25 years before I went to grad school. Whenever you come to doubt utility theory, you are confronted with Milton Friedman and his example of the expert pool player. This paper actually does little to refute your Ptolemaic analogy; indeed Friedman says: “Viewed as a body of substantive hypotheses, theory is to be judged by its predictive power for the class of phenomena which it is intended to ‘explain.’” No better justification for Ptolemaic astronomy could be found. Later, “Truly important and significant hypotheses will have ‘assumptions’ that are wildly inaccurate descriptions of reality, and, in general, the more significant the theory, the more unrealistic the assumptions (in this sense.)” I think it is impossible to overstate the significance to economists’ culture of this short essay.

It seems to me that the ‘better off’ issue quickly turns into something like happiness measurements and models. If not that, then … the answer is 42 [i.e. the question is religious in nature]. I don’t see a methodological way around this.

A less charitable perhaps (to statisticians) way to put it is:

Economists have very precise and fully parametric theories of how the world works. Statisticians have tests, optimizers, and regularizers.

Fernando:

I don’t think it’s accurate to say that economists have very precise and fully parametric theories of how the world works. Economists look at data and run regressions. Look at the econ papers that we’ve discussed on the blog over the past few years. I don’t think any of them have very precise and fully parametric theories of how the world works, not even close. If anything, I’d go in the opposite direction and say that economists tend to favor fitting reduced-form models whereas statisticians (when modeling; I admit I’m not talking about the hypothesis-testing branch of statistics) and psychometricians put a lot more effort into modeling the underlying process.

Not that this makes statisticians inherently better; indeed, one can make lots of good arguments in favor of reduced-form models. I just think that your description is about 180 degrees off the mark.

Andrew:

I have not read the papers you cited but my experience is very different from yours.

Most modern economics, including macro, lays a heavy emphasis on micro-foundations, representative agents, and the like. Everything is derived from first principles. By contrast what passes for theory in many other social sciences is still a narrative (though increasingly less so, and there are exceptions). The theories may be completely wrong but they are there. At the same time economics is a big tent. There are those, like Sims, who’d rather let the data speak for themselves http://www.economist.com/node/21532266.

Here is a google-found reference where you are cited that gives some background: http://noahpinionblog.blogspot.com/2012/03/why-bother-with-microfoundations.html

Fernando:

OK, here are some of the econ papers we’ve discussed recently on the blog:

Paul Gertler, James Heckman, Rodrigo Pinto, Arianna Zanolini, Christel Vermeerch, Susan Walker, Susan M. Chang, and Sally Grantham-McGregor, “Labor Market Returns to Early Childhood Stimulation: a 20-year Followup to an Experimental Intervention in Jamaica”

Ashraf and Galor, “The Out of Africa Hypothesis, Human Genetic Diversity, and Comparative Economic Development”

Yuyu Chen, Avraham Ebenstein, Michael Greenstone, and Hongbin Li, “Evidence on the impact of sustained exposure to air pollution on life expectancy from China’s Huai River policy”

Phoebe Clarke and Ian Ayres, “The Chastain effect: Using Title IX to measure the causal effect of participating in high school sports on adult women’s social lives”

David Lee, “Randomized experiments from non-random selection in U.S. House elections”

Lena Edlund and Douglas Almond, “Son-biased sex ratios in the 2000 United States Census”

Emily Oster, “Hepatitis B and the case of the missing women”

Steven Levitt, “Evaluating the effectiveness of child safety seats and seat belts in protecting children”

These papers are published in different venues and are of varying quality, but all of them seem to me to essentially present reduced-form models. That’s fine: a lot of applied statistics, including some of my own most influential work, uses reduced-form models or really no underlying model at all. I just wouldn’t call any of the above papers (all of which I remember because we discussed them here on the blog) “very precise and fully parametric theories of how the world works.” Which is probably fine.

Selection bias? ;-)

Sure, maybe, but I think these are the econ papers that get lots of attention. That and macro work such as Reinhart and Rogoff which I don’t think anyone would count as “very precise and fully parametric theories of how the world works.”

Beyond this, sure, there are economists trying to derive all sorts of things from first principles but as noted above I find these principles to be a bit Ptolemaic. Mostly what I see are regressions which are justified by vague references to the first principles. Consider Levitt, for example. He has some good data and some good identification strategies but it seems like what really makes it econ rather than stat for him is that he’s following the principle that “incentives matter.” That can work ok if you take the principle to be general enough, but if you start trying to map it to utility functions you get problems.

I was about to say this. Maybe it’s just my field of econ, but most of the papers I read, and definitely the ones I see getting cited most frequently, are making pretty heavy use of structural modeling. This is especially true of economists on the job market.

I hope it’s not surprising to Andrew that the papers discussed in popular media are not very representative of economics as an academic discipline, but it is disappointing that he doesn’t even consider the possibility before making statements like “economists tend to favor fitting reduced-form models whereas statisticians (when modeling; I admit I’m not talking about the hypothesis-testing branch of statistics) and psychometricians put a lot more effort into modeling the underlying process.”

Justin:

Recall that the whole discussion started because I was replying to economist Peter Dorman’s question about this issue. So I’m not the only one who sees this. In general, I do see multilevel modeling being less popular in economics compared to psychometrics and some other branches of statistics.

I really don’t think there’s much disagreement at all on this point, that statisticians (in particular, psychometricians) tend to be more interested in variation, while economists tend to frame their empirical research around estimating a particular parameter of interest.

In my own work I feel I’ve gotten lots of insight in many different areas by allowing parameters of interest to vary over space and time, even in settings where data are sparse and it can take a lot of modeling effort to estimate these parameters. Indeed, my very first paper published in a statistics journal used a multilevel model with only one observation per group; it required a lot of effort to come up with a regularization that allowed us to do this.

An economist might well argue (and might well be correct to argue) that it was a bad use of my time to make such a complicated data model, that I would’ve been better off constructing a theoretical model of voters with a small number of parameters, then fitting this model to the data and estimating these 3 or 4 parameters by maximum likelihood or whatever, maybe giving robust standard errors too.

The point is, that this (hypothetical) economist and I would be focusing our modeling efforts in different ways: I’m obsessed with modeling variation in a fairly atheoretical way so as to (stochastically) reproduce the data I see, whereas the economist would be more interested in coming up with a theoretically consistent model with very few parameters and going with that until something better (and also motivated by economic theory) came along.

To me, the economist’s approach is unsatisfactory because (a) it sidesteps the issue of variation, which is central to most things that I study, and (b) I’m pretty sure I won’t find the micro-model to make any kind of sense. From the other side, my approach would be unsatisfactory to the economist because it has no micro-model. And the economist won’t be particularly interested in studying variation because . . . well, that’s the topic that got Peter Dorman and me talking in the first place.

Andrew:

I like multilevel modelling and agree we should do more of it. But we need not abandon structural models in order to do this. As far as I can tell this is not an either or issue.

PS Such models are discussed in mainstream econometrics textbooks like Greene.

>An economist might well argue (and might well be correct to argue) that it was a bad use of my time to make such a complicated data model, that I would’ve been better off constructing a theoretical model of voters with a small number of parameters, then fitting this model to the data and estimating these 3 or 4 parameters by maximum likelihood or whatever, maybe giving robust standard errors too.

I’m sorry, Andrew, but this makes no sense whatsoever. The point of (perhaps more simple) parameters is so your parameters you estimate have some sort of structural economic interpretation, like an elasticity. This is important because in more recent ‘sufficient statistics’ models (for example, Chetty (2006) on optimal social insurance), you can get welfare results from a properly-specified parameter being estimated. Adding complexity to the estimation process can make it difficult to get these back. Recently I’ve been having this problem–I want to fit a multi-stage model to my data, because it makes a lot more sense and fits better, but I can’t figure out how to properly derive the elasticity I want. By the way, modeling variation here isn’t always necessary. For example, the Chetty results show (I believe) that even if your parameters have a distribution instead of a constant value, the expectation of that distribution is all you need.

Alternatively, despite your experience, the number of non-reduced-form (‘structural’) economists is quite high. I think your interactions are selected because 1. Structural work is probably harder for the news to report on, so you don’t see it (a publication like Science probably isn’t going to try to bother wading through economic theory); and 2. The economists who you have strong connections with, AFAICT, like Imbens, are fairly strongly reduced-form people. I think you’d find a lot of support about explicitly modeling the generating process from IO people, for example.

Speaking of which, I don’t know Peter Dorman, but a google search says he works on environmental econ topics. I know IO methods have a presence there, so I’m shocked by his post. Also a little dumbfounded by his comparison of ATE/LATE with macro representative agent models, which doesn’t give me a ton of faith that he knows what he’s talking about.

Ok. Maybe your point is about the “credibility revolution” in _econometrics_ (a subfield in economics) where we get lots of “clever” identification, or just straight randomization, but little theory. If so I agree.

But Heckman, for example, is one to often start out with lots of microfoundations. He has written a lot about it. I would not say his paper above is representative.

At the same time I could point to statistician Paul Rosenbaum whom I admire but whose applied papers often lack theory. They are like: “I have this set of covariates that happen to be in the dataset, I match on those covariates, here is the effect, and here some sensitivity analysis.” No assumptions about the underlying causal structure, confounders, moderators, etc.

PS I think we may be talking past each other.

If your point is that the behavioral assumptions of micro-economists are wrong, sure, though I think they are very useful. Some aspects are a priori highly likely, and that helps constrain the space of theories.

If your point is economists have no theory, and only run highly restrictive regressions due to some professional depravity, then maybe. They do have theory. But why they stick with bad theory may indeed be some group think issue.

But if your point is that micro assumptions suck, microfoundations are just there to kid ourselves, and we might as well just model the data without theory, then I get you. But I think that is defeatist, and wrong. We might as well go back to Pearson and abolish “causation” from our lexicon.

Fernando:

I don’t think it’s “professional depravity” to think hard about identification and run reduced-form regressions. These two things are characteristic of some of the best work in statistics and econometrics.

I don’t think that is what I said. But I am also not sure what you are arguing.

Gelman to toilet.

Confirmed.

I don’t know exactly how bears do it, but we humans indeed go to the toilet frequently. It’s something we need to do.

Bears do it in forests. Random forests.

OK, now we have a discussion at my level. I wonder what Dr. Anil Potti would think of all this.

Potti!! Oh no, not again!

I was just thinking of sending you this article, which is an economist’s take on heterogeneity in treatment effects. The author (who compares over a hundred experimental evaluations of a single energy conservation program) situates his findings of treatment heterogeneity in terms of the issues with external validity and the problem of scaling up a program nationwide, but I thought you’d like this section:

When generalizing empirical results, we either implicitly or explicitly make an assumption I call external unconfoundedness: that there are no unobservables that moderate the treatment effect and differ between sample and target. As formalized by Hotz, Imbens, and Mortimer (2005) and Hotz, Imbens, and Klerman (2006), this type of assumption mirrors the unconfoundedness assumption required for internal validity (Rosenbaum and Rubin 1983). When we have results from only one site, this assumption amounts to assuming away the possibility of unexplained site-level treatment effect heterogeneity. Because this is often unrealistic, we value replication in additional sites. After enough replications, external unconfoundedness implies only that the distribution of treatment effects in sample sites can predict the distribution of effects in target sites. Put simply, if an intervention works well in enough different trials, we might advocate that it be scaled up. Formally, this logic requires that sample sites are as good as randomly selected from the population of target sites.

http://www.nber.org/papers/w18373.pdf

Andrew, I like your blog and your commentary on economics but I am clueless here.

“it’s my impression that economists are trained to focus on estimating a single quantity of interest, whereas multilevel modeling is appropriate for estimating many parameters.”

Many econ studies are interested in estimating more than one parameter. The treatment effect literature tends to focus on one because they look for successful interventions but not all empirical papers are like that. Also, in OLS, even if you are interested mainly in the effect of x1, you have to get the effect of x2 right, otherwise your estimate of the effect of x1 may be biased as well.
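The omitted-variable point in the last sentence can be checked with a small simulation. This is a minimal sketch under assumed numbers (true coefficients of 1.0 and 2.0, and a correlation between x1 and x2 invented here for illustration); it is not taken from any paper discussed in this thread.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data-generating process: x2 is correlated with x1,
# and both affect y with true coefficients 1.0 and 2.0.
x2 = rng.normal(size=n)
x1 = 0.5 * x2 + rng.normal(size=n)
y = 1.0 * x1 + 2.0 * x2 + rng.normal(size=n)

def ols(columns, y):
    """Least-squares coefficients for an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y)), *columns])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

beta_full = ols([x1, x2], y)   # regression that includes x2
beta_short = ols([x1], y)      # regression that omits x2

print("x1 coefficient, x2 included:", round(beta_full[1], 3))
print("x1 coefficient, x2 omitted: ", round(beta_short[1], 3))
```

Under these assumptions the short regression should land near 1 + 2 × Cov(x1, x2) / Var(x1) = 1.8 rather than the true 1.0, which is the bias the comment describes.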

The discussion about utility maximization is definitely worth having and economists have been having it for quite some time. It seems that we have not found a better canonical model so far but maybe we will in the future.

Maybe you could explain briefly what the added value of using HLM is. I too find it a bit odd that economists do not use HLM more. My explanation is that HLM requires some distributional assumptions while, because of asymptotics, OLS doesn’t, if the sample size is large enough. Angrist and Pischke are even against using probit and logit which have been used extensively in economics before. However, economists who use more structural models (e.g. industrial organization people) are fine with using stronger distributional assumptions because that allows them to run counter-factual simulations of policy interventions. But they do not use HLM either, probably (just speculating here) because it does not easily lend itself to an economic interpretation.

Tom:

The point of fitting hierarchical models (linear or otherwise) is to get more finely tuned estimates, something which is clearly of interest in political science (consider, for example, this paper from 1990 with King where we estimate the incumbency advantage for each year rather than pooling 40 years of data to get a single estimate, which is the sort of thing that people do sometimes, or consider this recent paper with Ghitza where we estimate age, period, and cohort effects).

It’s always been natural to me to try to estimate parameters as locally as possible which is one reason, I think, it’s taken me so long to understand where econometricians are coming from much of the time.

Regarding assumptions, I think a hierarchical model in which a parameter can vary is more general than a non-hierarchical model where the parameter is the same in all years or all groups or whatever. Sure, you can interpret the latter regression as an estimate of some sort of average treatment effect but then the questions arise: (a) what average are you exactly estimating and (b) why should we care about the average when the ultimate question of interest is the effect itself.
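One way to write down the generality claim, in notation that is assumed here rather than taken from the comment (groups or years indexed by j, observations within them by i):

```latex
\begin{aligned}
\text{pooled:} \quad & y_{ij} = \alpha + \beta\, x_{ij} + \epsilon_{ij}, \\
\text{hierarchical:} \quad & y_{ij} = \alpha + \beta_j\, x_{ij} + \epsilon_{ij},
  \qquad \beta_j \sim \mathrm{N}(\mu_\beta, \sigma_\beta^2).
\end{aligned}
```

Setting \sigma_\beta = 0 forces every \beta_j to equal \mu_\beta and recovers the pooled regression, so in this sketch the pooled model is a special case of the hierarchical one.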

Finally, I am not happy with sample size justifications of simple statistical models. When sample size is small, the argument is that it’s too hard to fit anything much more complicated than least squares. When sample size is large, the argument is that the asymptotics imply that you can’t do any better than least squares. I don’t think either of these statements is in general true. If variation is important, I’d like to study it even if sample size is small. And if sample size is large, I take this as an opportunity to study things on a finer scale, enough so that my data become sparse and a multilevel model will help me out.

In economics a lot of models are assumed (and often derived) with structural parameters. There is no point in estimating them as changing in time or in sub-populations because to do so would contradict the whole derivation. For instance, in the example of risk aversion it is universally assumed that this does not change in time. Whether you like the assumption or not is a different thing (I don’t and I even have a paper showing it is not). There is a big literature on modelling variation of parameters, especially in financial econometrics (e.g. gasmodel.com); the issue is when you do not have enough observations to actually do that, or you have to deal with endogeneity.

I think the first part of what you said is also Andrew’s point: some economists make strong assumptions about things like risk aversion (and many other things) not changing/being fixed. This is a set of assumptions that is implicitly part of the theory. He’s just being a bit hyperbolic. But I do wonder if this way of thinking is where some of the issues around experimental design / randomized block effects that Andrew posted about a few weeks ago come from.

You cannot just make everything time-varying. If anything because you don’t have enough observations to be able to estimate everything that way. In most of Macro you are lucky if you have 200 observations. Does it make sense to think that people born in 1960 are endowed with a different risk aversion on average than people born in 2000? Most will tell you no.

And I think it’s highly likely that people socialized in such different decades will probably not have the same risk aversion on average. The difference will be even bigger if you compare people from different centuries …

Or at different points in their life cycle (hence you will often hear that adolescents fear almost nothing).

Zmk:

You write, “You cannot just make everything time-varying. If anything because you don’t have enough observations to be able to estimate everything that way.”

No! That’s the whole point of multilevel modeling, to estimate parameters that vary by using partial pooling. It’s something I do all the time. Which brings us back to the question from Peter Dorman that got all this discussion started: how is it that so many economists are unaware of multilevel models or don’t understand what they can do?
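To make the partial-pooling idea concrete, here is a minimal sketch with simulated data. Everything in it is an assumption chosen for illustration: 40 groups, 5 observations each, a known within-group standard deviation, and a simple method-of-moments estimate of the between-group variance instead of a full Bayesian fit.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical setup: 40 "years", each with its own true effect and only
# 5 noisy observations, so each year's raw mean is poorly estimated.
J, n_per, sigma_y = 40, 5, 2.0
mu_true, tau_true = 1.0, 0.5
theta = rng.normal(mu_true, tau_true, size=J)             # true group effects
y = rng.normal(theta[:, None], sigma_y, size=(J, n_per))  # observations

ybar = y.mean(axis=1)        # no pooling: a separate estimate per group
se2 = sigma_y**2 / n_per     # sampling variance of each group mean (sigma_y treated as known)
mu_hat = ybar.mean()         # complete pooling: one estimate for all groups

# Method-of-moments estimate of the between-group variance, floored at zero.
tau2_hat = max(ybar.var(ddof=1) - se2, 0.0)

# Partial pooling: each group mean is pulled toward the overall mean,
# more strongly when the between-group variance is small relative to the noise.
weight = tau2_hat / (tau2_hat + se2)
theta_hat = mu_hat + weight * (ybar - mu_hat)

def rmse(est):
    """Root mean squared error against the simulated true effects."""
    return float(np.sqrt(np.mean((est - theta) ** 2)))

print("no pooling       RMSE:", round(rmse(ybar), 3))
print("complete pooling RMSE:", round(rmse(np.full(J, mu_hat)), 3))
print("partial pooling  RMSE:", round(rmse(theta_hat), 3))
```

The design choice worth noticing is that all 40 effects are estimated even though no single group has enough data on its own; the groups borrow strength from one another through the estimated between-group variance.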

Regarding your question, “Does it make sense to think that people born in 1960 are endowed with a different risk aversion on average than people born in 2000?”, I don’t think it makes much sense at all to think of people as having a “risk aversion” parameter; indeed, for reasons alluded to in the above post, I think the phrase “risk aversion” has led to no end of trouble within economics.

It is one thing to _model_ data variability, quite another to explain it. Personally I’m interested in the causes, as opposed to the correlates, of the variability. For that you need theory as well as multilevel models. http://ann.sagepub.com/content/628/1/132.abstract

Fernando:

I have no problem with using multilevel models in a causal context using appropriate theory. Indeed, this was one of the points I kept trying to make in one of my interminable blog discussions with Pearl, that he should not think of statistical modeling as a competitor to his theory; rather, he should welcome the idea of partial pooling as a way to apply his approach to realistic sparse-data situations. (And, remember, just about every problem is sparse-data if you’re interested in studying variation.)

Yes. We agree. Stable effects come at the price of sparsity. Partial pooling gives us the best possible tradeoff in this regard. We get more for less, and who does not love a bargain!

Fernando:

Who does not love a bargain, indeed? Nicholas Wade might say that some races of people are genetically more likely than others to love a bargain. It has to do with selection pressure and, um, alleles. Sorry to get all politically incorrect on you but this is science, dude.

Andrew:

What are situations in which multilevel models are *not* suitable or effective? What are some heuristics / signs to suspect other techniques might be a better fit than multilevel models?

Multilevel (Hierarchical) Modeling: What It Can and Cannot Do

Andrew: I am aware of them. The question is whether they are the right tools for the job. But, likewise, there are many neat things happening in econometrics that statisticians and ‘big data’ guys are also not aware of. It’s the price of having the discipline split into 3 fields.

I suppose it depends how you measure it, but yes, it probably does make sense to think that people may have changed even within the US. If you look at the survey data you might find some stunning differences, just as you find for many things between 1960 and 2000.

But you’re missing the larger point I think. Saying you don’t feel you have the data to model something and are thus making a simplifying assumption is one thing. Saying that fixedness is a fundamental assumption of your theory is another. If it is just a data problem, then you look for data or data analysis solutions (as Andrew said). If something is a universal, be prepared to defend it both as theory and empirically. I’m sure you could back up “most people will tell you” with a list of citations in another kind of context, so it’s fine, stand up for saying you believe in strong theoretical assumptions.

I do think it is changing; as I said, I even have a paper showing evidence that it is not structural under the basic utility functions. The feedback you get when you propose that parameters like risk-aversion are changing in time is that you are just using a wrong utility function, and there are many utility functions, e.g. ones that account for things like stickiness of consumption and/or try to disentangle risk-aversion and intertemporal substitution. But the bottom line is that you want to use methods that are consistent with your model and could work with your data; with risk-aversion:

* people from 1960 are still in the sample in 2000; if there is a birth-year effect, it is impossible to identify

* models are estimated from Euler equations with conditional expectations; allowing changing risk-aversion is quite complicated, and the way the models are derived assumes a structural parameter
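One possible reading of the first bullet is the familiar age-period-cohort identity: age equals survey year minus birth year, so linear effects of all three cannot be separated without extra assumptions. The sketch below uses an invented panel (birth years 1940 to 1980, survey years 1990 to 2010) purely to show the resulting rank deficiency; it is an interpretation, not necessarily the commenter's own argument.

```python
import numpy as np

# Hypothetical panel: five birth cohorts, each observed in five survey years.
birth_year, survey_year = np.meshgrid(
    np.arange(1940, 1981, 10), np.arange(1990, 2011, 5), indexing="ij"
)
cohort = birth_year.ravel().astype(float)
period = survey_year.ravel().astype(float)
age = period - cohort  # exact identity linking the three variables

# Design matrix with an intercept and linear terms for age, period, and cohort.
X = np.column_stack([np.ones_like(age), age, period, cohort])
rank = np.linalg.matrix_rank(X)

print("columns:", X.shape[1], "rank:", rank)
```

Because one column is an exact linear combination of two others, least squares has no unique solution for the three linear effects; some restriction (or a prior, in a multilevel setting) has to do the identifying.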

OK, thanks Andrew. I think this is a helpful conversation, let me add a few points:

1) I agree that economists tend to be interested in estimating the parameters and look at the s.e. of the estimates only for statistical significance. Would it somehow accommodate your concerns if in econ. papers there was more discussion of the confidence intervals and of the uncertainty associated with the estimates? Obviously this does not model the variation, as in the HLM, but at least it is something.

2) Economists sometimes are interested in the second moment of a variable but they model it explicitly as a dependent variable. I am thinking of e.g. the literature on wage instability. This gets somewhat at the issue of within-group (actually within-individual, in this case) variation.

3) If variation across groups is important, the economist tends to add interactions to the model to capture this. But if there are too many interactions to add, sometimes this is not done. Here, I agree, HLM would come in handy.

4) I am fine with what you say about OLS but you have to recognize that, yes, HLM buys you more, but it also has stronger distributional assumptions. By the way, I do not know how important a violation of the assumptions is: I guess someone has run some Monte-Carlo simulations about it and so maybe the issue is not a big deal in the end. But to me that is something that needs to be demonstrated, otherwise the point stands.

5) I agree with what others have said. Economists derive (formally or in a narrative way) a relationship between y and x. Sometimes they even specify y=f(x) but often the implication of the theory is that when x goes up, y e.g. goes up. Then the issue becomes by how much y goes up (if it goes up at all!) and so off to estimation we go. Obviously we could also develop hypotheses about how the effect of x varies. And sometimes we do: there are many papers where the effect of x is hypothesized to vary across different groups and so we interact x with the group dummies (see point above).

But we do not always take this route. You argue above that economists spend too much time theorizing and not enough letting the data speak. I actually agree with you (try publishing descriptive statistics if you are not a big shot) but you have to recognize that the type of exercise is different. And eventually you have to take a stance on what causal mechanism you have in mind and specify the model underlying it. Economists think that this model should be consistent with some form of constrained maximization. This is a much looser requirement than it looks as there are many different ways to specify an economic model. But without a more specific model the more descriptive approach can only buy you so much. And the risk is that people are going to make normative and positive recommendations based on the more descriptive approach anyway and then it becomes harder to examine whether these recommendations are appropriate or not.

(ok, it got a bit too long but here it is)

By the way, “HLM” in my posts should really be “HM” as not all hierarchical models are linear, as you correctly point out.

As several commenters have noted, most of the more elegant and professionally prestigious empirical work in economics is highly model driven. This is typically achieved with very simple models. Economists refer to these approvingly as “parsimonious”, which is a strange term for an extremely exacting set of assumptions that act, in the context of empirical models, as identifying restrictions. Models are kept simple both by focusing on the mean case (easily justified if you believe that representative agents are the relevant unit of analysis) and by considering only one departure at a time from the standard competitive model, as if the real economic world were centered on that model with isolated little wobbles. There may be a particular problem in the credit market or a particular problem in the labor market, say, but not both at once. A problem with this method is that the world departs substantially from the standard competitive model in multiple ways at the same time – Banerjee, Duflo and Munshi (2003) consider implications of this for empirical work, in a paper discussed by Brad DeLong at the URL below. It is in this one-departure-at-a-time method, rather than any particular element of the standard theory itself, that the strong group-think of the economics discipline lies.

Certainly, as you say, hierarchical modelling is more general as a depiction of relationships in the data. For many economists this is not a plus, because acknowledging that a variation is systematic invites further causal attribution, which often necessitates abandonment of the one-departure-at-a-time approach; the limbo of theoretical indeterminacy soon follows – we might as well be sociologists.

(see http://delong.typepad.com/sdj/2006/12/macrointernatio.html)

It would be great if we were all sociologists with decent modeling skills and without completely unrealistic assumptions, trying to model the real world and actual behavior, instead of either playing around with assumptions and theories we know are mostly wrong and that ignore all kinds of cultural and social factors (economics) or having completely disjoint approaches in non-formalized theory and empirical research (sociology).

Why is everyone ignoring work like

WILLER, DAVID, ERIC GLADSTONE, and NICK BERIGAN. “Social Values and Social Structure”. The Journal of Mathematical Sociology 37, no. 2 (2013): 113–30. doi:10.1080/0022250X.2011.629067.

Paywall-free: https://www.academia.edu/4310837/Social_Values_and_Social_Structure

? Sociology needs more formalized and clearly worked out theories and Econonmics needs to stop sealing itself off. Instead the disciplines are shouting at each other.

Economics* …

AG: Economists should care about variation, of course; indeed, variation could well be said to be at the core of economics, as without variation of some sort there would be no economic exchanges.

Variation is also at the core of evolution by natural selection. It’s at the core of animal learning. Gregory Bateson wrote a book about how all the three systems (evolution, learning, economics) are analogous (Mind and Nature).

AG: There is the so-called folk theorem which I think is typically used as a justification for modeling variation using a common model. But more generally economists seem to like their models and then give after-the-fact justification. My favorite example is modeling uncertainty aversion using a nonlinear utility function for money, in fact in many places risk aversion is _defined_ as a nonlinear utility function for money. This makes no sense on any reasonable scale (see, for example, section 5 of this little paper from 1998, but the general principle has been well-known forever, I’m sure), indeed the very concept of a utility function for money becomes, like a rainbow, impossible to see if you try to get too close to it—but economists continue to use it as their default model.

Paragraphs like this are the reason AG is my favorite academic, by a landslide.

AG: see, for example, section 5 of this little paper from 1998, but the general principle has been well-known forever, I’m sure.

I am sure it has NOT been well-known forever. It’s only been known for 26 years and no one really understands it yet.

I’m pretty sure the Swedish philosopher who proved the mathematical phenomenon 10 years before you and 12 years before Matt Rabin was the first to identify it. The Hansson (1988)/Gelman (1998)/Rabin (2000) paradox is up there with Ellsberg (1961), Samuelson (1963) and Allais (1953).

http://en.wikipedia.org/wiki/Sven_Ove_Hansson
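For readers who have not seen this kind of calibration argument, here is a small numerical sketch. The specific numbers (indifference between a sure $10 and a 55/45 gamble on $20 or nothing, holding at every wealth level) are assumptions picked for illustration and are not quoted from Hansson, Gelman, or Rabin.

```python
# Illustrative assumption: at every wealth level w the person is indifferent
# between a sure +$10 and a gamble paying +$20 with probability 0.55, else +$0.
# Expected utility then requires
#   U(w + 10) = 0.55 * U(w + 20) + 0.45 * U(w),
# so each successive $10 step adds only 0.45 / 0.55 = 9/11 of the utility
# added by the previous step.
p_win = 0.55
ratio = (1 - p_win) / p_win

def utility_gain(n_steps, first_step=1.0):
    """Utility of gaining n_steps * $10, in units of the first $10 step."""
    return first_step * sum(ratio**k for k in range(n_steps))

gain_100 = utility_gain(10)       # utility of +$100
gain_1m = utility_gain(100_000)   # utility of +$1,000,000
ceiling = 1 / (1 - ratio)         # limit of the geometric series

print("utility of +$100:      ", round(gain_100, 3))
print("utility of +$1,000,000:", round(gain_1m, 3))
print("ceiling for any gain:  ", round(ceiling, 3))
```

Under these assumptions, moving from +$100 to +$1,000,000 adds less utility than the very first $10 did, which is the sense in which mild risk aversion over small stakes, if attributed entirely to the curvature of a utility function for money, implies absurd behavior over large stakes.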

Sorry to have arrived late, but I’ve been interested in catching up on this discussion. One thing I’d like to emphasize is that the exchange I had with Andrew drifted a bit, so there is more than one issue on the table. It began with the predilection of economists for measuring group-average effects rather than centering on the variation in effects, with groups in practice being constructed at a rather aggregative level. The main observation here was that applied micro theories (the theory part of the empirical paper) tend to be representative agent models, just as they do on the macro side. The reason that’s important to say is that the macro use has received a lot of flak in recent years, with many commentators concluding that micro is the healthier branch. Maybe, but in this one respect there is not much difference, is there? And diversity of response is just as important in most micro questions as it is in macro.

Then we got into a discussion of utility theory itself. I was struck by the strong desire of most economists to retain it (rather than try to adopt a more psychologically sophisticated alternative approach), which I (conjecturally) attributed to the desire to have theories perform double duty, positive and normative. Utility is central to this: it is the building block of models that are empirically tested (positive) and also the basis for policy advocacy (Pareto type criteria). Economists want to say, (a) I have a theory that explains the data and (b) it shows that policy x is preferred. (Another conjecture: it could be, in the end, that this fusion of positive and normative analysis is the differentia specifica of economics vis-a-vis the other social sciences – what’s specific to the way economists study a problem that is also studied by practitioners of other disciplines. For instance, sociologists and anthropologists often invoke incentives, but they have no normative content in the positive-normative sense.)

The question of structural models came up in the comments, which illustrates one reason I like the community around this blog. Yes, this is the latest and greatest over the last 10 years or so. Fields in economics where young researchers are making a big impact have gone in this direction. My sense is that these models displace but do not mitigate the homogenizing tendency of representative agent models. For instance, in the child labor literature I’m familiar with, the shift to a structural approach has given us a more complex picture of the interaction of multiple influences on the proclivity of children to become child laborers (school performance in both senses, household income and wealth, shocks and credit constraints, etc.), and this is an advance over earlier work. On the other hand, there is a fundamental sameness in the underlying (latent) mechanism that is generating these effects, as given by the representative utility-maximizing model. The most important lacuna, from a practical point of view, is cultural variation: different groups, even different households, differ in the extent to which decisions like child labor are the result of economic considerations; there are also cultural norms and expectations at work. At the policy level this is very big.

Anyway, many thanks to Andrew and also to the thoughtful and constructively-minded commentariat.

Naive question: What exactly is the flood of evidence for hyperbolic discounting? And isn’t the fact that billions of people effectively concede in using regular exponential interest in everyday transactions (banks, loans, etc.) evidence contrary to hyperbolic discounting?

Should one trust revealed preferences in an artificial lab / survey setting over those exhibited in actual transactions?
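For anyone unsure what distinguishes the two kinds of discounting being asked about, here is a minimal sketch. The functional forms are the textbook ones; the parameter values (delta = 0.9, k = 1.0) and the $100-versus-$110 choice are assumptions made up for illustration, and the sketch takes no side on the empirical question.

```python
def exponential(t, delta=0.9):
    """Exponential discount factor: the same per-period rate at every horizon."""
    return delta ** t

def hyperbolic(t, k=1.0):
    """Simple hyperbolic discount factor 1 / (1 + k * t)."""
    return 1.0 / (1.0 + k * t)

def prefers_later(discount, delay, sooner=100.0, later=110.0):
    """True if `later` received at delay + 1 is valued above `sooner` at delay."""
    return later * discount(delay + 1) > sooner * discount(delay)

for name, d in [("exponential", exponential), ("hyperbolic", hyperbolic)]:
    print(name,
          "| waits when the choice is immediate:", prefers_later(d, 0),
          "| waits when both dates are 10 periods away:", prefers_later(d, 10))
```

With these numbers the exponential discounter makes the same choice at both horizons, while the hyperbolic discounter takes the smaller amount when it is immediate but prefers to wait when both payments are far off. That preference reversal is the behavioral signature usually associated with hyperbolic discounting.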

Could the reason be that economists are far more likely than many other disciplines to be asked to communicate their results to lay outsiders? Often in forms very succinct, perhaps simplistic?

It is easy to cast a point-estimate into a TV soundbite but hard to communicate an entire model of variation?

There also seems to be an additional layer of selection bias here. If Andrew and Peter Dorman say that HM modeling is more widespread in other Social Sciences than in Economics I’m wondering whom Andrew is looking at. Because most Sociologists and Political Scientists, even the empirically oriented ones would not even consider using something as complex as a hierarchical model for data analysis. It seems to me like Andrew is comparing the “methodological elite” of the other Social Sciences with the overall work in Economics where far more researchers at least use robust models, have solid . I don’t have more than my own impressions to back this up, though, hence I might be misled myself. Growing up between Sociologists and Political Scientists makes it odd for me to view things like Andrew does. The canonical model in empirical Sociology and Political Science seems to be linear regression or generalized linear regression models if the dependent variable does not allow for OLS. The parameters of interest usually are all the average effects of the “independent variables” and – of course – their p values. I’d be glad if more Sociologists and Political Scientists would actually only estimate *one* average effect and consider the other variables just as covariates or something like that.

One cannot specify a meaningful statistical test without an underlying causal theory.


Yikes! I’m not sure it makes sense to join a food fight, but why not. Some thoughts:

1. Economics is a diverse field, you can find lots of things to dislike if you look hard enough. I know I do!

2. Macroeconomics is still working out some very basic things. Even such basic questions as “where do business cycles come from?” and “how does monetary policy work” generate legitimate debate.

3. There’s a huge range of styles of data work, which is probably a good thing. At one end of the spectrum, there is work trying to summarize the properties of data in some useful way. Hodrick and Prescott, Sims, and Stock and Watson all do that in different ways. At the other end, people try to construct simple models that reproduce some of the more obvious facts. It’s harder than it looks and there’s a range of opinion on where we are and where we should be heading.

4. In this context, quibbles about utility functions seem to me to miss the point. As in other fields, there are people in economics working on things like this, but it’s not obvious they’re an important source of the problems we face. Nevertheless, it’s not hard to find work in macroeconomics on hyperbolic discounting, ambiguity (when you don’t know the probabilities), information processing issues, and lots of other things.

5. This is more difficult with dynamics — and macroeconomics is about dynamics. In any dynamic situation, people are going to care not only about the current situation but about what possibilities might arise in the future. That makes all of the standard identification problems worse, because what we expect the future to bring is a high-dimensional hidden variable. Data alone won’t solve this problem. What most people do is put extra structure on the problem to see if that helps.

6. Surely we could learn things from other fields, statistics included. I don’t think you’ll find it easy, but maybe it’ll be fun.

Dave:

I completely agree that research and practice in econometrics is diverse. Statistics is diverse. That said, I don’t think it’s unreasonable for Peter and me to be discussing general differences in attitudes between the two fields, even while recognizing that there is a lot of variation within each field.

That seems like a deliciously ironic comment to me, given the core topic of using average / point effects versus modelling variation.

Stylised fact or situated messiness? The diverse effects of increasing debt on national economic growth

Citation

Bell, A, Johnston, R & Jones, K 2014, ‘Stylised fact or situated messiness? The diverse effects of increasing debt on national economic growth’. Journal of Economic Geography.

Abstract

This paper reanalyses data used by Reinhart and Rogoff (2010c – RR), and later Herndon et al. (2013) to consider the relationship between growth and debt in developed countries. The consistency over countries and the causal direction of RR’s so called ‘stylised fact’ is considered. Using multilevel models, we find that when the effect of debt on growth is allowed to vary, and linear time trends are fully controlled for, the average effect of debt on growth disappears, whilst country-specific debt relations vary significantly. Additionally, countries with high debt levels appear more volatile in their growth rates. Regarding causality, we develop a new method extending distributed lag models to multilevel situations. These models suggest the causal direction is predominantly growth-to-debt, and is consistent (with some exceptions) across countries. We argue that RR’s findings are too simplistic, with limited policy relevance, whilst demonstrating how multilevel models can explicate realistically complex scenarios.

> tend to be more interested in variation, while economists tend to frame their empirical research around estimating a particular parameter of interest

In the meta-analysis of randomised clinical trials the methodological response surface modelling concerns (using Rubin’s 1989 terminology) really got in the way of underlying true response surface modelling (was conflated) not to mention the observational nature of the information.

So the wishful thinking was that using an average that accounted for the variation in its uncertainty would be less wrong than ignoring the variation or taking it as real and reasonably well estimated.

Peter’s post on the 19th is a nice statement of why economists persist in using consumer utility theory in a canonical manner, both as a basis for specification of the arguments in estimating equations, and as a basis on which to make statements about welfare improving allocations and reallocations of resources as a result of public policy. The short of the story is that the intuition of this is quite simple, but often unrecognized, as in some of the posts under this thread.

Doing the best one can under the circumstances is the intuition behind the formal statement of maximizing utility subject to a constraint. The constraint can be anything scarce: time, money, availability of life partners, distance, etc. I have yet to find anyone who will state that when making a decision they do not try to make the best choice they can under the circumstances they confront at decision time. And this intuition has an easy formulation in mathematics using Lagrangian multipliers.
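For reference, the formulation alluded to can be written as follows. The symbols are assumptions introduced here (u for utility, x for the bundle of goods, p for prices, m for the budget), and the first-order conditions presume a differentiable u, an interior solution, and a budget that is fully spent.

```latex
\begin{aligned}
&\max_{x}\; u(x) \quad \text{subject to} \quad p \cdot x \le m, \\
&\mathcal{L}(x, \lambda) = u(x) + \lambda\,(m - p \cdot x), \\
&\frac{\partial u}{\partial x_k} = \lambda\, p_k \ \text{ for each good } k,
  \qquad p \cdot x = m .
\end{aligned}
```

The multiplier \lambda is the marginal utility of relaxing the constraint by one unit, which is what lets the same setup cover scarce time or distance as readily as money.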

For more on this, see my and Andrew’s posts on March 7 and 8, 2012 in his post “Some economists are skeptical about microfoundations” on 6 March 2012.

Some say that there may be too much hype in hyperbolic discounting (http://arielrubinstein.tau.ac.il/papers/67.pdf). All we know is that simple models of decision making are false.

The question is whether they are useful (in some specific cases, and accompanied by appropriate caveats).

I think Peter Dorman’s conjecture is right: the acceptance of standard utility theory is probably mostly due to “the desire to have theories perform double duty, positive and normative”. Possibly that’s because many economists’ job is (or is perceived to be) to give policy advice even when they really do not have enough information to do so properly.

Sometimes the problem is not even “not enough information”, but too little bandwidth in communication with the potential users of economic analysis (e.g., the public). For example, pretend that an economist’s model could give accurate predictions of the form “if policy X, then outcome Y”, where the outcome Y is a detailed description of how much each type of agent will get of each good at each point in time. But the journalist will ask: “should we do X?” And the wretched economist will answer.

Peter Dorman’s assertion that microeconomics typically assumes representative agent models is mistaken.

In macroeconomics, it is often assumed that the aggregated variables behave as if they result from the choices of one “representative” agent—that’s generally only literally true under extraordinarily restrictive assumptions, and that’s a fundamental problem faced by DSGE and similar modeling strategies. But this problem does not arise in contexts in which we observe and model individual-level outcomes, which includes almost all of applied microeconometrics (the bulk of economic research).

To formalize a little using the canonical example: we observe consumption bundles x_i, incomes y_i, and prices p_i, where i indexes people. Assuming that the realized consumption bundles resulted from maximizing utility functions u_i(x_i) is NOT making a representative agent assumption, even if we assume all those functions have the same parametric form, or even if we assume they’re all the same function. Let X be the sum (or mean) of all the choices and let Y be the sum (or mean) of all incomes. Assuming a representative agent means modeling X as a result of maximizing utility subject to income Y.
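The distinction in the paragraph above can be written out as follows. This is only a sketch in the comment's own notation; the aggregate utility function U and the common price vector p in the last line are illustrative assumptions, not something specified in the thread.

```latex
% Individual-level model: one maximization problem per person i
x_i \in \arg\max_{x} \; u_i(x) \quad \text{subject to} \quad p_i \cdot x \le y_i

% Aggregates
X = \sum_i x_i, \qquad Y = \sum_i y_i

% Representative-agent assumption: the aggregate itself is modeled as one agent's choice
% (U and a common price vector p are assumed here for illustration)
X \in \arg\max_{x} \; U(x) \quad \text{subject to} \quad p \cdot x \le Y
```

The first line can hold for every i without the last line holding for any U, which is the aggregation point made in the comment.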

Economic theorists explored this issue in depth about forty years ago, concluding that even if all individual choices result from utility maximization, it does not follow that the aggregates behave as if they too are the results of utility maximization (that is, even if all the u_i exist and generate the x_i, it does not follow that some U() exists that generates X). But this issue _only_ arises when considering aggregates. It does not arise with individual-level data, which is one reason much of that literature from the 1970s now seems somewhat quaint: the massive increase in the availability of individual-level data, along with increases in computation and econometric methods, means we don’t need to worry so much about modeling aggregates.

On the other issue: should econometricians use more HLMs and similar strategies? Not in my opinion. Statisticians and some other -cians are often more interested in what amounts to descriptive statistics. The objects of inference in many HLMs and similar estimation strategies would be considered nuisance parameters in most applied microeconometrics, and correctly ignored to the extent possible (for example, cluster-correcting a covariance matrix estimate in lieu of estimating some detailed parametric form for the error structure). Variation in the causal parameters of interest is commonly explicitly modeled, in contrast to claims above, following in particular the vast literature on heterogeneous treatment effects.

Chris:

You write that economists are often interested in heterogeneous treatment effects. That’s great. In that case, the point of multilevel (hierarchical) models is to estimate that variation, using sparse data. If you’re only “allowed” to use least squares, maximum likelihood, etc., then you’ll be forced to do lots of pooling of data in order to get stable estimates. With multilevel modeling you can get more finely-grained estimates. This will be the case whether your goals are descriptive inference, or causal inference, or both.

Finally, you write, “Economic theorists explored this issue in depth about forty years ago, concluding that even if all individual choices result from utility maximization, it does not follow that the aggregates behave as if they too are the results of utility maximization.” But my point is that “utility” at an individual level is nothing but a model, and a very crude, behavioristic, pre-cognitive model at that.

Andrew, I find your reply baffling. How is least-squares or maximum likelihood incompatible with multilevel modeling? Why am I “allowed” to use only these estimation methods? In what way does modeling variation in which I’m not interested allow me more “fine-grained” estimates of effects in which I am interested?

Note too that econometricians often do include what in other literatures would be called “multilevel” effects–say, (econometric version of the term) fixed effects and a host of interactions for classrooms, schools, and states in a model of student achievement. But, as I noted, that variation would typically be viewed as a nuisance to deal with rather than the estimand of interest.

To some extent this is just differences in jargon; the “random coefficients” model common in econometrics also has an interpretation as a “multilevel” model, for example. It is my casual observation that what econometricians refer to as “random effects” models, which are also “multilevel” models, are much less common in econometric research than in other disciplines, but that’s a result of the focus in econometrics on causal modeling.

RE: utility: Peter Dorman’s point, which you quoted at length, about utility is conceptually confused. If the point you wanted to make was that utility is just a model and models are not reality, I’m not at all sure who you believe you’re disagreeing with. But dismissing all of utility theory on that basis is clearly inappropriate, as models are by definition just models!

Utility theory, or more generally, the notion that choices can be modeled as resulting from some optimization process, is extremely useful and flexible—for example, hyperbolic discounters are also maximizers!—and we have no superior alternative. I am not sure what your point is beyond “your model is not reality!,” which does not seem particularly interesting.

Chris:

Sorry for any confusion. Of course I don’t think you’re only “allowed” to use least squares etc. I was responding to your statement that economists should not use more HLMs. The linear part of “HLM” doesn’t really interest me so I was interpreting this as a statement that economists should not use more hierarchical models. I’m assuming that, instead of hierarchical models, you’d prefer least squares, maximum likelihood, etc.

My point is that, causal or otherwise, if you’re interested in estimating variation, multilevel models allow you to do this in the presence of sparse data. In your most recent comment, you say that “variation would typically be viewed as a nuisance to deal with rather than the estimand of interest” but in your earlier comment you referred to “the vast literature on heterogeneous treatment effects.” So I’m assuming that economists *are* often interested in estimating variation (hence that vast literature), and my point is that multilevel modeling can allow you to estimate that variation.

Of course if you’re not interested in studying variation, that’s another story. It has to depend on the application. Again, I was responding to your statement about “the vast literature on heterogeneous treatment effects.” It sounds like whoever is writing that literature is interested in variation, and these are the people who I think could benefit from multilevel modeling.

Finally, my point is not that utility is just a model; my point is that certain aspects of utility modeling (in particular, the lamentably standard practice of defining risk aversion as expected-utility behavior with a nonlinear utility function for money) is a model *that makes no sense*. In other settings, though, I think utility is an excellent model. Indeed, I have a whole chapter in our Bayesian Data Analysis with applications of expected-utility decision making. I think utility modeling can be great, but I think that certain applications of it make no sense at all.

Andrew, we are talking past each other. Your post quotes Peter Dorman’s views on microeconomics and (particularly if one follows the link) applied econometrics at length, apparently approvingly. Dorman’s claims about both economic theory and econometric practice are, in my view, very much mistaken. Are you or are you not defending his claims?

Econometricians are typically estimating causal effects, very much including how those causal effects vary across units. Econometricians are typically much less interested in documenting variation which is nuisance *relative to the causal question at hand*. To further my example, suppose I am interested in estimating the effect of class size on student achievement. I might consider a model like,

y_{ijkt} = b_{ijkt}S_{ijkt} + X_{ijkt}\beta + u_{ijkt}

where y is a measure of achievement, S is class size, X are controls, and u represents unmeasured causes of y. i indexes students, j classrooms, k schools, and t time.

In this context, I might be keenly interested in how the effect of interest — b_{ijkt} — varies across time, space, and the characteristics of students and schools. I could use heterogeneous treatment effects models to study that variation. Those models can also be interpreted as hierarchical; this is, again, to some extent a difference in jargon rather than approaches. However, I would typically be uninterested in detailed modeling of the disturbance u_{ijkt}; that is nuisance variation. I may still handle it parametrically with a variety of (econometric jargon) fixed and random effects, but it’s not the object of inference.
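To make the idea of a varying coefficient b concrete, here is a minimal simulation sketch. It is not taken from the thread and makes strong simplifying assumptions: a single grouping level k (say, schools), no controls X, normally distributed slopes and errors, and a crude moment-based shrinkage rule standing in for a full multilevel fit. All function names and parameter values are hypothetical. It compares a separate least-squares slope for each group with slopes that are partially pooled toward the overall mean when each group has only a few observations.

```python
import numpy as np


def simulate(n_groups=200, n_per_group=5, mu_b=-0.5, tau=0.3, sigma=1.0, seed=0):
    """Simulate y = b_k * S + u with a separate slope b_k for each group k."""
    rng = np.random.default_rng(seed)
    b_true = rng.normal(mu_b, tau, size=n_groups)            # slopes vary by group
    S = rng.normal(0.0, 1.0, size=(n_groups, n_per_group))   # centered "class size"
    u = rng.normal(0.0, sigma, size=(n_groups, n_per_group))  # unmeasured causes
    y = b_true[:, None] * S + u
    return b_true, S, y


def no_pooling(S, y):
    """Separate least-squares slope (through the origin) for each group."""
    sxx = (S ** 2).sum(axis=1)
    b_hat = (S * y).sum(axis=1) / sxx
    resid = y - b_hat[:, None] * S
    sigma2 = (resid ** 2).sum() / (S.size - S.shape[0])  # pooled residual variance
    return b_hat, sigma2 / sxx  # slope estimates and their sampling variances


def partial_pooling(b_hat, se2):
    """Shrink each group's slope toward the precision-weighted mean slope."""
    weights = 1.0 / se2
    mu_hat = (weights * b_hat).sum() / weights.sum()
    # Crude moment estimate of the between-group variance, floored at zero.
    tau2_hat = max(b_hat.var(ddof=1) - se2.mean(), 0.0)
    shrink = tau2_hat / (tau2_hat + se2)  # close to 0 for noisy groups
    return mu_hat + shrink * (b_hat - mu_hat)


def rmse(estimate, truth):
    """Root mean squared error of the group-level slope estimates."""
    return float(np.sqrt(np.mean((estimate - truth) ** 2)))


if __name__ == "__main__":
    b_true, S, y = simulate()
    b_np, se2 = no_pooling(S, y)
    b_pp = partial_pooling(b_np, se2)
    print("RMSE, no pooling:     ", rmse(b_np, b_true))
    print("RMSE, partial pooling:", rmse(b_pp, b_true))
```

Under these assumed settings the partially pooled slopes should track the simulated b_k more closely than the group-by-group estimates. Whether the distributional assumptions behind that gain are acceptable in a causal analysis is exactly what is being debated in this thread.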

Consider a classic (almost 20 year old) paper on exactly this topic, Angrist and Lavy (1997) (http://www.nber.org/papers/w5888.pdf). There is an error structure similar to what I describe above, treated as nuisance parameters, as I noted. There is also an extensive focus on how the causal effect of interest varies across subpopulations:

“The results of this application show that reductions in class size induce a significant increase in reading and math scores for 5th graders and a smaller increase in reading scores for 4th graders. In contrast, there is little evidence of any association between class size and the test scores of 3rd graders, although this finding may result from problems with the 1992 wave of the testing program. The estimates also suggest that the gains from small classes are largest for students from disadvantaged backgrounds.”

Dorman’s claims are simply mistaken.

RE: utility: I don’t know how to make sense of your comments. Risk aversion is valuing a lottery at less than its expected value, which in the context of expected utility maximization is equivalent to assuming a concave utility function. But even in this extremely narrow application “utility theory” is a far more diverse and less restrictive concept: what of models of non-expected utility, non-recursive utility, random utility, ambiguity aversion, and so on?

You appear to actually have a problem with certain very narrow applications of the theory, not “utility theory” itself, although utility theory itself seems to be the target of your post and comments.

Chris:

1. In your example, multilevel modeling can allow you to estimate that b function in settings of sparse data, for example allowing you to estimate how b varies geographically or over time, in settings where you don’t have a lot of data on each geographic or time unit.

2. That’s right, I have no objection to utility theory itself, indeed I have a chapter in my book that’s full of applications of utility theory. I do have an objection to the modeling of risk aversion using a nonlinear utility function for money. This objection is explained in my 1998 paper and it’s pretty straightforward. You could call this a narrow application and maybe it is, but my problem here is clear, and the problem is that, instead of simply viewing utility theory as a useful model that has its limits, many economists (including you, perhaps) seem to feel the need for utility theory to explain *all* economic behavior. In contrast, I think utility theory is great in certain areas but I think there are other cases (for example, everyday risk aversion) where utility theory doesn’t apply.

To repeat from my previous comment, I think utility modeling can be great, but I think that certain applications of it make no sense at all. Utility theory is a wonderful model in that it is useful in many settings, which is probably more than can be said for most models! I think utility theory is particularly useful if we recognize its limitations and don’t try to make it do things it can’t.

1. You’re again overlooking the point I keep making: econometricians commonly DO use what would be called “multilevel modeling” in other disciplines. Peter Dorman’s claims are mistaken.

2. Again, there is lots of economic theory modeling risk aversion which *is* utility theory but does not suffer from the point you make in that paper and does not define risk-aversion as curvature in a single-index function. Are you familiar with this literature? If so, perhaps you could address it.

I don’t know what it means to say “utility theory explains all economic behavior” so I cannot say whether I agree or disagree with that claim. I do think that modeling behavior as goal-directed is extremely useful in a wide variety of contexts, and that constrained maximization is an extremely useful and flexible approach to modeling goal-directed behavior. Perhaps you could list more examples of behavior which you believe to be completely resistant to this approach to demonstrate how mistaken my belief is?

Chris:

1. In one of your comments, you wrote that economists should not use “more HLMs and similar strategies,” and in another comment you wrote, “In what way does modeling variation in which I’m not interested allow me more ‘fine-grained’ estimates of effects in which I am interested?”

My comment above is a reply in that I discuss how economists can use multilevel modeling to get more fine-grained estimates of effects in which they *are* interested.

2. I think that utility theory is excellent and has many important areas of applications. In addition I agree with you that modeling behavior as goal-directed is extremely useful in a wide variety of contexts. I haven’t thought much about constrained maximization but your statement that “constrained maximization is an extremely useful and flexible approach to modeling goal-directed behavior” seems reasonable as well. So I don’t think I need to demonstrate how mistaken your belief is, since we seem to be pretty much in agreement on this point, that these ideas are extremely useful and flexible.

Utility theory happens to fail miserably as a model for everyday risk aversion. That’s ok. Utility theory is an excellent model but there is no reason to expect it to be appropriate for all economic behavior.

Andrew,

1. I again point out that econometricians do commonly use what in other literatures would be called “multilevel models.” Other literatures are often interested in data reduction rather than causal inference. As you well know, assumptions common in hierarchical models which are appropriate for data reduction are commonly inappropriate for causal inference. For example, in this paper (http://www.stat.columbia.edu/~gelman/surveys.course/Gelman2006.pdf) you discuss a common orthogonality assumption in hierarchical models, refer the reader to a standard *econometrics* textbook for a discussion of “this sort of correlation in multilevel models,” and caution the reader, regarding some of the estimated effects, that “these effects cannot necessarily be interpreted causally for observational data, even if these data are a random sample from the population of interest.” Correct me if I’m mistaken, but I believe that the sort of model you have in mind for estimating “fine grained” effects relies on distributional assumptions typically eschewed in econometrics, and that this class of model is better for data reduction than for estimating causal effects. I am specifically referring to distributional and orthogonality restrictions in these sorts of models which are fine for descriptive, but not causal, analysis when I note econometricians ought not borrow more of these methods.

2. Your post accuses economists of ignoring reality: economics is “like doing astronomy with Ptolemy’s model and epicycles. The fundamentals of the model are not approximations to something real, they’re just fictions.” Not long ago Judea Pearl said something similar on this blog about statisticians, leading you to insist on “a bit of politeness and a bit of respect to people who spend their careers studying reality and just happen not to use your favored methods.” Yet a few weeks later you exhibit an equivalent lack of politeness and respect for people who spend their careers studying reality and just happen not to use your favored methods. The particular issue here is not even vaguely statistical, it’s microeconomic theory.

Again, “utility theory” is not limited to 18th century variants of expected utility theory. Problems such as you bring up with the basic expected utility model of risk under uncertainty are, of course, well known. In contrast to your allegation, a large literature has arisen in response studying models which, as I noted, are utility models but which do not suffer from the problems you bring up (A now rather old review of some of this vast literature can be found in Starmer (2000), http://static.luiss.it/hey/ambiguity/papers/Starmer_2000.pdf). Your claim that “utility theory fails miserably as a model for everyday risk aversion” is wrong. You mean “the standard expected utility model fails miserably in certain contexts for everyday risk aversion.” There are many models using utility theory which do not exhibit this failure.

Chris:

1. I find this discussion difficult because you seem to be arguing in two different directions. From one side, you write, “econometricians do commonly use what in other literatures would be called ‘multilevel models.’” If so, that’s great. But you later write that “econometricians ought not borrow more of these methods,” that these models bother you because they make “distributional and orthogonality restrictions.” In some sense this is a matter of taste: I am interested in modeling variation (in causal settings and otherwise) and I am willing to try out (and check) some models. To me, the gain I get from being able to estimate things (causal and otherwise) that vary by year and by geographic location, etc., is worth the effort I have to put in to fit and check a model. To you, the gain from being able to estimate things that vary by year and by geographic location, etc. is not worth the effort and the assumptions. I recognize that empirical economists have done a lot of important work, and maybe the choice to aggregate rather than to use multilevel models is a good choice. I think much could be gained by using multilevel models in economics (and it sounds like others agree with me, given your statement that “econometricians do commonly use what in other literatures would be called ‘multilevel models’”) and I’m putting that idea out there.

2. I have a lot of respect for models of reality. I don’t have so much respect for models that don’t make sense. As noted, I like utility theory a lot but I’m not so impressed when it is used to do things it can’t do. I read the Starmer paper you linked to and I agree with the following statement from that paper: “EUT is implausible as a general account of behavior under risk.” “Implausible as a general account” is more polite than “fails miserably” but I think it’s the same general idea: This is a model that doesn’t work. Yes, this is well known but nonetheless in many standard treatments, a nonlinear utility function for money is taken as the first-line model for uncertainty aversion. The message, while well-known by many, seems to be not so well known by many others. See, for example, the Wikipedia entry on risk aversion, which goes on and on using the long-refuted nonlinear-utility-for-money model and then has one little paragraph on why that model might be wrong. I find this sort of thing exhausting.

Finally, I don’t think I’m being impolite. We’re all doing our best and are trying to model reality. But I think it’s important to realize the limitations of one’s model.

[…] Interesting reading – two posts on the differences between statistics and econometrics: Rob Hyndman and Andrew Gelman. […]

Andrew, I am sure you are aware — for example, because I just cited you noting it! — that econometricians commonly use models which model variation in causal effects, such as “that vary by year and by geographic location, etc.” I do not see much point in continuing this discussion until you acknowledge that fact in this thread. Your assumption that I personally ignore that sort of variation because “it’s not worth the effort” is mistaken; for example, I’m currently working on a paper which uses what econometricians refer to as a “heterogeneous treatment effects” model and you would refer to as a “multilevel model” specifically to estimate whether a particular causal effect changed over time and how it varies with certain individual-level characteristics. I am not, on the other hand, and unlike many approaches in other disciplines, interested in the temporal or geographic structure of the error term; I handle that in a parsimonious fashion, and don’t report anything about that structure in the paper.

On point 2, you continue to talk as if “utility theory” and “canonical expected utility theory applied to financial problems” are equivalent phrases, but they are not. Yet again, there are many “utility theories” in this context which are not subject to your complaint. Whether the basic expected utility model “works” depends on the topic of investigation. In some contexts it is appropriate despite its limitations, in others, it is not—it’s “refuted” in much the same sense as, say, any static model is “refuted” by the existence of time. As evidenced by the decades of literature in leading journals on exactly this topic, economists are not unaware of these limitations (maybe Wikipedia editors are, I don’t know or care). Not only is your claim mistaken, it is in fact impolite of you to use it as an excuse to liken economics to Ptolemaic pseudo-science. Why do you think your comments are less impolite than Pearl’s? They’re almost the same comments!

Incidentally, your paper makes erroneous claims. You correctly demonstrate that canonical expected utility theory can generate bad predictions on some scales. You then claim without any basis that I can see that “[the mistake] is that fearing uncertainty is not necessarily the same as “risk aversion” in utility theory” (how do you know that’s “the” “mistake”?). Then things go badly wrong: “the latter can be expressed as a concave utility function for money, whereas the former implies behavior that is not consistent with any utility function.” That is just wrong: you yet again incorrectly equate—this time, sadly, in a published paper rather than a blog post—the canonical Arrow-Pratt expected utility model with *all possible* models using utility functions. There are in fact many utility functions which can handle exactly the sort of almost-paradox you describe, and which capture both “fearing uncertainty” and “risk aversion” as distinct concepts (the former is known as ‘ambiguity aversion’). These models are typically generalizations of the canonical expected utility model, but they are still utility models.

Chris:

1. I acknowledge your statement that econometricians commonly use models which model variation in causal effects, such as “that vary by year and by geographic location, etc.” I also acknowledge your statement that “econometricians ought not borrow more of these methods.” I agree with you that in economics (and also in political science) there’s been an increasing interest in multilevel models in recent years. I’d just like this increase to go faster! Ideally then we’d be seeing less of the sorts of misconceptions shown by commenter Zmk above who wrote, “You cannot just make everything time-varying. If anything because you don’t have enough observations to be able to estimate everything that way.” The whole point of multilevel models is to be able to estimate (and express uncertainty in) what you want without being so concerned about this sort of sample size restriction. I’ve read lots of econ papers and seen lots of econ talks and my impression is that it’s standard practice to do lots of pooling of data in order to obtain statistical significance. I hope that in the future there’s more use of multilevel models instead. I agree with you that this is already happening but I’m impatient and would like to see more of it.

2. You might not know or care about what’s on Wikipedia but it’s my impression that Wikipedia (a) is very influential and (b) reflects a consensus of belief. So for the entry on risk aversion I’m concerned on both counts. On the particular example from my paper on teaching, of course any behavior can be modeled as utility in a circular manner (a person with uncertainty aversion has a negative utility for uncertainty, etc.) but such a tautological explanation makes the theory useless. Again, let me quote the paper you pointed to: “EUT is implausible as a general account of behavior under risk.” I find it frustrating that you accept this statement when it appears in an economics journal but not when it is stated (in a more informal manner) in a journal of statistics teaching.

Finally, I don’t think Ptolemaic work was “pseudo-science,” it was real science—for its time! Similarly, utility theory was real science—in particular, real psychophysics—for the 1940s. And it remains a useful structure in many ways. But I don’t see it as serious science for describing human psychology in the wake of the cognitive revolution of the 1950s.

Chris:

Also, let me thank you for putting in the effort to write and post these comments. I think this sort of discussion can be very helpful. Even in a situation such as this where neither of us seems to be able to understand what the other is saying, this can be useful in helping us realize where there are problems of communication.

[…] via @Anne__Lavigne “Differences between econometrics and statistics” https://andrewgelman.com/… […]

[…] couple months ago in a discussion of differences between econometrics and statistics, I alluded to the well-known fact that everyday uncertainty aversion can’t be explained by a […]

One thing missing from this discussion is the so-called “asset integration hypothesis” which says that what people care about (or should care about) is not the utility of the lottery outcomes, but the impact of the lottery outcomes on total wealth.

Suppose your wealth is $100,000 and you are considering a lottery where you get $10 with probability 1/2 and lose $5 with probability 1/2. The “right” way to view this is to regard the lottery as having outcomes $100,010 and $99,995. One would have to be extraordinarily risk averse to reject that lottery. So for small gambles of the sort we see in lab experiments, one would expect to see expected value driving choices.
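To put a rough number on “extraordinarily risk averse”, here is a small sketch. It assumes, purely for illustration, an expected-utility maximizer with constant relative risk aversion (CRRA) defined over total wealth; that functional form and the function names are assumptions added here, not part of the comment. The code searches for the CRRA coefficient at which the agent is just indifferent between taking and refusing the lottery.

```python
import math


def accepts(gamma, wealth=100_000.0, gain=10.0, loss=5.0, p_gain=0.5):
    """True if a CRRA expected-utility maximizer strictly prefers the lottery.

    Utility is defined over final wealth divided by initial wealth, which is
    harmless for CRRA preferences and avoids numerical under/overflow.
    """
    up = math.log((wealth + gain) / wealth)    # log of relative wealth after a win
    down = math.log((wealth - loss) / wealth)  # log of relative wealth after a loss
    if gamma == 1.0:
        # Log utility: compare expected log relative wealth with log(1) = 0.
        return p_gain * up + (1.0 - p_gain) * down > 0.0
    k = 1.0 - gamma
    expected_utility = (p_gain * math.exp(k * up)
                        + (1.0 - p_gain) * math.exp(k * down)) / k
    return expected_utility > 1.0 / k  # 1/k is the utility of refusing


def threshold_gamma(lo=2.0, hi=1e5, tol=1e-6):
    """Bisect for the risk-aversion coefficient at which the agent is indifferent."""
    if not accepts(lo) or accepts(hi):
        raise ValueError("indifference point is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if accepts(mid):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print("Accept with gamma = 5?", accepts(5.0))
    print("Indifference at gamma of about", round(threshold_gamma()))
```

Under these assumptions the indifference point is a coefficient of relative risk aversion in the thousands, far above the single-digit values usually treated as plausible in applied work, which is consistent with the claim that almost anyone integrating the gamble into total wealth would accept it.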

I grant you that ordinary people don’t necessarily behave this way, but it is a good starting point for thinking about choice under uncertainty (and doesn’t require any of the expected utility assumptions).