## What to learn in your statistics Ph.D. program?

Cosma Shalizi (of the CMU statistics dept) and I had an exchange about the role of measure theory in the statistics Ph.D. program. I have to admit I’m not quite sure what “measure theory” is but I think it’s some sort of theoretical version of calculus of real variables. I had commented that we’re never sure what to do with our qualifying exam, and Cosma wrote,

I think we have a pretty good measure-theoretic probability course, and I wish more of our students went on to take the non-required sequel on stochastic processes (because that’s the one I usually teach). I do think it’s important for statisticians to understand that material, but I also think it’s actually easier for us to teach someone how a martingale works than it is to teach them to be interested in scientific questions and to not get a freaked out, “but what do I calculate?” response when confronted with an open research problem.
Here it’s been suggested that we replace our qualifying exams with having the student prepare a written review of some reasonably-live topic from the literature and take an oral exam on it, which would be more work for us but come a lot closer to testing what the students actually need to know.

I replied,

I agree that it’s hard to teach how to think like a scientist, or whatever. But I don’t think of the alternatives as “measure theory vs. how-to-think-like-a-scientist” or even “measure theory vs. statistics”. I think of it as “measure theory vs. economics” or “measure theory vs. CS” or “measure theory vs. poli sci” or whatever. That is, sure, all other things being equal, it’s better to know measure theory (or so I assume, not ever having really learned it myself, which didn’t stop me from proving 2 published theorems, one of which is actually true). But, all other things being equal, it’s better to know economics (by this, I mean economics, not necessarily econometrics), and all other things being equal, it’s better to know how to program. Etc. I don’t see why measure theory gets to be the one non-statistical topic that gets privileged as being so required that you get kicked out of the program if you can’t do it.

Cosma then shot back with:

I also don’t think of the alternatives as “measure theory vs. how-to-think-like-a-scientist” or even “measure theory vs. statistics”. My feeling — I haven’t, sadly, done a proper experiment! — is that it’s easier to, say, take someone whose math background is shaky and teach them how a generating-class argument works in probability than it is to take someone who is very good at doing math homework problems and teach them the skills and attitudes of independent research.

You say, “I think of it as “measure theory vs. economics” or “measure theory vs. CS” or “measure theory vs. poli sci” or whatever.” I’m more ambitious; I want our students to learn measure-theoretic probability, and scientific programming, and whatever substantive field they need for doing their research, and, of course, statistical theory and methods and data analysis. Because I honestly think that if someone is going to engage in building stochastic models for parts of the world, they really ought to understand how probability _works_, and that is why measure theory is important, rather than for its own sake. (I admit to some background bias towards the probabilist’s view of the world.) At the same time it seems to me a shame (to use no stronger word) if someone, in this day and age, gets a ph.d. in statistics and doesn’t know how to program beyond patching together scripts in R.

P.S. I think measure theory should be part of the Ph.D. statistics curriculum but I don’t think it should be a required part of the curriculum. Not unless other important topics such as experimental design, sample surveys, statistical computing and graphics, stochastic modeling, etc etc are required also. It’s sad to think of someone getting a Ph.D. in statistics and not knowing how to work with mixed discrete/continuous variables (see Nicolas’s comment below) but it seems equally sad to see Ph.D.’s who don’t know what Anova is, who don’t know the basic principles of experimental design (for example, that it’s more effective to double the effect size than to double the sample size), who don’t know how to analyze a cluster sample, and so forth.

Unfortunately, not all students can do everything, and any program only gets some finite number of applicants. If you restrict your pool to those who want to do (or can put up with) measure theory, you might very well lose some who could be excellent statistical researchers. It would be sort of like not admitting Shaq to your basketball program because he can’t shoot free throws.
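The design principle mentioned in the P.S. (doubling the effect size helps more than doubling the sample size) can be illustrated with a rough power calculation. This is only a sketch under stated assumptions: a two-sided z-test comparing two group means with known, equal standard deviation. The function name `power_two_sample` and the numbers used are hypothetical, chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sample(effect, sigma, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test for a difference in two means.

    Assumes known, equal standard deviation `sigma` in both groups
    (an illustrative simplification, not a general recipe).
    """
    std_normal = NormalDist()
    # Standard error of the difference between two group means.
    se = sigma * sqrt(2.0 / n_per_group)
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    # The standardized effect grows linearly in `effect` but only
    # like sqrt(n) in the sample size.
    shift = effect / se
    return (1.0 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)


# Hypothetical numbers, chosen only to make the comparison visible.
baseline = power_two_sample(effect=0.2, sigma=1.0, n_per_group=100)
double_n = power_two_sample(effect=0.2, sigma=1.0, n_per_group=200)
double_effect = power_two_sample(effect=0.4, sigma=1.0, n_per_group=100)
quadruple_n = power_two_sample(effect=0.2, sigma=1.0, n_per_group=400)

print(f"baseline:      {baseline:.3f}")
print(f"double n:      {double_n:.3f}")
print(f"double effect: {double_effect:.3f}")
print(f"quadruple n:   {quadruple_n:.3f}")
```

Under these assumptions, doubling the effect size gives the same power as quadrupling the sample size, because the standard error shrinks only with the square root of n.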

1. John Johnson says:

I have a theory (!) that you tend to learn from a mathematics course the material from the prerequisite. To this end, I'm glad I took measure theory to understand real analysis and functional analysis to truly understand the notion of subspaces, inner products, orthogonality, and ultimately, spectral theory. It all seemed useless at the start of my career in industry, as did other abstract notions such as probability distributions over groups, but I'm realizing as problems get "weirder," I'm glad to have that deep understanding so that I can, if I need to, build probability models from scratch.

I've taken (and passed) a year's worth of measure theory courses, and passed a written prelim on them as well, but I really couldn't tell you (a) exactly what measure theory is or (b) what relevance it has to statistics. Perhaps I'm just a particularly bad or lazy student, but my impression is that many students pass through core theory courses with their statistical intuition completely unchanged, just rather frazzled from the whole procedure.

Rather than worrying about the fine points of exactly what is taught, I think it's far more important to worry about how well it is taught. Use assessment that encourages deep learning. Get professors who are both passionate about their subject and passionate about teaching. Encourage regular student and faculty review of the quality of courses, and ensure that the quality of teaching is taken into account in tenure.

3. Andrew says:

John,

I agree with your theory about learning the prerequisite. I've been saying this since I was a student, and it's advice I've been giving to students for years. Regarding measure theory itself, I don't personally feel at a loss for never having learned it, but I did take a lot of math in college so maybe I got the necessary background without taking that particular course.

I suppose that any particular subject is useful for some people and not for others. I had a friend in grad school (in the stat program) who told me that he never saw the use of imaginary numbers. My Ph.D. thesis (on image reconstruction) was full of Fourier transforms, so I could assure him that imaginary numbers were indeed useful.

Finally, I agree with you about the importance of teaching but I don't really know what's going to happen anywhere along those lines.

4. Interestingly, we had a debate recently at the ENSAE (Paris) about removing 'measure theory' (MT) from the curriculum. In France (and possibly in Germany, Italy, etc.), you're supposed to teach MT, then probability, then statistics (to undergraduates). In England (I was at Bristol for 3 years) you skip MT.
Our conclusion was to keep MT, because most of our students end up in finance, and one cannot deal with Brownian motions and similar objects without a good grasp of MT. But, even if you only deal with standard (real, or discrete) r.v.'s in your career, I still think it's a good thing to learn a little bit of it BEFORE learning Prob. Let me give a very simple example. Let X ~ N(0,1) and Y=X if X>0, Y=0 otherwise. What is the density of Y? What is the dominating measure? Without MT, one may guess the answer, but hardly justify it. Try this example with the students, you'll see how they react. (Mine do not understand the intuition, and I need to get back to MT to justify it.)
To go back to our debate at the ENSAE, we kept MT, but changed the objective: teach MT with a strong focus on Probability, e.g. all examples are taken from Probability, etc. Seems to work so far. Note that half our students have a background in Economics.
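For reference, one way to write down the answer to the example above. This is a sketch that assumes the dominating measure is taken to be $\mu = \lambda + \delta_0$, i.e. Lebesgue measure plus a point mass at zero. The symbols $\lambda$, $\delta_0$, $\mu$ and $\varphi$ are introduced here for illustration and do not appear in the original comment.

$$
P(Y = 0) = P(X \le 0) = \tfrac{1}{2},
\qquad
f_Y(y) = \frac{dP_Y}{d\mu}(y) = \tfrac{1}{2}\,\mathbf{1}\{y = 0\} + \varphi(y)\,\mathbf{1}\{y > 0\},
\qquad
\varphi(y) = \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}.
$$

As a check, the density integrates to one with respect to $\mu$:

$$
\int f_Y \, d\mu = \tfrac{1}{2} + \int_0^{\infty} \varphi(y)\, dy = \tfrac{1}{2} + \tfrac{1}{2} = 1.
$$

With respect to Lebesgue measure alone, $Y$ has no density, because $P(Y = 0) > 0$ while the set $\{0\}$ has Lebesgue measure zero.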

5. Giovanni says:

Being a CS graduate who is trying to find his way among "complex systems" models and so forth, I found it very useful to study measure theory by myself, many years after I had taken a shaky probability course during my bachelor's.

And when somebody now tells me that a random variable is just a "variable" that takes values randomly, I see in them the me of a couple of years ago… maybe the probability background I had before was really too poor, but I see a serious bug in how probability and statistics are being taught to non-mathematicians or non-statisticians. Learning measure theory, in my case at least, helped a lot.

ciao

G

6. Andrew says:

Nicolas, Giovanni,

7. dcase says:

This discussion is similar to the debate between theory and practice often heard in economics departments. As an econ student in the midst of dissertation research, I often wish that we had learned more practical skills alongside the theory. Since finishing my first year, I have taken 4 graduate-level econometrics courses and not one spent any time pushing us toward developing reasonable programming skills.

Thus, when I sit down to do applied research I know the estimation and testing theory, the asymptotics, and so forth, but programming anything more complicated than maximizing a likelihood function leaves me initially clueless. Hence, I've spent nearly as much time teaching myself programming as I have thinking about substantive questions. This struggle has certainly made me a better researcher, but a basic foundation would have allowed me to program more complicated estimators faster and more efficiently.

8. MDM says:

One of the major gaps, I believe, in my math-stat education was the lack of historical context to what I was taught (and occasionally retained). This was brought home to me when I recently picked up and devoured the book, The Calculus Gallery: Masterpieces from Newton to Lebesgue, by William Dunham (Princeton, 2005). It put the development of measure theory in its proper context, showing how and why it developed.

9. Blaise says:

Suppose a fly is found to be on one wall in a room at a certain time and at a later time it is found to be on a different wall. Assume that the fly cannot have left the room and that its route from point A on the first wall to point B on the second wall must be a continuous curve. What is the probability that the fly's route from the first wall to the second wall involved touching a third wall? Working out this probability involves counting the number of possible continuous curves from A to B and determining the proportion that touch a third wall.

Measure theory is about attaching measures of 'size' to sets. A suitable measure for a finite set would be to count the number of elements. For a continuous set which is an interval, the length of the interval would be a suitable measure. More complex sets may contain combinations of both intervals and isolated points. Measure theory ideas are fundamental to the calculation of sums and integrals. Probabilities are ratios of measures of sets, so measure theory is fundamental to understanding the mathematical foundations of statistics.

A PhD program that I've always thought would be interesting would be something like pick two out of three from methods, theory and computing. The combination of theory and computing seems particularly undeveloped at the moment, but I think would be rather useful.

On the other hand, it seems difficult to teach modern methods (for modern sized problems) without using computers. Surely the days of teaching linear models classes using little or no data (i.e. datasets with fewer than 1000 observations and fewer than 10 variables) are over? Similarly, why do most classes teach a single inferential technique? Where are the modelling courses that connect frequentist, Bayesian and randomisation-based approaches for a given type of model?

11. Daniel Lakeland says:

Nicolas:

I disagree that you "need" measure theory for your N(0,1) example. Clearly the density of the distribution is a mixture of 50% 0 and 50% the right half of a normal distribution.

I don't even know what the dominant measure means (unfortunately) but I'm pretty sure that if given the choice of taking a year of measure theory and real analysis vs a year of computer science and applied statistics, I would get more value out of the CS and applied stats. (and I was a math major in undergrad).
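The mixture description of the N(0,1) example can be checked empirically. The following is a minimal simulation sketch, not a proof. The function names `simulate_y` and `summarize`, the seed, and the number of draws are hypothetical choices made for illustration.

```python
import random
from math import pi, sqrt


def simulate_y(n_draws, seed=0):
    """Draw X ~ N(0,1) and return Y = X if X > 0, else 0."""
    rng = random.Random(seed)
    draws = (rng.gauss(0.0, 1.0) for _ in range(n_draws))
    return [x if x > 0.0 else 0.0 for x in draws]


def summarize(ys):
    """Return (share of draws exactly at 0, mean of the strictly positive draws)."""
    positives = [y for y in ys if y > 0.0]
    share_at_zero = (len(ys) - len(positives)) / len(ys)
    mean_positive = sum(positives) / len(positives)
    return share_at_zero, mean_positive


ys = simulate_y(n_draws=200_000)
share_at_zero, mean_positive = summarize(ys)

# Theory: P(Y = 0) = 1/2, and given Y > 0 the draw is half-normal,
# whose mean is sqrt(2 / pi).
half_normal_mean = sqrt(2.0 / pi)

print(f"share at zero:     {share_at_zero:.3f} (theory 0.500)")
print(f"mean of positives: {mean_positive:.3f} (theory {half_normal_mean:.3f})")
```

The simulation agrees with the informal mixture picture: about half the mass sits exactly at 0 and the rest follows the right half of the normal curve. What it cannot show is the point at issue in the thread, namely in what sense such a mixed distribution has a "density" at all.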

12. David Whitaker says:

Bad: asserting that a subject is unimportant when, by your own admission, you aren't even sure what it is. Worse: asserting that a subject is unimportant because you aren't even sure what it is.

13. Andrew says:

David,

I'm not sure if you're responding to me or to one of the commenters, but just in case you're responding to my post:

I did not say anywhere that measure theory is unimportant.

I just said that I don't think it should be required to get a statistics Ph.D. There are lots of other topics (for example, experimental design, sampling, statistical computation, …) which are certainly important in statistics. I think it's great that students have the _opportunity_ to learn measure theory–I respect the opinion of others who say that it's important and useful to them.

But having the opportunity to learn something is not the same thing as being required to learn it, or being kicked out of the program if you can't learn it. I think my Shaq analogy holds up.

14. John Johnson says:

Well, now we get back to what does a Ph.D. mean. To my understanding (having gone through the pain of getting one) it means that you have all of the following:
1) a sound understanding of at least most of the basics of the field
2) a very thorough understanding of a small part of the field
3) ability to catch on fairly quickly to most areas of the field
4) demonstrated an ability to contribute to the field

Certainly measure theory helps (because it is the mathematical underpinning of abstract probability theory), but I have to agree it isn't required at least for applied stats.

For a Ph.D. in probability (especially theoretical), I don't see how you could get around such a requirement.

15. Jason Connor says:

I went through the 1st (required) part of that sequence at CMU that Cosma speaks of.

I understand there is limited time and the faculty must prioritize. Likewise I understand Cosma's concern about the statistician needing a thorough understanding of how probability works.

But perhaps we ought to ask ourselves which of the competing non-statistical topics (measure theory, economics, study design, programming, etc) are most often needed post graduation. For most PhD stat grads perhaps measure theory isn't the answer. In fact how many grads from PhD programs use it post graduation vs. those other topics? I might even wager to say that a more rigorous curriculum in scientific writing and presentation would be more valuable over the course of one's career, though few programs might waste valuable credit hours on such.

To take the contrary view though, perhaps it's easier for a graduate to learn economics or study design or programming or writing/presenting on his own or from his peers in the workplace. It's certainly not easy to learn measure theory on one's own (or with a great professor for that matter).

16. Marcel says:

At the same time it seems to me a shame (to use no stronger word) if someone, in this day and age, gets a ph.d. in statistics and doesn't know how to program beyond patching together scripts in R.

Even scarier is the person who will get their PhD by patching together someone else's WinBugs code…