A colleague recently sent me a copy of some articles on the estimation of treatment interactions (a topic that’s interested me for a while). One of the articles, which appeared in the Lancet in 2000, was called “Subgroup analysis and other (mis)uses of baseline data in clinical trials,” by Susan F. Assmann, Stuart J. Pocock, Laura E. Enos, and Linda E. Kasten. . . .

Hey, wait a minute–I know Susan Assmann! Well, I sort of know her. When I was a freshman in college, I asked my adviser, who was an applied math prof, if I could do some research. He connected me to Susan, who was one of his Ph.D. students, and she gave me a tiny part of her thesis to work on.

The problem went as follows. You have a function f(x), defined for x going from 0 to infinity. Between 0 and 1, f(x)=x. Then, for x greater than 1, f'(x) = f(x) - f(x-1). The goal is to figure out what f(x) does. I think I’m getting this right here, but I might be getting confused on some of the details. The original form of the problem had some sort of probability interpretation, I think–something to do with a one-dimensional packing problem; maybe f(x) was the expected number of objects that would fit in an interval of size x, if the objects were drawn from a uniform distribution. Probably not that, but maybe something of that sort.

One of the fun things about attacking this sort of problem as a freshman is that I knew nothing about the literature on it or even what it was called (a differential-difference equation, or it can also be formulated using an integral). Nor was I set up to do any simulations on the computer. I just solved the problem from scratch. First I figured out the function in the range [1,2], [2,3], and so forth, then I made a graph (pencil on graph paper) and conjectured the asymptotic behavior of f. The next step was to prove my conjecture. It ate at me. I worked on the problem on and off for about eleven months, then one day I finally did it: I had carefully proved the behavior of my function! This accomplishment gave me a warm feeling for years after.

I never actually told Susan Assmann about this–I think that by then she had graduated, and I never found out whether she figured out the problem herself as part of her Ph.D. thesis or whether it was never really needed in the first place. And I can’t remember if I told my adviser. (He was a funny guy: extremely friendly to everyone, including his freshman advisees, but one time we were in his office when he took a phone call. He was super-friendly during the call, then after the call was over he said, “What an asshole.” After this I never knew whether to trust the guy. If he was that nice to some asshole on the phone, what did it mean that he was nice to us?) I switched advisers. The new adviser was much nicer–I knew him because I’d taken a class with him–but it didn’t really matter since he was just another mathematician. I was lucky enough to stumble into statistics, but that’s another story.

Anyway, it was funny to see that name–Susan Assmann! I did a quick web search and I’m pretty sure it is the same person. And her paper was cited 430 times–that’s pretty impressive!

P.S. The actual paper by Assmann et al. is reasonable. It’s a review of some statistical practice in medical research. They discuss the futility of subgroup analysis given that, compared to main effects, interactions are typically (a) smaller in magnitude and (b) estimated with larger standard errors. That’s pretty much a recipe for disaster! (I made a similar argument in a 2001 article in Biostatistics, except that my article went in depth for one particular model and Assmann et al. were offering more general advice. And, unlike me, they had some data.) Ultimately I do think treatment interactions and subgroup analysis are important, but they should be estimated using multilevel models. If you try to estimate complex interactions using significance tests or classical interval estimation, you’ll probably just be wasting your time, for reasons explained by Assmann et al.
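To put rough numbers on point (b), here is a minimal sketch. It assumes a balanced two-arm trial with a common outcome standard deviation sigma and a single binary subgroup that splits the sample in half. None of these specifics come from Assmann et al., and the function names are made up for illustration.

```python
import math

def se_main_effect(sigma, n_total):
    # Standard error of (treated mean - control mean),
    # with n_total / 2 patients in each arm (assumed balanced design).
    n_arm = n_total / 2
    return sigma * math.sqrt(1 / n_arm + 1 / n_arm)

def se_interaction(sigma, n_total):
    # Interaction = (treatment effect in subgroup A) - (treatment effect in subgroup B).
    # Each subgroup is assumed to hold half the sample, so each within-subgroup
    # effect is estimated from n_total / 2 patients; the two estimates are independent.
    se_within = se_main_effect(sigma, n_total / 2)
    return math.sqrt(se_within**2 + se_within**2)

sigma, n_total = 1.0, 400   # illustrative values only
main = se_main_effect(sigma, n_total)
inter = se_interaction(sigma, n_total)
print(main, inter, inter / main)
```

Under these assumptions the interaction's standard error is twice that of the main effect. If the interaction were also, say, half the size of the main effect (a hypothetical figure), its estimate would be only a quarter as many standard errors away from zero.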

I might be missing something here, but isn't f(x) = x the solution to the equation f'(x) = f(x) - f(x-1)?

Yeah, you're right. It was something close to this but I'm missing some of the details. Maybe f(x)=0 for x between 0 and 1, then jumped to f(x)=1 at x=1. Then you have f'(x)=1 at 1, and the function goes from there. Or something else, I can't quite remember. I wonder where my notes on that are? I wrote the whole thing up at one point; I was so thrilled to have figured it out. The proof was tricky to do from scratch (but I assume it's close to trivial for people who work in that area for real).
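For anyone who wants to poke at this numerically, here is a rough sketch that compares the two starting conditions mentioned above. It marches f'(x) = f(x) - f(x-1) forward with plain forward Euler on a grid of step 1/n, so that x-1 always falls on an earlier grid point. The function names and the choice of Euler are my own assumptions for illustration, not the method used in the original work, and the "step" start is only the half-remembered guess from this thread.

```python
import math

def solve_delay(history, x_max, n=1000):
    """Forward-Euler sketch for f'(x) = f(x) - f(x - 1) on [1, x_max].

    history(x) supplies f on [0, 1]. The grid step is 1/n, so the delayed
    value f(x - 1) is simply the entry n positions back in the list.
    """
    h = 1.0 / n
    steps = int(round(x_max * n))
    f = [history(k / n) for k in range(n + 1)]    # values on [0, 1]
    for k in range(n, steps):
        f.append(f[k] + h * (f[k] - f[k - n]))    # f[k - n] approximates f(x_k - 1)
    return f

def ramp(x):
    # Starting condition as stated in the post: f(x) = x on [0, 1].
    return x

def step(x):
    # Guessed alternative: f = 0 on [0, 1), jumping to 1 at x = 1.
    return 1.0 if x >= 1.0 else 0.0

n = 1000
f_ramp = solve_delay(ramp, 3, n)
f_step = solve_delay(step, 3, n)

print("ramp start, f(3):", f_ramp[3 * n])
print("step start, f(2):", f_step[2 * n], "vs e =", math.e)
```

With the ramp start the numerical solution just keeps following f(x) = x, as the comment says. With the step start the delayed term is zero on [1,2], so the equation there reduces to f' = f with f(1) = 1, giving f(x) = exp(x-1); the Euler values should track that only approximately, with the error shrinking as n grows.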