A few months ago, I blogged on John Gottman, a psychologist whose headline-grabbing research on marriages (he got himself featured in Blink with a claim that he could predict with 83 percent accuracy whether a couple would be divorced–after meeting with them for 15 minutes!) was recently debunked in a book by Laurie Abraham.
The question I raised was: how could someone who was evidently so intelligent and accomplished–Gottman, that is–get things so wrong? My brief conclusion was that once you have some success, I guess there’s not much of a motivation to change your ways. Also, I could well believe that, for all its flaws, Gottman’s work is better than much of the other research out there on marriages. There’s still the question of how this stuff gets published in scientific journals. I haven’t looked at Gottman’s articles in detail and so don’t really have thoughts on that one.
Anyway, I recently corresponded with a mathematician who had heard of Gottman’s research and wrote that he was surprised by what Abraham had found:
I [the mathematician] read one of his books a while ago — not about this predictive stuff — and found the level of mathematical sophistication quite high. I’m not quite sure what to think: to be honest, given what I know of Gottman, it’s really hard to imagine him making such a mistake! Anyway, it’s terrific math journalism that Abraham dug this up though.
I think she drops the ball, though, in the section you quote about false negatives and false positives. True, Gottman could mean a lot of things by “80% accuracy.” But if he’s trying to predict early divorce, an event with 16% prevalence, I think it’s safe to say he does NOT mean “My method assigns a prediction of divorce to 20% of couples that didn’t actually divorce” — presumably the TOTAL number of positives his method gives is somewhere around 16%. A much more natural interpretation of what he might mean by “80% accurate” is “the method gives the right answer in 80% of cases” — of course, it would be richer information to report the false positive and false negative rates separately.
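To make the base-rate point concrete (these numbers are illustrative, built from the 16% prevalence mentioned above, not from Gottman's actual data):

```python
# Base-rate check: with 16% of couples divorcing, the trivial rule
# "predict everyone stays married" is already 84% accurate.
# Illustrative numbers only, using the 16% prevalence cited above.

n_couples = 700
prevalence = 0.16

divorces = round(n_couples * prevalence)   # 112 couples divorce
stay_together = n_couples - divorces       # 588 do not

# Trivial classifier: always predict "no divorce".
correct = stay_together                    # right on every non-divorcing couple
accuracy = correct / n_couples

print(f"accuracy of the all-'stay married' rule: {accuracy:.0%}")  # 84%
```

So a bare "80% accurate" claim, with no false-positive/false-negative breakdown, doesn't by itself beat the do-nothing rule.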
I do think that Abraham was winging it with her numbers, but I can’t say I’ve heard anything positive about Gottman’s claims of 85% success etc. I haven’t looked into this in any detail and would be happy to be proved wrong on this, but right now it doesn’t look so good to me. If you know more on this, please keep me informed.
The mathematician then replied to me:
A psychologist colleague convinced me my initial reaction was too hard on Gottman. I looked at the actual papers a bit; it’s a bit hard for me to make out exactly what he did (but I think this is just because I don’t ever read psychometric stuff). But here’s my general sense of things, which I think places Gottman in a better light without anything Abraham says being factually wrong. (But again, the following is my very casual understanding of what happens in Gottman’s papers, and could be inaccurate.)
So in the first paper (where the 80% figure comes from) he videotapes the couples and measures lots of kinds of interactions. He takes some smallish set of measured variables x_1, …, x_n which represent one hypothesis about predictors of divorce, and then another set y_1, …, y_n which represent a competing hypothesis (the one he favors). He finds the optimal linear combination of x_1, …, x_n for predicting divorce, and similarly for y_1, …, y_n. (I guess this amounts to something like the usual problem where you have a bunch of red dots and a bunch of green dots in R^n and you want to find the best separating hyperplane.) And he finds that y_1, …, y_n do much better.
He doesn’t report rates of false positives and false negatives separately, but he does report correlation between his linear combination of y_1, … y_n and (I think) the Bernoulli variable of divorce/not divorce, and gets something highly positive (with x_1, … x_n, it’s just slightly positive.) So that’s not consistent with, say, his model always predicting no divorce, or predicting that a randomly chosen 16% of the couples will divorce, which would give correlations of 0. It IS consistent, I think, with numbers like Abraham’s, in which half of the couples he predicts will divorce actually stay together. But as I said, I don’t think this is an unimpressive result!
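A quick simulation (synthetic numbers, not Gottman's data) shows why a clearly positive correlation rules out those degenerate predictors: a rule that flags a random 16% of couples correlates essentially zero with the divorce indicator, while any score carrying real signal correlates positively:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
divorce = (rng.random(n) < 0.16).astype(float)   # 16% of couples divorce

# Predictor 1: flag a random 16% of couples, ignoring the outcome entirely.
random_flag = (rng.random(n) < 0.16).astype(float)

# Predictor 2: a score that actually tracks the outcome (signal plus noise).
informative = divorce + rng.normal(0.0, 1.0, n)

r_random = np.corrcoef(random_flag, divorce)[0, 1]
r_informative = np.corrcoef(informative, divorce)[0, 1]

print(f"random 16% flag vs. divorce: r = {r_random:+.2f}")       # near 0
print(f"informative score:           r = {r_informative:+.2f}")  # clearly positive
```

The second correlation here is the point-biserial correlation between a continuous score and a Bernoulli outcome, which I take to be roughly what Gottman reports.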
Anyway, in his later papers, it looks like he carries out the same kind of analysis on different data sets. I think what Abraham is complaining about is that he recomputes the coefficients of y_1, …, y_n in each paper. But I don’t think this really counts as remaking the model each time to fit the data; he has committed to this small set of variables, and replicates the finding that these variables explain more of the variation in some variable of interest than do competing sets of measurements. That’s legitimate, right? (That’s an authentic question; I’m a non-statistician.) Of course, it would be weird if the coefficients were completely different each time, but one presumes he checks that. I’m just saying that it doesn’t seem to be methodologically necessary that he commit himself after his first paper to a specific linear combination of y_1, …, y_n which is to be his divorce prediction variable for all time.
1. A relevant question is: if you have a binary variable, and n other measurements, presumably you should expect to be able to find SOME linear combination of y_1 … y_n that has a healthy positive correlation with the binary variable. But how positive a correlation does one expect to find, for a given n, under a null hypothesis that the y_i are actually independent from the binary variable? That question seems relevant (and I get the sense from what Gottman writes that he knows the answer to this question, but I don’t. Part of the problem I guess is that his y_i are presumably correlated with each other and I guess that affects the answer.)
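For what it's worth, the question in point 1 can be explored by simulation. Under the null, the in-sample R^2 of a least-squares fit on p noise predictors has expectation roughly p/(N−1), so the fitted correlation is automatically positive and grows with the number of predictors even with zero real signal. (The sample size N = 130 below is an arbitrary assumption for illustration, not Gottman's; predictors that are correlated with each other would act like a smaller effective p, which is the complication raised at the end of the point.)

```python
import numpy as np

# How big a correlation do you get "for free" by optimizing over p
# predictors that are, by construction, independent of the outcome?
rng = np.random.default_rng(1)
N = 130                                       # assumed sample size, for illustration
y = (rng.random(N) < 0.16).astype(float)      # binary outcome, 16% prevalence

rs = {}
for p in (3, 10, 25):
    # p pure-noise predictors plus an intercept column
    X = np.column_stack([np.ones(N), rng.normal(size=(N, p))])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rs[p] = np.corrcoef(X @ beta, y)[0, 1]    # in-sample multiple correlation
    print(f"p = {p:2d} noise predictors -> in-sample r = {rs[p]:.2f}")
```

The in-sample correlation of the fitted combination with y is never negative and climbs steadily with p, which is exactly why the null-distribution question matters before calling any particular correlation "highly positive."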
2. I guess my sense after looking at this stuff is that there’s really nothing wrong with Gottman’s published work, but that there is something wrong with Gladwell’s description of it, and to be honest, probably something wrong with Gottman’s own non-academic sales pitch for his own results. I think the criticism would be much muted if he said some form of “these variables explain x% of the variance” instead of using the loaded word “predict” — but my colleague tells me the word “predict” is standard usage for this, at least among psychologists. I think the correct statistical critique is just the very basic one that the claim as written is empty: that if you want to predict whether a couple will divorce or stay together with 80% accuracy, you can just predict every couple will stay together. But I think Abraham goes too far when she says that Gottman’s work is somehow in conflict with the scientific method.
To which I replied:
There’s a kind of innumeracy you sometimes see with mathematicians, where they don’t connect numbers directly to real-world outcomes.
To a statistician, it’s obvious that you can predict divorces to high accuracy over a 3-yr period by just saying that everyone will stay married, and those numbers such as 80% or 93% will raise suspicion. But to a certain kind of mathematician, these are numbers with no real-world context. Recall this quote from Gottman’s collaborator, James Murray:
“The forecast of who would get divorced in his study of 700 couples over 12 years was 100 per cent correct, he said. But “what reduced the accuracy of our predictions was those couples who we thought would stay married and unhappy actually ended up getting divorced”.”
To this guy, you can do better than 100%!
To put it another way: Gottman’s work could very well be useful even if he can’t predict divorces at all; he could be developing good techniques for marriage counselors. After all, most counselors don’t try to predict at all, and we don’t consider that to be a problem. But, at the very least, Gottman doesn’t seem to mind the praise he’s received for this work.
I can’t really say more without actually reading the scientific publications. . . .