“Statistics textbooks (including mine) are part of the problem, I think, in that we just set out ‘theta’ as a parameter to be estimated, without much reflection on the meaning of ‘theta’ in the real world.”

Carol Nickerson pointed me to a new article by Arie Kruglanski, Marina Chernikova, and Katarzyna Jasko entitled, “Social psychology circa 2016: A field on steroids.”

I wrote:

1. I have no idea what the title of the article is supposed to mean. Are they saying that they’re using performance-enhancing drugs?

2. I noticed this from the above article: “Consider the ‘power posing’ effects (Carney, Cuddy, & Yap, 2010; Carney, Cuddy, & Yap, 2015) or the ‘facial feedback’ effects (Strack, Martin, & Stepper, 1988), both of which recently came under criticism on grounds of non-replicability. We happen to believe that these effects could be quite real rather than made up, albeit detectable only under some narrowly circumscribed conditions. Our beliefs derive from what (we believe) is the core psychological mechanism mediating these phenomena.”

This seems naive to me. If we want to talk about what we “happen to believe” about “the core psychological mechanism,” I’ll register my belief: if you were to do an experiment on the “crouching cobra” position and explain how powerful people hold their energy in reserve, and if you had all the researcher degrees of freedom available to Carney, Cuddy, and Yap, you’d have no problem demonstrating that a crouched “cobra pose” is associated with powerfulness, while an open pose is associated with weakness.

One could also argue that the power pose results arose entirely from facial feedback. Were the experimenters controlling for their own facial expressions or the facial expressions of the people in the experiment? I think not.

Nickerson replied:

The effects *could* be real (albeit small). But this hasn’t been demonstrated in any credible way.

I didn’t understand the “on steroids” bit, either. That idiom usually means “in a stronger, more powerful, or exaggerated way.” I’d agree that the results of much social psychology research seem exaggerated (if not just plain wrong), but I don’t think that is what they meant.

And I followed up by saying:

Yes, the effects could be real. But they could just as well be real in the opposite direction (that the so-called power pose makes things worse). Realistically I think the effects will depend so strongly on context (including expectations) that I doubt that “the effect” or “the average effect” of power pose has any real meaning. Statistics textbooks (including mine) are part of the problem, I think, in that we just set out “theta” as a parameter to be estimated, without much reflection on the meaning of “theta” in the real world.
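To make that concrete, here’s a quick simulation sketch (a toy example of the “no average effect” worry; all numbers made up): when effects vary strongly across contexts, each context’s effect can be large while the average across contexts sits near zero, so “the average effect” describes no setting in particular.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 50 contexts, each with its own "power pose" effect.
# Effects are large in magnitude but centered near zero across contexts.
context_effects = rng.normal(loc=0.0, scale=1.0, size=50)

print(f"average effect across contexts: {context_effects.mean():+.2f}")        # near zero
print(f"typical size of one context's effect: {np.abs(context_effects).mean():.2f}")  # not near zero
```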

This discussion is relevant too.

31 thoughts on ““Statistics textbooks (including mine) are part of the problem, I think, in that we just set out ‘theta’ as a parameter to be estimated, without much reflection on the meaning of ‘theta’ in the real world.””

  1. “Statistics textbooks (including mine) are part of the problem, I think, in that we just set out ‘theta’ as a parameter to be estimated, without much reflection on the meaning of ‘theta’ in the real world.”

    Yes, I think this is a big BIG problem, because when we’re talking about survey sampling from a finite population, theta is a well-defined thing, but there is no such animal when you’re talking about power poses or medical treatments or cm of predicted sea-level rise, or the effect of subsidized preschool on high-school graduation rates… or whatever. It’s just patently *NOT* a random-number-generation scenario.
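    A minimal sketch of the contrast (my example, not from the original comment): in finite-population sampling, theta exists prior to any model, and the randomness lives entirely in the sampling design.

    ```python
    import random

    # A fixed, enumerable population: theta is defined before any model is written.
    population = [12.1, 9.8, 14.3, 11.0, 10.5, 13.7, 9.2, 12.9]  # made-up values
    theta = sum(population) / len(population)  # the estimand is a concrete fact

    random.seed(1)
    sample = random.sample(population, k=4)    # the randomness is the sampling design
    estimate = sum(sample) / len(sample)
    print(theta, estimate)
    # No analogous enumerable population exists for "the effect of a power pose."
    ```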

  2. I haven’t read the article, but from the title alone I would have guessed it was a criticism of the methods of social psychology. P-hacking would be equivalent to juicing. Researchers who juice (p-hack) have an unfair advantage over researchers who don’t.

    Of course, though, if a researcher/athlete like Wansink/Barry Bonds doesn’t know they are p-hacking/juicing, they can’t be held accountable, and we should hold all of their former accomplishments in high regard.

    • Jordan:

      From the article: “Among the good news for social psychology is the considerable outreach by social psychologists to general audiences, and the penetration and impact of social psychological ideas on the popular culture. This is manifest in the increasing number of popular, so called trade books, op-eds in major newspapers (like The NY Times, The Guardian, or the Washington Post), media appearances (even commercials!), public speaking such as at TED conferences, where social psychologists give some of the most popular talks of all times, and participation in political campaigns and political advisory boards.”

      To be fair, they also point out some minuses: “The considerable publication stress combined with increased public attention to social psychological findings may have introduced another questionable trend into our scientific practices, namely, the tendency to privilege research that presents surprising, unusual, or hilarious findings rather than ones that make substantive contribution to knowledge. . . . the penchant for unobvious, difficult to predict findings seems to have remained ‘in our blood.’ Combined with the media thirst for sensationalism and our growing appetite for media attention (encouraged by major academic institutions!), this led to increased emphasis in the field on studies that claim ‘magical’ findings that would seem highly surprising and counterintuitive to intelligent lay readers.”

      I give the authors credit for recognizing both positives and negatives, but I’m unhappy that they don’t seem to consider that some of what they count as “positives” might actually be negatives.

      I do like this thing that they write: “some of the proposals described above, albeit with some tweaking, may actually remove the need to produce the unrealistic ‘picture perfect’ results that our major journals have been requiring so far. Specifically, we find promising the suggested publication of papers based on their theoretical rationale and their methodology rather than on their results. This might mean either pre-approval of research for publication on that basis or the result-blind review of submitted papers.”

      But I’m concerned that they don’t recognize how bad things are. All the preregistration in the world won’t help you if measurements are noisy, theory is weak, and effects are highly variable and context-dependent (which describes some of the examples of research discussed in that article).

    • P.S. I don’t think Barry Bonds and Brian Wansink are comparable. The consensus, I believe, is that Bonds even without juicing would have been one of the greatest baseball players of all time. Wansink’s more like that woman who cheated in the Boston Marathon and won by riding the subway part of the way.

      • I only used Barry Bonds because I think he claimed he didn’t know he was using performance-enhancing drugs. I think he claimed his trainer had him use some cream, but he didn’t know what it was.

        Similarly, Wansink has claimed he didn’t know what p-hacking is (Wansink also has engaged in a variety of other “illegal” practices – it would be like if Barry Bonds also corked his bat, stole signs, bet on games, etc.).

        • Jordan:

          You write, “it would be like if Barry Bonds also corked his bat, stole signs, bet on games, etc.” I would add: And if Bonds couldn’t’ve actually made contact with the damn baseball without cheating. And if it turned out that 761 or so of the home runs recorded by Bonds in the record book never actually cleared the fence in real life.

  3. I’ll be honest: I find it difficult to follow the tone of this paper.

    Toward a New Dawn in Social Psychology: We Shall Survive!

    In these days of “methodological crisis,” dark clouds gather on our horizon that threatens to disempower and belittle our field. Colleagues feel alarmed and discouraged by the merciless criticism to which social psychology (and some investigators personally) are subjected (e.g., Fiske, in press; Gelman, 2016; Simmons & Simonsohn, in press). Nonetheless, it is important to peer beyond the distressing current context and see the “silver lining” that the menacing clouds may contain.

    Followed by this …

    Furthermore, a renewed emphasis on theory and method, rather than predominantly on results, may sharpen our theoretical acumen and the attention paid to the theoretical and historical justification of our research. The recent “fury” about non-replication will, hopefully, subside, through the realization that what should replicate are the invariant psychological principles rather than sensationalized effects likely to manifest only under esoteric circumstances.

    Critiques leveled against our field might hopefully curb our enthusiasm about the quantity of our publications, citations, and so forth and refocus our attention on the quality and depth of our work. We may want to enrich our graduate training with courses and workshops on theory construction (Kruglanski & Higgins, 2004), thus encouraging students to develop substantive theoretical frameworks rather than simply carrying out “fun” or “cutesy” experiments with no theoretical basis.

  4. I, OTOH, took the title to indicate that the field was full of life and coming up with large numbers of great results.

    If we could get such opposite meanings out of the title, maybe it’s not the best title.

  5. I like to think of ‘parameters’ in a statistical model as arbitrary labellings of model instances. ‘Meaningful’ or ‘interest’ parameters are then defined via functions or functionals of theta, i.e., g(theta). The analyst has to think about, and justify, why g(theta) defines something of interest.

    Similarly for data y and interest statistics t(y): the former is not intrinsically meaningful (i.e., it ‘has no semantics’); the meaning comes from the definition of a feature of interest.
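    A small sketch of this framing (the model and the particular g are mine, purely illustrative): theta just labels a model instance; meaning enters only when the analyst declares and defends a function of interest g(theta).

    ```python
    import numpy as np

    # One instance from the model family y ~ Normal(theta[0] + theta[1] * x, 1):
    theta = np.array([0.2, 1.5])  # an arbitrary labelling -- no semantics yet

    def g(theta):
        """Interest parameter: expected change in y as x goes from 0 to 1.
        The analyst must justify why this contrast is the meaningful quantity."""
        return (theta[0] + theta[1] * 1.0) - (theta[0] + theta[1] * 0.0)

    def t(y):
        """Interest statistic: a declared feature of the raw data."""
        return np.median(y)

    print(g(theta))  # 1.5 -- meaningful only because we argued for this g
    ```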

      • I just mean something like:

        There are many ways of writing down a ‘syntactically valid’ statistical/probabilistic model connecting data and parameters, but no guarantee within statistics/probability theory as such that this model actually carries any ‘meaning’ (semantics) or connection to the ‘real’ world.

        As Andrew said:

        > [in] Statistics textbooks…we just set out ‘theta’ as a parameter to be estimated, without much reflection on the meaning of ‘theta’ in the real world.

        I think of statistics as a sort of syntax in need of something extra to relate it back to the real world (e.g. what some call ‘causal’ assumptions etc. Mud doesn’t cause rain and all that). The above is a rough version of how I think of the ‘something extra’, though there are many other, possibly better, ways.
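        For instance (my toy example, not part of the original comment): both regressions below are equally ‘syntactically valid’; nothing inside the statistics reveals which direction, if either, is causal.

        ```python
        import numpy as np

        rng = np.random.default_rng(0)
        rain = rng.exponential(scale=1.0, size=500)        # by construction, rain causes mud
        mud = 2.0 * rain + rng.normal(scale=0.3, size=500)

        # Both fits are legal syntax; the causal asymmetry is invisible to them.
        print(np.polyfit(rain, mud, deg=1))  # mud "explained by" rain
        print(np.polyfit(mud, rain, deg=1))  # rain "explained by" mud -- equally fittable
        ```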

        • Yes, I definitely see what you mean. And it’s not just in “statistics”: things like fluid mechanics or other mathematical models have this same issue… you can write down equations to solve, but there’s no real reason to think the equations necessarily mean anything unless you have “something else”, basically a justification.

          I feel like this is one of the things I like least about Frequentist statistics. The underlying theory is “everything is from some distribution”, which is kind of trivially true, or trivially impossible to disprove, but anyway meaningless in many cases.

        • I think we need to teach more about the act of modelling, including a critical attitude about how the model relates (or does not relate) to the modelled reality, and this holds equally for frequentist, Bayesian or whatever statistics.

        • Yes definitely.

          But a sort of corollary of my view above is that you don’t necessarily have to build the semantics into the statistical model in the first place. It can come later, or in some sense ‘emerge’ from what seems to be a purely syntactical model. I’m thinking neural nets and all that.

          One issue is that all statistical models are pretty poor physical models unless combined with concepts from the sciences like ODEs etc. DAGs are sort of useful but, for someone with a physical modelling background, strike me as pretty artificial (can you imagine an engineer or physicist writing structural causal models?).

          So, you either incorporate it via scientific models – which would require stats students to take differential equations etc – or you figure out how to extract some meaning a posteriori. The latter is probably more of a machine-learning-style approach, which has had _some_ successes, I suppose.

          But what I don’t want is stats students thinking linear regression models are the be all and end all of ‘modelling’.

        • What I don’t want is stats students (and researchers in fields using stats or other types of models) thinking that _any_ particular type of model is the be all and end all of “modeling.”

          One example that comes to mind is network analysis, where results depend very much on the sample at hand. It’s doubtful that you can obtain credible results for some problems (e.g., epidemiological) unless you include the entire population of nodes and links.
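          A toy check of that sampling-dependence point (my sketch; it assumes the networkx package, and the graph and statistic are purely illustrative): a statistic computed on the induced subgraph of sampled nodes can differ substantially from its full-population value.

          ```python
          import random
          import networkx as nx

          G = nx.barabasi_albert_graph(n=1000, m=3, seed=1)  # the full "population" network
          print(nx.average_clustering(G))                    # statistic over all nodes and links

          random.seed(1)
          nodes = random.sample(list(G.nodes), k=200)        # observe only 20% of the nodes
          H = G.subgraph(nodes)                              # sampled nodes and the links among them
          print(nx.average_clustering(H))                    # can differ substantially from the full value
          ```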

        • Yes definitely. But it seems to be a real issue that engineering and physics students usually learn a bit of regression and data analysis but statistics students are less likely to learn differential equations etc. So they don’t really seem to get a broad perspective on modelling by default.

          In this case how do you teach stats students what ‘theta’ represents? It strikes me that there are two approaches – purely empirical modelling/machine learning where a priori interpretation of theta is abandoned (but maybe added a posteriori) or more effort connecting it from the beginning with scientific modelling.

        • I agree that parameters should be clearly connected to a scientific model from the beginning. A posteriori interpretation might sometimes be OK — but only if by someone who really understands the situation being modeled; in general, just the idea of a posteriori interpretation makes me cringe — too much chance of making up “just so” stories.

        • > just the idea of a posteriori interpretation makes me cringe — too much chance of making up “just so” stories.

          Sure – but for this approach (the statistical or empirical model) I’m more saying you might have to give up interpreting theta entirely and only _maybe_ entertain a posteriori interpretation. This is what is done for, e.g., neural networks.

          You can still test predictions empirically, but the individual parameters themselves are fairly meaningless.

          Point being, for many empirical/statistical models there might simply be no good interpretation of theta. Statistics is in a kind of weird spot, neither going full-blown ‘theta is meaningless’ nor ‘theta should be based on sound scientific principles’.
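          A tiny illustration of ‘test predictions, not theta’ (my sketch, assuming scikit-learn is available): two nets fit to the same data end up with quite different weights, yet their empirically checkable predictions largely agree.

          ```python
          import numpy as np
          from sklearn.neural_network import MLPRegressor

          rng = np.random.default_rng(0)
          X = rng.uniform(-2, 2, size=(200, 1))
          y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

          # Same data, different random initializations -> different theta.
          nets = [MLPRegressor(hidden_layer_sizes=(20,), max_iter=5000,
                               random_state=s).fit(X, y) for s in (0, 1)]

          print(np.abs(nets[0].coefs_[0] - nets[1].coefs_[0]).mean())  # weights disagree
          X_new = np.linspace(-2, 2, 5).reshape(-1, 1)
          print(nets[0].predict(X_new))  # ...but the predictions roughly agree
          print(nets[1].predict(X_new))
          ```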

          Another thing that might be a bit unclear is that by ‘a posteriori’ I don’t really mean ‘post data’ but more ‘additional annotation or interpretation’: theta is meaningless, but g(theta) could supply an interpretation regardless of the data.

        • A nice quote from J. W. Gibbs (1902):

          It is in fact customary in the discussion of probabilities to describe anything which is imperfectly known as something taken at random from a great number of things which are completely described.

          Martha: But perhaps also pragmatics – the purposefulness of the representation, or the upshot of it, or precisely and completely what it is for something to mean something (which always remains open-ended).

          ojm: “reflection on the meaning of ‘theta’ in the real world.”
          I prefer to put that in terms of how the representation represents something in the real world, in some sense, for some interpreter. But then, I started out in semiotics rather than physics or math ;-)

        • Keith: agree with both points, I think. Probably what I am getting at is closer to pragmatics than semantics, in that it is explicitly context-dependent, but I don’t know enough linguistics/semiotics to use the terms precisely.

  6. This be on the test.

    They fuck you up in intro stat
    They may not mean to, but they do
    A bunch of old, outmoded tat
    With clicky apps to make it new

    But they were fucked up on their path
    (PROC STEPWISE, Kruskal-Wallis test)
    Where stat was just a branch of math
    No ‘theta’ better than the rest

    Man hands down mystery to man
    It tangles like a rotting vine
    Get through as quickly as you can
    Then learn yourself some stats on line.

    • Nice!

      Although, I think the last line needs some revision.

      Stats is so hard to teach because it’s awfully hard to grasp, and likely only a few really have grasped it – and then only in limited areas – and it is hard for those who don’t almost understand stats to have any sense of who those few might be…

      As Hinton (https://en.wikipedia.org/wiki/Geoffrey_Hinton) once said – “You have two choices: Learn statistics or make friends with a statistician. Don’t expect me to say in public which I found harder.”

      Now part of this might be because I took “stats on line” to be online courses and materials rather than a blog like this.

      “Then find insightful stats blogs to mine.”?

  7. Started reading BDA this week, and to your credit, you mention that conditioning on a hypothesis is an implicit feature of probability calculations. I don’t think much more emphasis could be put on this point without going off on a tangent.
