Priors I don’t believe

Posted on May 6, 2014 9:10 AM by Andrew

Biostatistician Jeff Leek writes:

Think about this headline: “Hospital checklist cut infections, saved lives.” I [Leek] am a pretty skeptical person, so I’m a little surprised that a checklist could really save lives. I say the odds of this being true are 1 in 4.

I’m actually surprised that he’s surprised, since over the years I’ve heard about the benefits of checklists in various arenas, including hospital care. In particular, there was this article by Atul Gawande from a few years back. I mean, sure, I could imagine that checklists might hurt: after all, it takes some time and effort to put together the checklist and to use it, and perhaps the very existence of the checklist could give hospital staff a false feeling of security, which would ultimately cost lives. But my first guess would be that people still don’t do enough checklisting, and that the probability is greater than 1/4 that a checklist in a hospital will save lives.

Later on, Leek writes:

Let’s try another headline: “How using Facebook could increase your risk of cancer.” Without looking at the study, I’d probably think “no way.” To my mind, the odds that this is right may be something like 1 in 10.

Whoa. He’s saying that his prior probability of this happening is as high as 1/10? That’s only 1/2.5 his prior probability that a checklist will save lives.

Here we can see one of the problems with subjective priors. It’s hard to get the scale right. I’m reminded of what George Orwell wrote about book reviewing: if you review 10 books a week, and if your scale is such that Hamlet is a good play and Great Expectations is a good read, how do you calibrate all the material of varying quality that you are sent to review? The only answer is that books are reviewed relative to expectations, and you can’t say that the latest bestseller is crap just cos it doesn’t live up to the standards of Shakespeare.

Similarly, I have a feeling that Leek is setting his priors relative to expectations. In his first example, sure, we have a general belief that checklists are important, but Leek compresses his scale by invoking a general skepticism. So, instead of saying that checklists probably work, he dials down his probability to 1/4. Now to the second example. Of course using Facebook does not give you cancer. But we can’t just set the probability to 0. Indeed, even thinking about the question implies that the probability is nonzero, and then we get to thinking: hmm, you use Facebook and you stay indoors more, then maybe you don’t get enough exercise or not enough vitamin C . . . ok, maybe it’s possible. And that gets you to the probability of 1/10.

But this 1/10 is not on the same scale as the earlier 1/4. The 1/4 referred to the probability that a checklist really works to save lives, whereas the 1/10 is the probability that there’s something, somewhere associated with Facebook use that is also associated with cancer risk in some small way (as there’s no realistic way this effect could be large).

This illustrates a general problem, not just with priors and Bayesian statistics but with scientific measurement in general. It’s hard to talk about probabilities, or any other numerical measurements, without some definition of what is being measured. (As regular readers of this blog know, similar problems arise with p-values.)

I respect that Leek was writing a general-interest article on a news website and so he had to simplify. My point is not to pick on him but rather to bring some attention to the general problems of probability assignment. It’s easy to say that we think a treatment works or doesn’t work, or that a certain pattern is real or not, but when we start to assign probabilities, I think we need to think more carefully about what are the events we are referring to.

36 thoughts on “Priors I don’t believe”

brendan on May 6, 2014 9:35 AM at 9:35 am said:

So, said differently, to talk coherently about the impact of any thing you need a clear alternative thing in mind.

The impact of checklist use is measured relative to not using a checklist- simple. X or not X.

Whereas Facebook use is compared to…what? Twitter use? Gaming? Playing outdoors? A weighted average basket of time spent by people doing other stuff.

So the issue is that “Not X” isn’t defined in the Facebook case. That right?

- Andrew on May 6, 2014 10:23 AM at 10:23 am said:
  
  Brendan:
  
  What you say is part of the story. But the other part is that “increase your risk of cancer” is not defined quantitatively. I can only assume that “increase” would only count if it is larger than some threshold (for example, an increase of 10^-6 in lifetime cancer risk would not count), but that threshold has not been defined. By saying this, I’m not trying to be a pain in the ass; I just don’t see how it makes sense to assign a probability number to an event that is so vaguely defined.
  
Quartz on May 6, 2014 9:39 AM at 9:39 am said:

Amazing: “…Facebook use that is also associated with cancer risk in some small way (as there’s no realistic way this effect could be large).” Such size is a very strong assumption (and therefore a loose prior), how would we ever “know” that? You *wish* it, that’s different. (For one I do not discount the association possibility -having collected quite some material on the field-, and could add a few other causal paths.)

I guess it’s easy to have similar attitude towards things like http://science-beta.slashdot.org/story/14/04/28/1823207/male-scent-molecules-may-be-compromising-biomedical-research
“OMG, we can’t allow this to be true!” (regardless of actual study quality).

Anyway, what if the FB association to cancer risk was merely to anticipate it in an otherwise “predestined” subject? When would you call it significant? Days? Weeks? Months? (the “otherwise” is indeed tricky to specify, but let’s skip it for now…).
Only after that choice one shall ponder how high is risk for each one of us “someday”.. and draw conclusions. I bet the result would then differ.

- Andrew on May 6, 2014 10:20 AM at 10:20 am said:
  
  Quartz:
  
  1. Regarding your first paragraph: No, I was not writing about what I “wish” but rather what I think is true. The point of my post is that if you want to be quantitative about prior probabilities, you should be quantitative about what you are assigning probabilities to. If you want to say Pr(A) = 1/4, then it’s a good idea to give the event “A” a quantitative definition.
  
  2. Regarding your last paragraph: Yes, as new information comes in, the probabilities will change. That’s how it’s supposed to go!
  
  - Quartz on May 7, 2014 5:17 AM at 5:17 am said:
    
    Sorry if I did not comment on the main point and went so much OT. I totally agree with the goal of the post… (and that’s why there was little to add, as is usually the case) ..but found the minor digression on cognitive biases also worth a look, for those interested in the field. It’s revealing how even bayesians can have such a strong tendency towards informative priors. In that context the last part dealt with the same information leading to different conclusions (=priors), depending on how the issue was framed.
    
    I’ll try to stick to stats, and this post also suggested some developments to me.
    
- Daniel Lakeland on May 6, 2014 10:56 AM at 10:56 am said:
  
  “large” might be something like 1/3 per year, like maybe last-ditch efforts to clean up a leaking radioactive power plant. We KNOW that FB-cancer risk is not “large” because otherwise we’d be walking through corpses in the streets.
  
  All this is to fully agree with Andrew, and to once again emphasize something I’ve been saying for a long time. Every model, including statistical regression models etc, should be specified in dimensionless form, where every quantity is actually a ratio of the actual quantity to a carefully chosen reference level.
  
  Sometimes just the act of deciding on the reference levels is the most important part of the analysis. Also, in doing this you can eliminate extraneous coefficients and simplify your models.
  
  - Richard McElreath on May 6, 2014 3:26 PM at 3:26 pm said:
    
    Daniel: I’ve been trying to come up with a compelling and easy teaching example of exactly that, dimensionless regression analysis. Do you have some favorite examples?
    
    - Daniel Lakeland on May 6, 2014 9:37 PM at 9:37 pm said:
      
      I don’t have one offhand, but anything that is a “good” regression example with at least 2 or more covariates, could probably be usefully turned into a good dimensionless regression example by doing the dimensional analysis on the whole thing. Showing how that’s done and how it moves uncertainty around is worthwhile.
      
      I think timeseries are probably a good place to look as well. I’ve seen models in which what looks like uncertainty in a coefficient can be re-interpreted by defining the coefficient to be 1, and considering uncertainty in the duration of the experiment in dimensionless time. That can be a very useful way of looking at things.
    - dab on May 7, 2014 8:01 PM at 8:01 pm said:
      
      Would you consider this paper to be a good example of what you’re talking about?
      
      http://homepages.mcs.vuw.ac.nz/~vignaux/docs/maxent.pdf
    - Daniel Lakeland on May 8, 2014 4:48 PM at 4:48 pm said:
      
      Looking at the abstract and first few paragraphs, this article is definitely talking about the same thing. The author is emphasizing the variable reduction / symmetry analysis (Buckingham Pi theorem) aspect of the problem. Towards the end there’s some stuff about the fact that you can choose a variety of different dimensionless groups, which you can construct by taking products and quotients of the groups. In general I would say constructing the dimensionless groups should be done as PART of the modeling procedure, not as a mechanical procedure from which you then build the model.
      
      I’m going to plug an educational article by one of the mildly frequent commenters here David Hogg http://arxiv.org/abs/physics/0412107
      
      as well as a decent book by Mahajan (Hogg’s coauthor) http://mitpress.mit.edu/sites/default/files/titles/content/9780262514293_Creative_Commons_Edition.pdf
    - Daniel Lakeland on May 8, 2014 4:54 PM at 4:54 pm said:
      
      The fact that this has traditionally been used in the realm of physics and physicsy applied-math doesn’t mean it’s useless outside those fields. But it is *most* useful when you are measuring physical quantities and therefore dealing with fundamental dimensions where the results are invariant to rescaling. Things like “on a scale of 1 to 10 how happy are you today” are not fundamental physical quantities and the measurements are not invariant to rescaling. (if you ask “on a scale of 1 to 100 how happy are you today” or “on a scale of 1-5 how happy are you today” you won’t necessarily get rescaled versions of the first response. Also, how these scales are interpreted will vary from population to population. Not so true with things like “how many pounds/grams/kg/oz of raw sugar did you consume last month?”)
    - Daniel Gotthardt on May 8, 2014 5:22 PM at 5:22 pm said:
      
      I really do not like the “how happy are you today”-version which introduces unnecessary amounts of noise. I also generally prefer questions asking for satisfaction instead of happiness. Regardless, I understand why rescaling is more difficult with such quantities.
      
      Street-Fighting mathematics sounds like an awesome book. Thank you for the link!
  - jrc on May 6, 2014 7:16 PM at 7:16 pm said:
    
    Daniel,
    
    I think I agree with you, but I want to make sure. So two quick examples, and maybe you can re-configure them to match your idea of a dimensionless regression with coefficients that can be interpreted as “relative to some group”.
    
    1 – Consider a covariate X that is distributed N(0, sigma), and Y = constant + Beta*X + noise, for some Beta that isn’t 0. Does the intercept here, by de-meaning the data, pass your test? Or what would you have to do?
    
    2 – Suppose a randomized controlled trial with binary treatment T and outcome test score Y measured in units of standard deviations of the control group. We regress Y on a constant and a treatment dummy. We interpret our coefficient on T as the marginal effect of treatment – which is to say, the treatment effect is how many SDs higher the mean outcome score is in the treatment group _relative_ to the control group.
    
    In my mind, all of these regression coefficients are “relative” to some group and that group is clearly defined – either those with X=0 or those with T=0. I am sure we could be more clear about this in our writing, but I think there is something substantive I’m missing in your idea. Or is it just about getting the units right – by choosing X dist N(0,sigma) (in the first example) and using the control group distribution of scores as a standardization (the second example) I am tacitly embracing your position?
    
    - Daniel Lakeland on May 6, 2014 9:20 PM at 9:20 pm said:
      
      Consider case 1, Y has some dimensions, we can choose a scale factor to create a “scaled Y”, for example we could express Y relative to its average value:
      
      Y/C = 1 + Beta/C X + noise_2
      
      It may seem that this is meaninglessly the same. But consider that X has some units as well. Over what range does X vary? Suppose that there’s some scale X_0 so that X/X_0 is O(1) for almost all achievable values of X. Rewrite the equation:
      
      Y/C = 1 + Beta*X_0/C (X/X_0) + noise_2.
      
      Now Beta * X_0/C must be dimensionless and is called a “dimensionless group” in most applications in physics (examples include the Reynolds number, the Mach number, the Péclet number, etc.).
      
      The new noise is dimensionless as well, but its scale can be interpreted in terms of “as a fraction of C”. Note that we’ve intentionally chosen X_0 such that X/X_0 is “about 1”, so the size of Beta*X_0/C is determining how important X is in the overall outcome.
      
      Your model now reads “Scaled Y is about 1 plus a correction of approximate size Beta*X_0/C plus some noise of size S (where S is sigma(noise)/C)”.
      
      Consider case 2, this is the same model as above, but you’ve already made your model dimensionless. In fact instead you’ve done:
      
      Y/S = C/S + Beta * T + noise(sigma == 1) + noise2(sigma == ??) T
      
      where T is already O(1) because it’s either 0 or 1, Beta is already dimensionless, and S is the standard deviation of the control group. So when T=0 the noise will have a stddev DEFINED to be 1, instead of the constant offset being defined to be 1 as in the first model. If the treatment group is noisier, then the size of sigma(noise2) automatically tells you how much noisier it is than control, by definition.
      
      Playing around with the scales of the various variables can greatly improve the theoretical interpretability of a model. When things are scaled to be “about the size 1” or to vary over “a range of about 1” you can immediately see by the size of the coefficients whether the effects are small, medium, or large. Any coefficient whose size is much smaller than 1 is “small”, anything about 1 is “medium” (as big as other important things you’ve already scaled to be 1) and anything much larger than 1 is probably surprising.
      
      In certain cases where things like the scales for X_0 or sigma are already approximately known, just doing this analysis can immediately tell you a lot about what you should expect from the data before you even see data.
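      
      For readers who want the algebra of case 1 in one place, the rescaling steps above can be written out as follows. This is only a restatement of the comment's own equations; the symbols \varepsilon (for the original noise) and \Pi (a label for the dimensionless group) are introduced here for illustration and are not in the original.
      
      ```latex
      \begin{align*}
      Y &= C + \beta X + \varepsilon \\
      \frac{Y}{C} &= 1 + \frac{\beta}{C}\, X + \frac{\varepsilon}{C} \\
      \frac{Y}{C} &= 1 + \underbrace{\frac{\beta X_0}{C}}_{\Pi}\,\frac{X}{X_0} + \frac{\varepsilon}{C}
      \end{align*}
      ```
      
      Here \varepsilon / C plays the role of noise_2, and its standard deviation is the S = sigma(noise)/C mentioned above. Because X/X_0 is chosen to be of order 1, the size of \Pi alone indicates how much X matters relative to the baseline.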
    - Daniel Lakeland on May 6, 2014 9:31 PM at 9:31 pm said:
      
      This “trick” does more than just make your regression coefficients have easily interpreted magnitudes. In some cases by carefully defining scale factors, you can eliminate the need to estimate a coefficient at all. By defining your units in such a way that some unimportant coefficient is 1 by definition, you can then wind up with statements like:
      
      There exists some unknown units in which we could measure Foo such that the average value of Foo is 1, and the effect of the rescaled Bar and Baz are 1.2 +- .1 and 2.4 +- .2 and so therefore Baz has the larger effect… which maybe is all you wanted to know.
      
      This whole scheme also really helps when assigning priors. It’s a lot harder to assign a prior to say the absolute mass of some toxin in the diet than it is to say the relative amount of that toxin compared to some other known non-toxic trace element for example the quantity of lead in the diet as a fraction of the quantity of calcium. I have no idea how many milligrams of calcium there are in a typical diet, but I am pretty damn sure that almost every single person on the face of the earth consumes less than 1/10 as much lead as calcium.
    - jrc on May 7, 2014 2:40 PM at 2:40 pm said:
      
      Thanks Daniel. That was really helpful for me. I’ll have to do some more thinking about it (in particular, sometimes I really want coefficients with units – but those could be backed out later I guess), but I do see the parsimonious beauty of the procedure and the response was really clearly laid out. I’ll add this thinking to my toolkit – much appreciated.
    - Daniel Lakeland on May 7, 2014 2:49 PM at 2:49 pm said:
      
      No problem. Glad you liked it, and YES when you actually want to report units, you can always multiply your dimensionless quantity by the scale factor expressed in whatever units you like.
      
      Hoping to see you down here in SoCal soon. We can talk models, dimensionless analysis, and dynamics and so forth in person.
    - Daniel Gotthardt on May 7, 2014 3:12 PM at 3:12 pm said:
      
      I’d like to thank you, too. My methodological upbringing biased me against standardizing “because the original scale is more meaningful” but your arguments for dimensionless forms are quite convincing (and the “counterargument” never sounded very reasonable). I hope I will find some time to think more about it but thanks for pointing in that direction.
    - Daniel Lakeland on May 7, 2014 5:31 PM at 5:31 pm said:
      
      One of the major reasons to prefer dimensionless forms is that the units tell you nothing, they are arbitrary and usually defined in such a way as to be reproducible (the meter, the second, Kelvin etc) whereas expressing things in “natural units” which are “natural to the problem” puts everything into a perspective built FOR the problem.
      
      Finally, because everything you find out about the world needs to be true independent of the units you use, there is a symmetry which can be exploited to reduce the dimensionality of many problems. The classic example of this is the drag coefficient. If you think of the drag on an object as a function of, say, its shape, its velocity, the density of the fluid, the viscosity of the fluid, etc., then it seems like the drag is a fairly complicated function. But in truth, by exploiting symmetry we find out that drag is a function of the shape of the object and the Reynolds number (a dimensionless group). So for a given object, we only need to collect data about a range of Reynolds numbers, a SINGLE parameter.
      
      This same kind of thing IS exploitable in statistical regression, but as mentioned above I don’t have a great example offhand. I should probably think up one and stick it on my blog. If I do, I’ll try to get Andrew to link to it.
    - Daniel Gotthardt on May 7, 2014 6:59 PM at 6:59 pm said:
      
      (This is a reply to #comment-161099 from Daniel Lakeland)
      
      I’d be very interested to read that! While I do think that I can understand the reasoning for models in physics (and I remember some discussion about it in a “mathematical modeling” course I took), I wonder if and how far it might be a useful idea for Social Sciences, too. There’s always the question what actually constitutes a strong effect. Let’s say Y is a subjective evaluation of one’s own “satisfaction with life” on a scale from 1 to 10. This is *the* primary dependent variable in the “Quality of Life”-Research on which dozens if not hundreds of regression analyses with all kinds of more or less well reasoned models have been done. The question is also part of most of the Social Surveys I know.
      
      Sometimes there is some kind of intuition about what constitutes a strong or medium effect but mostly people are just discussing signs and significance. If you’re lucky, articles will mention when an effect is statistically significant but practically irrelevant. If you’re really lucky effect sizes will be discussed relatively to some meaningful differences of the independent variables and so there is some – intuitive – idea of relative effect sizes between different independent variables in a model.
      
      In a project I’m working on currently the dependent variables are different kinds of “moral attitudes”, also measured on a 1-10 scale and I’m really struggling to think about what constitutes “practically relevant” effects, especially as other research as far as I know just ignores this question! But I also think that this might constitute a problem for dimensionless forms, too. For example moral attitudes toward legality and honesty do not vary a lot between persons (or countries) but moral attitudes toward sexual issues vary much more. I do think that this is a quite meaningful difference and I don’t want to lose that information if I model both variables in separate regressions, so standardizing Y does *not* seem to be a good idea for me but maybe other kinds of dimensional reduction (?) might still be a good idea.
      
      Anyway, dimensionless forms seem to be a possible way to help with this quite dissatisfying way things are but I fear there won’t be many scholars who would want to try to understand this. Quality of Life research is usually done by the more method-savvy crowd and even there I don’t think many would be interested. Reading how many blatantly wrong uses of models and interpretations are published in Sociology makes me skeptical that more advanced and more formal-mathematical approaches will practically help the Social Sciences. I’m still very interested if you should write something about the dimensionless forms in your blog!
    - Daniel Lakeland on May 8, 2014 1:14 AM at 1:14 am said:
      
      Daniel Gotthardt:
      
      If you’d like to discuss the use of dimensionless analysis in particular social science contexts I would be willing to use your “quality of life” type examples.
      
      One thing to note though, there is no such unit-symmetry in 1-10 scales for social “quality”. Unlike measurements in inches vs meters, the results of the measurement WILL depend on the choice of scale and how it’s interpreted by the questioner. So in that type of question, the symmetry analysis may be less compelling. However, if you are interested in something like dependence on age or on years since some life event, or on how tall a person is, etc. Those things DO still have the unit-independence symmetry.
      
      In any case, I think it’s very useful to think about the variation in “moral attitudes” about sex on a scale that is determined by some naturally near-constant thing like say honesty. The variability in honesty, or the average of honesty and legality or whatever provides a scale for “typical” variations. A regression in which the variation among countries on these factors is constrained to be 1 by rescaling will allow you to determine via a *useful* yardstick whether say sex variation is practically large (>1) , typical ~ 1, or small (< 1).
      
      This is the essence of rescaling, to create a yardstick that measures in units that are relevant to the problem at hand.
  - Quartz on May 7, 2014 5:53 AM at 5:53 am said:
    
    Sorry Andrew, but just to close this… Daniel: the comment also dealt with time delay of effects. Agreed that if FB worked like a bullet (both immediate and “large” after your labeling, for a case labeled “enormous” in my scale) we’d have noticed already. But that’s not how usually our body works, and such somehow “indirect”* psychosomatic effects in particular. We might as well be walking through corpses in 30+ years. Or just consider “large” something heavily affecting the 1000s, not necessarily the millions.
    
    The dimensionless form is an important point, thanks for the detailing.
    
    *For one kids in western cities nowadays have more limited motoric abilities, which has a plethora of physical consequences (and cognitive too, since feedback loops are being broken): digestive, hormonal&metabolic, muscular, sensoric…
    
BenK on May 6, 2014 9:58 AM at 9:58 am said:

As someone participating in the Good Judgment Project – an exercise in priors, if there ever was one – I’ve been thinking for the past three or four years about scaling expectations. I’m starting to get better at it.

I believe that we need to use log scales to express our expectations. The 1-100% scale used in the GJP is simply not enough dynamic range. There is almost no case where a prediction of between 20-80% is a useful expression of the prior.
Instead, if it isn’t simply a coin toss, then one starts to express priors as 10%, 1%, 0.1%, etc. Perhaps a great predictor can use base 5 instead of 10.
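
A minimal sketch of the log-scale idea above, assuming that "log scale" is read as base-10 log-odds (the comment does not say which transformation is meant). The function names are hypothetical and only for illustration.

```python
import math


def log10_odds(p: float) -> float:
    """Base-10 log-odds of a probability strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log10(p / (1.0 - p))


def from_log10_odds(x: float) -> float:
    """Inverse transform: probability corresponding to base-10 log-odds x."""
    odds = 10.0 ** x
    return odds / (1.0 + odds)


# Compare the middle of the percentage scale with the tail the comment cares about.
for p in (0.001, 0.01, 0.1, 0.2, 0.5, 0.8, 0.9):
    print(f"p = {p:<6} log10 odds = {log10_odds(p):+.2f}")
```

On this scale the whole 20% to 80% band spans only about 1.2 units, while 0.1% to 10% spans about 2, which is one way to see why the percentage scale offers little dynamic range for rare events.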

As an aside – checklists are more valuable than almost anyone thinks because people are highly distractible. Without a doubt there is some saturation density at which checklists stop helping because nothing else gets done but the checklists. Avoiding that extreme is not a problem for hospitals at this point. If someone went into a situation that had measurable medical outcomes at stake and introduced a checklist where there had not been one, my prior would be 90% certainty that the checklist would improve outcomes over the course of a month or year. If you want to talk about this more, the person to read is Paul Levy at Not Running a Hospital.

Anonymous on May 6, 2014 10:20 AM at 10:20 am said:

Andrew:

Are you saying all priors ought to be internally consistent with one another? I see the beauty of that but in practice we humans work with a very coarse scale, in increments of 0.1 say. This includes a lot of error.

For example 1/10 could be anything from 0 to .15 say. Accordingly the prior on checklists could be many times more likely than the one on Facebook.

Put differently, the ideal ratio is included in the coarsened set. This ought to provide _some_ solace.
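
The coarsening argument above can be sketched with simple interval arithmetic. The Facebook range of 0 to 0.15 comes from the comment; the checklist range of 0.20 to 0.30 is an assumption made here purely for illustration, and `ratio_bounds` is a hypothetical helper.

```python
def ratio_bounds(num, den):
    """Bounds on num/den when both are nonnegative intervals given as (low, high)."""
    (n_lo, n_hi), (d_lo, d_hi) = num, den
    if n_lo < 0 or d_lo < 0 or n_lo > n_hi or d_lo > d_hi:
        raise ValueError("intervals must be nonnegative and ordered (low, high)")
    if d_hi == 0:
        raise ValueError("denominator interval must contain a positive value")
    lower = n_lo / d_hi
    # If the denominator can be 0, the ratio is unbounded above.
    upper = n_hi / d_lo if d_lo > 0 else float("inf")
    return lower, upper


facebook = (0.0, 0.15)    # range suggested in the comment for a stated "1/10"
checklist = (0.20, 0.30)  # assumed range for a stated "1/4" (illustrative only)

low, high = ratio_bounds(checklist, facebook)
print(f"checklist prior / Facebook prior lies between {low:.2f} and {high}")
```

Under these assumed ranges the nominal ratio of 2.5 is only one point in an interval that runs from about 1.33 upward without bound, so the stated numbers are compatible with the checklist prior being many times larger.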

Phil on May 6, 2014 11:56 AM at 11:56 am said:

The first example — do checklists save lives — is a great example of the scale problem. Taken literally, saying there’s only a 1/4 chance that checklists save lives means there’s a 3/4 chance that they cost lives. (There’s obviously no way the effect could be exactly 0.00000000000000000…). As you (Andrew) point out, they _could_ cost lives, by taking time and attention away from patient care, but saying there’s a 3/4 chance that they’re bad…there’s just no way. But once you modify the question to be: “do they save a substantial number of lives” or “do they save enough lives to be worthwhile” then that measurement scale becomes intertwined with the rest of the question. If I ask ‘if every hospital used checklists, what’s the chance they would save at least N lives per year’, you can get a very wide range of answers by varying N. Which is why I think that when discussing continuous outcomes (or nearly continuous ones) like the number or fraction of lives saved, it is a mistake to start with probabilities of discrete outcomes and try to get a prior that way. Embrace the continuousness of it and draw a prior distribution on the substantively relevant parameters. It can be a somewhat sobering experience, inasmuch as one is forced to recognize one’s ignorance, or at least to grapple with it. This issue has been discussed previously on this blog, probably more than once, but the time that comes to mind is in a discussion of climate sensitivity to greenhouse gases.
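
To illustrate how the answer depends on N, here is a small sketch that puts a continuous prior on lives saved per year and reads off the probability of exceeding various thresholds. The normal form and its mean and standard deviation are made-up placeholders, not estimates from any study.

```python
import math


def prob_exceeds(threshold: float, mean: float, sd: float) -> float:
    """P(effect > threshold) when effect ~ Normal(mean, sd)."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    z = (threshold - mean) / sd
    # Upper tail of the standard normal, written with the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Hypothetical prior on lives saved per year (illustrative numbers only).
PRIOR_MEAN = 1000.0
PRIOR_SD = 1500.0

for n in (0, 500, 1000, 2000, 5000):
    p = prob_exceeds(n, PRIOR_MEAN, PRIOR_SD)
    print(f"P(saves more than {n:>4} lives/year) = {p:.3f}")
```

With a single prior distribution, "do checklists save lives" can get a probability anywhere from well above 1/2 to near 0 depending on the threshold, which is the scale problem in miniature.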

Nick on May 6, 2014 12:34 PM at 12:34 pm said:

There’s a very simple reason why we won’t all be using Bayesian statistics any time soon: it would require us to drop the pretence that we’re doing neutral science in an objective way, and admit in public that we all have our pet theories. Of course, everyone knows this is true today, but (especially in psychology) we skirt around it with a whole variety of defence mechanisms that would have astonished Freud. I can imagine someone being snubbed at conferences by Blargle (and Blargle’s supporters) because they only assigned Blargle’s Theory of Blerg a prior of 0.2 in a recent article, compared to Klargle’s (obviously useless) Theory of Klerg, to which they assigned a prior of 0.4 last year, in an obvious example of favoritism based on alma mater/funding agency/whatever.

Wagenmakers et al. came out and said it when they replied to Bem (2011), giving a prior of .00000000000000000001 (give or take a zero). But of course, by the time you get to that kind of numbers, diplomatic relations are badly broken (although they may at least have been replaced by a form of mutual understanding, cf. the US and the Soviet Union) and nobody’s really trying.

- konrad on May 6, 2014 2:09 PM at 2:09 pm said:
  
  That would be a reasonable critique of _subjective_ Bayesianism, which is a point of view supported by roughly zero followers of this blog. In fact, opposition to it is a major theme here.
  
  If you don’t know what I’m referring to, this might be a suitable starting point: http://statmodeling.stat.columbia.edu/2013/02/07/philosophy-and-the-practice-of-bayesian-statistics-with-discussion/
  
Jeff Leek on May 6, 2014 2:05 PM at 2:05 pm said:

You are late to the criticism on this post Andrew :-). Here are a few of my favorites:

https://ksj.mit.edu/tracker/2014/03/nate-silvers-new-fivethirtyeight-dishes
http://www.statschat.org.nz/2014/03/18/your-gut-instinct-needs-a-balanced-diet/
https://twitter.com/hildabast/status/445699741830365184

You are right that this was a dramatically simplified version of the approach for a general audience with space constraints. That being said, I think I learned the hard way the problem of subjective priors :-).

- Andrew on May 6, 2014 2:41 PM at 2:41 pm said:
  
  Jeff:
  
  I wrote the post awhile ago, did not see those other comments. We’re on a 1 or 2 month delay here! Also, just in case it wasn’t clear in my post above, I like the idea of assigning numerical priors, as it’s a way of forcing us to come face to face with our assumptions.
  
Brad Stiritz on May 6, 2014 6:19 PM at 6:19 pm said:

Having heard numerous accounts over the years of hospital error & lax standards at lower-tier medical centers (the “crap hospitals” as one top MD in Chicago told me), I would think a well-researched & -designed medical-care checklist has more like a 3/4 probability of significantly reducing mortality / morbidity.

Think about probably one of the most rigorously checklist-driven activities in the world — piloting a commercial passenger aircraft. What were the selection pressures leading to obsessive checklisting? What’s the historical trend in air travel fatalities due to pilot error? Any likely connection there? ;)

One thing I learned from “Thinking, Fast and Slow” is that humans often have trouble estimating probabilities. My teenage son got in the habit of spouting off subjective probabilities about this & that, sometimes spectacularly & provably off-the-mark. We decided to start calibrating our judgments by estimating & then measuring our success rate at making trash-basket throws with a mini ball, from a relatively close distance. Simple exercises like this can help put cognitive skills into humbling perspective!
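
A calibration exercise like the one above could be scored along these lines. The sketch uses the Brier score, which is one common choice and not necessarily what was done here; the stated probability and the throw outcomes are invented for illustration.

```python
def brier_score(forecasts, outcomes):
    """Mean squared difference between stated probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes) or not forecasts:
        raise ValueError("need equally long, non-empty sequences")
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)


# Hypothetical session: "80% sure" before each of 10 throws, 5 of which go in.
stated = [0.8] * 10
made = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]

hit_rate = sum(made) / len(made)
print(f"stated probability: 0.80, observed hit rate: {hit_rate:.2f}")
print(f"Brier score of the stated forecasts:  {brier_score(stated, made):.3f}")
print(f"Brier score of forecasting hit rate: {brier_score([hit_rate] * len(made), made):.3f}")
```

Lower is better for the Brier score, so the gap between the two scores gives a rough measure of how much the overconfident guess cost.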

Erin Jonaitis on May 6, 2014 8:18 PM at 8:18 pm said:

I am fascinated by this scale problem! I wonder whether psychophysical methods could be brought to bear somehow. I seem to remember that being one of the most successful areas of experimental psychology; hard to think of any other results that get referred to as “Laws”… And on that note, log scales (suggested by BenK above) are really interesting to me in that they are both really “natural” (in that sensory systems seem to be multiplicative) but perhaps also difficult to educate people to use.

I feel like the statistical learning people and the risk perception people have probably thought about this issue as well (how to elicit and/or interpret probabilistic knowledge from people).

- Andrew on May 6, 2014 8:59 PM at 8:59 pm said:
  
  Erin:
  
  Oddly enough, we were just talking about this problem in our research meeting a few hours ago, in the context of setting up prior distributions for regression models to estimate population cells to use in MRP. I think this is a wide-open area and I anticipate that the first few papers that demonstrate a way to do this well, should and will be highly influential.
  
Chris G on May 10, 2014 6:46 AM at 6:46 am said:

> My point is not to pick on him but rather to bring some attention to the general problems of probability assignment. It’s easy to say that we think a treatment works or doesn’t work, or that a certain pattern is real or not, but when we start to assign probabilities, I think we need to think more carefully about what are the events we are referring to.

Agreed. Leek’s non-technical description of Bayes’ rule bugged me: “Final opinion on headline = (initial gut feeling) * (study support for headline)”. I think I’d have written something like “educated guess” rather than “gut feeling”. Not all gut feelings are of equal merit. One can have gut feelings about technical matters where they have some experience and are reasonably well informed. (I’d call that an educated guess.) Conversely, one can have gut feelings re matters where they are thoroughly uninformed. (See, for example, the global warming discussion in Andrew’s previous post. Wow.) Is it appropriate to give those gut feelings equal weight?

With that in mind, we talk about uninformative priors. Does anyone refer to “anti-informative” priors? i.e. priors which assign non-zero weight to beliefs that are demonstrably false.
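
For reference, the schematic "final opinion = (initial gut feeling) * (study support)" quoted above corresponds to the odds form of Bayes' rule: posterior odds equal prior odds times a likelihood ratio. The sketch below treats the 1/4 and 1/10 figures as probabilities, as the post does, and uses a likelihood ratio of 3 that is purely hypothetical.

```python
def posterior_probability(prior: float, likelihood_ratio: float) -> float:
    """Update a prior probability with a likelihood ratio via the odds form of Bayes' rule."""
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    if likelihood_ratio < 0:
        raise ValueError("likelihood_ratio must be nonnegative")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


STUDY_SUPPORT = 3.0  # hypothetical likelihood ratio, not taken from any study

for label, prior in (("checklist", 0.25), ("Facebook", 0.10)):
    post = posterior_probability(prior, STUDY_SUPPORT)
    print(f"{label}: prior {prior:.2f} -> posterior {post:.2f}")
```

The multiplication happens on the odds scale, not the probability scale, and the output is only as meaningful as the event that the prior was assigned to.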

- Andrew on May 10, 2014 7:30 AM at 7:30 am said:
  
  Chris:
  
  Yes, well put on the “gut feelings” thing. The prior is part of the model and should not be based on “gut feelings” any more than the likelihood is.
  
Pingback: ScienceSeeker Editor’s Selections May 11 – 17, 2014 | ScienceSeeker Blog
Pingback: Dimensionless analysis as applied to swimming! « Statistical Modeling, Causal Inference, and Social Science
