
mysterious shiny things

(Disclaimer: I’m new to Shiny, and to blog posts, but I know something about geography.) In the Shiny gallery, take a look at 2001 versus 2002. Something funny happens to Switzerland (and other European countries): in terms of the legend, it moves from Europe to the Middle East. Also, the legend color scheme switches.





To reproduce it yourself: download ui.R, server.R, and healthexp.Rds

Have a folder called “App-Health-Exp” in your working directory, with ui.R and server.R in the “App-Health-Exp” folder. Have the dataset healthexp.Rds in your working directory.
Then run this code:

if (!require(devtools)) install.packages("devtools")
library(shiny)
runApp("App-Health-Exp")

data = readRDS("healthexp.Rds")

# Problem isn't the data, it seems that Switzerland is in Europe 
# in both 2001 and 2002:
data[data$Year == 2001 & data$Country == "Switzerland",]
data[data$Year == 2002 & data$Country == "Switzerland",]


Anyone know what is happening?

Bayesian Cognitive Modeling Examples Ported to Stan

Bayesian Cognitive Modeling book cover

There’s a new intro to Bayes in town.

This book’s a wonderful introduction to applied Bayesian modeling. But don’t take my word for it — you can download and read the first two parts of the book (hundreds of pages including the bibliography) for free from the book’s home page (linked in the citation above). One of my favorite parts of the book is the collection of interesting and instructive example models coded in BUGS and JAGS (also available from the home page). As a computer scientist, I prefer reading code to narrative!

In both spirit and form, the book’s similar to Lunn, Jackson, Best, Thomas, and Spiegelhalter’s BUGS Book, which wraps their seminal set of example models up in textbook form. It’s also similar in spirit to Kruschke’s Doing Bayesian Data Analysis, especially in its focus on applied cognitive psychology examples.

Bayesian Cognitive Modeling Examples Now in Stan!

One of Lee and Wagenmakers’s colleagues, Martin Šmíra, has been porting the example models to Stan, and the first batch is already available in the new Stan example model repository (hosted on GitHub).

Many of the models involve discrete parameters in the BUGS formulation, which need to be marginalized out in the Stan models. The Stan 2.5 manual adds a whole new chapter with some non-trivial marginalizations (change-point models, CJS mark-recapture models, and categorical diagnostic-accuracy models).
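Marginalizing a discrete parameter just means summing it out of the joint density, done on the log scale with log-sum-exp. Here is a minimal sketch of the idea (a toy two-component normal mixture of my own invention, not one of the book’s models):

```python
import math

def log_normal_pdf(y, mu, sigma):
    """Log density of Normal(mu, sigma) evaluated at y."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (y - mu) ** 2 / (2 * sigma ** 2)

def log_sum_exp(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def log_marginal(y, theta, mu0, mu1, sigma):
    """BUGS/JAGS can sample the indicator z ~ Bernoulli(theta) directly;
    Stan instead sums z out of the joint density:
    p(y) = theta * Normal(y | mu1, sigma) + (1 - theta) * Normal(y | mu0, sigma)."""
    return log_sum_exp(
        math.log(theta) + log_normal_pdf(y, mu1, sigma),
        math.log(1.0 - theta) + log_normal_pdf(y, mu0, sigma),
    )
```

In a Stan program the same sum shows up as an explicit `log_sum_exp()` in the model block (later Stan versions also provide `log_mix()` for the two-component case).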

Expect the rest soon! And feel free to jump on the Stan users group to discuss the models and how they’ve been coded.

Warning: The models are embedded as strings in R code. We’re looking for a volunteer to pull the models out of the R code and generate data for them in a standalone file that could be used in PyStan or CmdStan.

Your Models Next?

If you’d like to contribute Stan models to our example repo, the README at the bottom of the front page of the GitHub repository linked above contains information on what we’d like to get. We only need open-source distribution rights — authors retain copyright for all their work on Stan. Contact us either via e-mail or via the Stan users group.

One-tailed or two-tailed


This image of a two-tailed lizard (from here; I can’t find the name of the person who took the picture) never fails to amuse me.

But let us get to the question at hand . . .

Richard Rasiej writes:

I’m currently teaching a summer session course in Elementary Statistics. The text that I was given to use is Triola’s Elementary Statistics, 12th ed.

Let me quote a problem on inference from two proportions:

11. Is Echinacea Effective for Colds? Rhinoviruses typically cause common colds. In a test of the effectiveness of echinacea, 40 of the 45 subjects treated with echinacea developed rhinovirus infections. In a placebo group, 88 of the 103 subjects developed rhinovirus infections (based on data from “An Evaluation of Echinacea Angustifolia in Experimental Rhinovirus Infections,” by Turner et al., New England Journal of Medicine, Vol. 353, No. 4). We want to use a 0.05 significance level to test the claim that echinacea has an effect on rhinovirus infection.

The answer in the back of the teacher’s edition sets up the hypothesis test as H0: p1 = p2, H1: p1 ≠ p2, gives a test statistic of z = 0.57, uses critical values of ±1.96, and gives a P-value of .5686.

I was having a hard time explaining the rationale for the book’s approach to my students. My thinking was that since there is no point in claiming that echinacea has an effect on the common cold unless you think it helps, we should be doing a one-tailed test with H0: p1 = p2, H1: p1 < p2. We would still fail to reject the null hypothesis, but with a P-value of .2843.

Or, is what I am missing that, if you are testing the claim that something has an effect you want to also test the possibility that the effect is the opposite of what you’d normally want (e.g. this herb is bad for you, or inhaling smoke is good for you, etc.)?

Any advice you could give me on how best to parse this problem for my students would be greatly appreciated. I already feel very nervous stating, in effect, “well, that’s not the way I would do it.”

My reply:

The quick answer is that maybe echinacea is bad for you! Really though the example is pretty silly, as one can simply compare 40/45 and 88/103 and look at the sampling variability of the proportions. I don’t see that the hypothesis test and p-value add anything.
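For concreteness, both the direct comparison of proportions and the textbook numbers quoted above can be reproduced in a few lines (a sketch; the pooled z test is the standard textbook procedure):

```python
import math

# Data from the echinacea problem quoted above
x1, n1 = 40, 45     # treated with echinacea: infections / subjects
x2, n2 = 88, 103    # placebo group
p1, p2 = x1 / n1, x2 / n2

# Sampling variability of each proportion and of their difference
se1 = math.sqrt(p1 * (1 - p1) / n1)
se2 = math.sqrt(p2 * (1 - p2) / n2)
se_diff = math.sqrt(se1 ** 2 + se2 ** 2)

# The textbook's pooled two-proportion z statistic
pooled = (x1 + x2) / (n1 + n2)
z = (p1 - p2) / math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))

# Normal CDF via the error function
Phi = lambda t: 0.5 * (1 + math.erf(t / math.sqrt(2)))
p_two = 2 * (1 - Phi(abs(z)))   # two-tailed P-value, about 0.57
p_one = 1 - Phi(abs(z))         # one-tailed P-value, half the two-tailed value
```

The z of roughly 0.57 and the two-tailed P-value near 0.57 match the teacher’s edition, and halving gives the one-tailed value Rasiej mentions. Meanwhile the difference in proportions, about 0.035, is well under one standard error (`se_diff` is about 0.058), which is the direct comparison suggested above.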

This doesn’t sound like much, but, amazingly enough, Rasiej replied later that day:

I guess I was led astray by the lead-in to the problem, which seemed to imply that there was a benefit. Obviously it’s better to read the claim carefully and take it literally. So, “test the claim that echinacea has an effect” is two-tailed since ANY effect, beneficial or not, would be significant.

That said, I do agree with you that the example is silly, given the data in the problem.

Thanks again for your insights. They helped in my class today.

Perhaps (maybe I should say “probably”) he was just being polite, but I prefer to think that even a brief reply can convey some useful understanding. Also I think it’s a good general message to take what people say literally. This is not a message that David Brooks likes to hear, I think, but it is, to me, an essential aspect of statistical thinking.

P.S. Perhaps I should stress that in my response above I wasn’t saying that confidence intervals are some kind of wonderful automatic replacement for p-values. I was just saying that, in this particular case, it seems to me that you’d want a summary of the information provided by the experiment, and that this summary is best provided by the estimated proportions and their standard errors. To set it up in a p-value context would seem to imply that you’re planning on making a decision about echinacea based on this single experiment, but that wouldn’t make sense at all! No need to jump the gun and go all the way to a decision statement; it seems enough to just summarize the information in the data.

“It’s as if you went into a bathroom in a bar and saw a guy pissing on his shoes, and instead of thinking he has some problem with his aim, you suppose he has a positive utility for getting his shoes wet”

The notion of a geocentric universe has come under criticism from Copernican astronomy. . . .


A couple months ago in a discussion of differences between econometrics and statistics, I alluded to the well-known fact that everyday uncertainty aversion can’t be explained by a declining marginal utility of money.

What really bothers me—it’s been bothering me for decades now—is that this is a simple fact that “everybody knows” (indeed, in comments some people asked why I was making such a big deal about this triviality), but, even so, it remains standard practice within economics to use this declining-marginal-utility explanation.

I don’t have any econ textbooks handy but here’s something from the Wikipedia entry for risk aversion:

Risk aversion is the reluctance of a person to accept a bargain with an uncertain payoff rather than another bargain with a more certain, but possibly lower, expected payoff.

OK so far. And now for their example:

A person is given the choice between two scenarios, one with a guaranteed payoff and one without. In the guaranteed scenario, the person receives $50. In the uncertain scenario, a coin is flipped to decide whether the person receives $100 or nothing. The expected payoff for both scenarios is $50, meaning that an individual who was insensitive to risk would not care whether they took the guaranteed payment or the gamble. However, individuals may have different risk attitudes. A person is said to be:

risk-averse (or risk-avoiding) – if he or she would accept a certain payment (certainty equivalent) of less than $50 (for example, $40), rather than taking the gamble and possibly receiving nothing. . . .

They follow up by defining risk aversion in terms of the utility of money:

The expected utility of the above bet (with a 50% chance of receiving 100 and a 50% chance of receiving 0) is E[u] = 0.5 × u(100) + 0.5 × u(0), and if the person has a utility function with u(0) = 0, u(40) = 5, and u(100) = 10, then the expected utility of the bet equals 5, which is the same as the known utility of the amount 40. Hence the certainty equivalent is 40.

But this is just wrong. It’s not mathematically wrong but it’s wrong in any practical sense, in that a utility function that curves this way between 0 and 100 can’t possibly make any real-world sense.
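To see how little curvature any sensible utility function can have over a $100 range, take log utility of total wealth as a stand-in for any reasonable concave utility (the $20,000 wealth level here is my own arbitrary choice) and compute the certainty equivalent of the same bet:

```python
import math

wealth = 20_000.0  # assumed baseline wealth; any realistic figure tells the same story

# u = log(total wealth); expected utility of the 50/50 bet on $0 vs. $100
expected_u = 0.5 * math.log(wealth + 0) + 0.5 * math.log(wealth + 100)

# Certainty equivalent: the sure gain whose utility equals the bet's expected utility
ce = math.exp(expected_u) - wealth  # about $49.94
```

That is a risk premium of about six cents, not the ten dollars in the Wikipedia example. Getting the certainty equivalent down to $40 requires curvature over a $100 interval that, extended to larger stakes, produces absurd behavior.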

Way down on the page there’s one paragraph saying that this model has “come under criticism from behavioral economics.”

But this completely misses the point!

It would be as if you went to the Wikipedia entry on planetary orbits and saw a long and involved discussion of the Ptolemaic model, with much discussion of the modern theory of epicycles (image above from Wikipedia, taken from the Astronomy article in the first edition of the Encyclopaedia Britannica), and then, way down on the page, a paragraph saying something like,

The notion of a geocentric universe has come under criticism from Copernican astronomy.

Again, this is frustrating because it’s so simple, it’s so obvious that any utility function that curves so much between 0 and 100 can’t keep going forward in any reasonable sense.

It’s an example I used to give as a class-participation activity in my undergraduate decision analysis class and which I wrote up a few years later in an article on classroom demonstrations.

I’m not claiming any special originality for this result. As I wrote in my recent post,

The general principle has been well-known forever, I’m sure.

Indeed, unbeknownst to me, Matt Rabin published a paper a couple years later with a more formal treatment of the same topic, and I don’t recall ever talking with him about the problem (nor was it covered in Mr. Cutlip’s economics class in 11th grade), so I assume he figured it out on his own. (It would be hard for me to imagine someone thinking hard about curving utility functions and not realizing they can’t explain everyday risk aversion.)

In response, commenter Megan agreed with me on the substance but wrote:

I am sure it has NOT been well-known forever. It’s only been known for 26 years and no one really understands it yet.

I’m pretty sure the Swedish philosopher who proved the mathematical phenomenon 10 years before you and 12 years before Matt Rabin was the first to identify it. The Hansson (1988)/Gelman (1998)/Rabin (2000) paradox is up there with Ellsberg (1961), Samuelson (1963) and Allais (1953).

Not so obvious after all?

Megan’s comment got me thinking: maybe this problem with using a nonlinear utility function for money is not so inherently obvious. Sure, it was obvious to me in 1992 or so when I was teaching decision analysis, but I was a product of my time. Had I taught the course in 1983, maybe the idea wouldn’t have come to me at all.

Let me retrace my thoughts, as best as I can now recall them. What I’d really like is a copy of my lecture notes from 1992 or 1994 or whenever it was that I first used the example, to see how it came up. But I can’t locate these notes right now. As I recall, I taught the first part of my decision analysis class using standard utility theory, first having students solve basic expected-monetary-value optimization problems and then going through the derivation of the utility function given the utility axioms. Then I talked about violations of the axioms and went on from there.

It was a fun course and I taught it several times, at Berkeley and at Columbia. Actually, the first time I taught the subject it was something of an accident. Berkeley had an undergraduate course on Bayesian statistics that David Blackwell had formerly taught. He had retired so they asked me to teach it. But I wasn’t comfortable teaching Bayesian statistics at the undergraduate level—this was before Stan and it seemed to me it would take the students all semester just to get up to speed on the math, with no time to do anything interesting—so I decided to teach decision analysis instead, using the same course number. One particular year I remember—I think it was 1994—we had a really fun bunch of undergrad stat majors, and a whole bunch of them were in the course. A truly charming bunch of students.

Anyway, when designing the course I read through a bunch of textbooks on decision analysis, and the nonlinear utility function for money always came up as the first step beyond “expected monetary value.” After that came utility of multidimensional assets (the famous example of the value of a washer and a dryer, compared to two washers or two dryers), but the nonlinear utility for money, used sometimes to define risk aversion, came first.

But the authors of many of these books were also aware of the Kahneman, Slovic, and Tversky revolution. There was a ferment, but it still seemed like utility theory was tweakable and that the “heuristics and biases” research merely reflected a difficulty in measuring the relevant subjective probabilities and utilities. It was only a few years later that a book came out with the beautifully on-target title, “The Construction of Preference.”

Anyway, here’s the point. Maybe the problem with utility theory in this context was obvious to Hansson, and to me, and to Yitzhak, because we’d been primed by reading the work by Kahneman, Slovic, Tversky, and others exploring the failures of the utility model in practice. In retrospect, that work too should not have been a surprise—after all, utility theory was at that time already a half-century old and it had been developed in the behavioristic tradition of psychology, predating the cognitive revolution of the 1950s.

I can’t really say, but it does seem that sometimes the time is ripe for an idea, and maybe this particular idea only seemed so trivial to me because it was already accepted that utility theory had problems modeling preferences. Once you accept the empirical problem, it’s not so hard to imagine there’s a theoretical problem too.

And, make no doubt about it, the problem is both empirical and theoretical. You don’t need any experimental data at all to see the problem here:

[Two screenshots illustrating the argument.]

Also, let me emphasize that the solution to the problem is not to say that people’s preferences are correct and so the utility model is wrong. Rather, in this example I find utility theory to be useful in demonstrating why the sort of everyday risk aversion exhibited by typical students (and survey respondents) does not make financial sense. Utility theory is an excellent normative model here.

Which is why it seems particularly silly to be defining these preferences in terms of a nonlinear utility curve that could never be.

It’s as if you went into a bathroom in a bar and saw a guy pissing on his shoes, and instead of thinking he has some problem with his aim, you suppose he has a positive utility for getting his shoes wet.

Suspiciously vague graph purporting to show “percentage of slaves or serfs in the world”

Phillip Middleton sent this along, it’s from Peter Diamandis, who is best known for his X Prize, the “global leader in the creation of incentivized prize competitions.” Diamandis wrote:

Phillip Middleton,

Is technology making you work harder? Or giving you more time off?

Seriously, it feels like it’s enabling me to work around the clock! Heck, I’m writing this email at 37,000 feet on a Virgin America flight from DC to LA at 11 p.m. ET.

So that being said, I want to share the actual DATA with you about Work vs. Leisure. . . .

It’s easy to forget that for centuries — for millennia — the “workforce” was ALL of us.

A few people lived in luxury, but the vast majority were slaves and serfs who did the work. In 1750, 75 percent of people on the planet worked to support the top 25 percent.

Let’s look at the numbers. It’s extraordinary how this has changed over time.


You’ll notice that by 2000, the global percentage of slaves and serfs in the world is down to 10 percent. As artificial intelligence and robotics come online, this number is going to drop down to zero.

Hey, if only artificial intelligence and robotics had existed in 1863, then Lincoln could’ve freed the—whaaaaa? What’s with that graph, anyway? Let’s look at the data, indeed. That curve looks suspiciously smooth!

Where did “the numbers” come from? The source says “Simon, pp. 171-177” but that’s not quite enough information. Luckily, we make rapid progress via Google. A search on “percentage of slaves or serfs in the world” takes us to this 2001 book by Stephen Moore and Julian Simon and the following quote:

A larger percentage of the world’s inhabitants are freer than ever before in history. Economic historian Stanley Engerman has noted that as recently as the late 18th century, “The bulk of mankind, over 95 percent, were miserable slaves or [sic] despotic tyrants.” . . . The figure shows the decline of slavery from 1750 through the end of the 20th century.

[Moore and Simon’s graph showing the decline of slavery from 1750 to 2000.]

This one’s kinda weird because they put 1917 exactly halfway between 1750 and 2000, which isn’t quite right. It’s almost like they just drew a curve freehand through some made-up numbers! Also a bit odd is that Moore and Simon’s curve is not consistent with their own citation: in their text, they say the proportion of slaves in the late 18th century was 95%, but in the graph it’s around 70%.

The next step, I suppose, is to track down “Simon, pp. 171-77; and authors’ calculations.” But I’m getting tired. Maybe someone else could follow this up for me?

In summary, the graph looks bogus to me. Some of these tech zillionaires seem to have no B.S. filter at all! Perhaps to be successful in that area it helps to be a bit credulous?

P.S. From comments below it seems clear that this graph has been created from a few nonexistent data points. It’s pretty horrible that Diamandis labeled this as “actual DATA.” I guess that’s just further confirmation that when people shout in ALL CAPS, they don’t know what they’re talking about!

My talk at the Simons Foundation this Wed 5pm

Anti-Abortion Democrats, Jimmy Carter Republicans, and the Missing Leap Day Babies: Living with Uncertainty but Still Learning

To learn about the human world, we should accept uncertainty and embrace variation. We illustrate this concept with various examples from our recent research (the above examples are with Yair Ghitza and Aki Vehtari) and discuss more generally how statistical methods can help or hinder the scientific process.

My talk with David Schiminovich this Wed noon: “The Birth of the Universe and the Fate of the Earth: One Trillion UV Photons Meet Stan”

This talk will have two parts. (1) Astronomy professor David Schiminovich will discuss the ways in which recent large-scale sky surveys that include billions of data points can address questions such as, What will happen to the Earth and other planets when the Sun becomes a white dwarf? (2) Statistics professor Andrew Gelman will discuss some open research questions involved with Stan, an open-source C++ program that performs Bayesian inference using state-of-the-art methods in statistics and computing. Schiminovich and Gelman will discuss how we plan to develop scalable computing ideas in Stan to fit big models to big data in astronomy. This is research suitable for statistics Ph.D. theses, and we are looking for one or more Ph.D. students in statistics to work on this and related projects on scalable modeling and computing.

Wed 10 Sept 2014, 12-1pm in the Statistics Department large seminar room (Social Work Bldg room 903, Columbia University).

On deck this week

Mon: My talk with David Schiminovich this Wed noon: “The Birth of the Universe and the Fate of the Earth: One Trillion UV Photons Meet Stan”

Tues: Suspiciously vague graph purporting to show “percentage of slaves or serfs in the world”

Wed: “It’s as if you went into a bathroom in a bar and saw a guy pissing on his shoes, and instead of thinking he has some problem with his aim, you suppose he has a positive utility for getting his shoes wet”

Thurs: One-tailed or two-tailed

Fri: What is the purpose of a poem?

Sat: He just ordered a translation from Diederik Stapel

Sun: Six quotes from Kaiser Fung

Likelihood from quantiles?

Michael McLaughlin writes:

Many observers, esp. engineers, have a tendency to record their observations as {quantile, CDF} pairs, e.g.,

x     CDF(x)
3.2   0.26
4.7   0.39


I suspect that their intent is to do some kind of “least-squares” analysis by computing theoretical CDFs from a model, e.g. Gamma(a, b), then regressing the observed CDFs against the theoretical quantiles, iterating the model parameters to minimize something, perhaps the K-S statistic.

I was wondering whether standard MCMC methods would be invalidated if the likelihood factor were constructed using CDFs instead of PDFs (or probability mass). That is, the likelihood would be the product of F(x) values instead of the derivative, f(x). My intuition tells me that it shouldn’t matter since the result is still a product of probabilities but the apparent lack of literature examples gives me pause.

My reply: I don’t know enough about this sort of problem to give you a real answer, but in general the likelihood is the probability distribution of the data (given parameters), hence in setting up the likelihood you want to get a sense of what the measurements actually are. Is that “3.2” measured with error, or are you concerned with variation across different machines or whatever? Once you know this, maybe you can model the measurements directly.
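One way to see the difficulty with a product of CDF values: unlike density values, CDF values can all be driven toward 1 at once, so the resulting “likelihood” has no interior optimum. A toy sketch, using an exponential model (my own choice, purely for illustration) and the two recorded quantiles from the question:

```python
import math

xs = [3.2, 4.7]  # the recorded quantiles from the question above

def cdf_product(lam):
    """Product of exponential CDF values F(x) = 1 - exp(-lam * x)."""
    prod = 1.0
    for x in xs:
        prod *= 1.0 - math.exp(-lam * x)
    return prod

def log_lik(lam):
    """Correct log-likelihood from the density f(x) = lam * exp(-lam * x)."""
    return sum(math.log(lam) - lam * x for x in xs)

# The CDF "likelihood" just keeps growing as lam increases (every F(x) -> 1),
# so maximizing it drives the rate parameter to infinity: it is degenerate.
increasing = cdf_product(0.1) < cdf_product(1.0) < cdf_product(10.0)

# The density-based likelihood, by contrast, peaks at the sensible
# maximum-likelihood estimate lam = 1 / mean(xs).
lam_hat = 1.0 / (sum(xs) / len(xs))
```

So a product of F(x) values is not a product of probabilities of the data in the sense a likelihood requires, which is consistent with the advice to model the measurements directly.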

Some time in the past 200 years the neighborhood has changed

“In that pleasant district of Merry England which is watered by the river Don, there extended in ancient times a large forest, covering the greater part of the beautiful hills and valleys which lie between Sheffield and the pleasant town of Doncaster.  The remains of this extensive wood are still to be seen at the noble seats of Wentworth, of Wharncliffe Park, and around Rotherham.”