David Budescu writes,

We ran an experiment in which subjects made predictions about the future value of many stocks based on their past performance. More precisely, they were asked to estimate 7 quantiles of the distribution of each stock:

Q05, Q15, Q25, Q50, Q75, Q85, and Q95

I would like to estimate the mean and SD (or variance) of this distribution based on these quantiles subject to weak assumptions (symmetry and unimodality) but without assuming a particular distribution.

I know of some methods (e.g. Pearson & Tukey, Biometrika, 1965) that use only 3 of these quantiles (Q05, Q50, and Q95) but I hate not to use all the data I have collected.
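For reference, here is a minimal sketch of the three-point approach mentioned above, in its commonly quoted form: weights of 0.185, 0.63, and 0.185 on Q05, Q50, and Q95 for the mean, and the Q05-to-Q95 spread divided by 3.25 for the SD. The function name and the example prices are made up for illustration, and the SD rule is only the simple first-pass version, not any refinement of it.

```python
def pearson_tukey(q05, q50, q95):
    """Three-point approximation to the mean and SD from Q05, Q50, Q95.

    Uses the commonly quoted weights (0.185, 0.63, 0.185) for the mean
    and the simple first-pass divisor 3.25 for the SD.
    """
    mean = 0.63 * q50 + 0.185 * (q05 + q95)
    sd = (q95 - q05) / 3.25
    return mean, sd


# Hypothetical elicited quantiles for one stock (illustrative values only).
mean, sd = pearson_tukey(80.0, 100.0, 125.0)
print(f"mean = {mean:.3f}, sd = {sd:.3f}")
```

Note that this ignores Q15, Q25, Q75, and Q85 entirely, which is exactly the complaint.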

Does anyone know of a more general and flexible solution?

Any thoughts? Of course, some distribution would have to be assumed. Also, I wonder about assuming symmetry since the data would be there to reject the hypothesis of symmetry in some settings. Also, of course, I wonder whether the mean and sd are really what you want. Well, I can see the mean, since it’s $, but I’m not so sure that the sd is what’s wanted.

A colleague writes,

You can do this without assuming any distribution as follows: find the maximum entropy distribution subject to the constraints that the 7 quantiles of the distribution have to agree with the 7 quantiles specified. Once the maximum entropy distribution is found, simply compute its mean and variance. Finding the maximum entropy distribution subject to those constraints is not hard. In fact, I think it can be done analytically (although it is messy).

Joseph:

I'm trying to do max entropy in my head, so forgive me if I'm asking a stupid question.

Stock prices have a natural minimum of zero, so you would get a uniform distribution between 0 and Q05 (with total probability of 0.05, of course), uniform between Q05 and Q15, etc. Am I getting this right?

But at the end we only know that the interval between Q95 and infinity has 5% probability; how would you distribute this? I can imagine using ad hoc methods (some sort of exponential), but I don't know how to do it using maxent.

I thought about it some more and of course you are right. I can think of two ways of dealing with the problem. The first one is to use an upper limit. Stock prices are usually less than $400. So take an upper limit of $1000 and see if the final answers are sensitive to changes in it. The other suggestion would be to use the Kullback-Leibler distance with a reference distribution which decays fast enough at infinity to keep everything finite.
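The upper-limit idea can be sketched in a few lines. With a lower bound of 0 and an assumed upper bound, the maximum entropy distribution that matches the seven quantiles is uniform within each interval, so the mean and SD follow from the usual moments of a uniform distribution. The function name, the example quantiles, and the candidate upper limits below are all made up for illustration.

```python
PROBS = [0.05, 0.15, 0.25, 0.50, 0.75, 0.85, 0.95]


def piecewise_uniform_moments(quantiles, lower=0.0, upper=1000.0):
    """Mean and SD of the piecewise-uniform distribution through the quantiles.

    Assumes lower < Q05 < ... < Q95 < upper, so every interval has positive width.
    """
    edges = [lower] + list(quantiles) + [upper]
    cum = [0.0] + PROBS + [1.0]
    mean = 0.0
    second_moment = 0.0
    for a, b, p_lo, p_hi in zip(edges[:-1], edges[1:], cum[:-1], cum[1:]):
        mass = p_hi - p_lo
        mean += mass * (a + b) / 2.0                        # E[X] on [a, b]
        second_moment += mass * (a * a + a * b + b * b) / 3.0  # E[X^2] on [a, b]
    variance = second_moment - mean ** 2
    return mean, variance ** 0.5


# Hypothetical elicited quantiles; check sensitivity to the assumed upper limit.
example_q = [70.0, 82.0, 90.0, 100.0, 110.0, 118.0, 130.0]
for upper in (200.0, 400.0, 1000.0):
    m, s = piecewise_uniform_moments(example_q, upper=upper)
    print(f"upper = {upper:6.0f}: mean = {m:7.2f}, sd = {s:7.2f}")
```

Only 5% of the mass sits above Q95, but it is spread all the way to the upper limit, so the SD in particular can move a lot as that limit changes.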

Both of these methods will require setting an effective upper limit to the price of the stock. Maybe that is something you will have to consider no matter what method you use.

Hmmm, quantiles, non-negative values, single mode…. this sounds like a job for the Weibull distribution. It ain't symmetric, but it would come close. In a modified form, this is a great problem for my undergrads–thanks!
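One simple way to try the Weibull suggestion is to use the fact that its CDF, F(x) = 1 - exp(-(x/scale)^shape), becomes a straight line after a transformation: log(-log(1 - p)) = shape * log(x) - shape * log(scale). The sketch below fits that line to the seven quantiles by ordinary least squares. This is just one possible fitting choice, and the function name and example quantiles are made up for illustration.

```python
import math

PROBS = [0.05, 0.15, 0.25, 0.50, 0.75, 0.85, 0.95]


def fit_weibull(quantiles, probs=PROBS):
    """Least-squares Weibull fit to quantiles; returns (shape, scale, mean, sd).

    Assumes all quantiles are positive and not all equal.
    """
    xs = [math.log(q) for q in quantiles]
    ys = [math.log(-math.log(1.0 - p)) for p in probs]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    shape = sxy / sxx                          # slope of the fitted line
    scale = math.exp(x_bar - y_bar / shape)    # from intercept = -shape * log(scale)
    g1 = math.gamma(1.0 + 1.0 / shape)
    g2 = math.gamma(1.0 + 2.0 / shape)
    mean = scale * g1
    sd = scale * math.sqrt(g2 - g1 ** 2)
    return shape, scale, mean, sd


# Hypothetical elicited quantiles for one stock (illustrative values only).
example_q = [70.0, 82.0, 90.0, 100.0, 110.0, 118.0, 130.0]
shape, scale, mean, sd = fit_weibull(example_q)
print(f"shape = {shape:.2f}, scale = {scale:.2f}, mean = {mean:.2f}, sd = {sd:.2f}")
```

Unlike the three-point rule, this uses all seven quantiles, and the residuals from the fitted line give a rough sense of how well a Weibull describes a given subject's answers.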

Charles Manski is the king of nonparametric estimation over subjective expectations data.

I would argue that for any application in which the mean and variance are sufficient statistics to describe people's risk preferences over stock prices, you're making an implicit lognormal assumption anyway, so just bite the bullet and do it parametrically. Manski does cite a few methods of doing so without making parametric assumptions though.
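A minimal sketch of the parametric route, assuming a lognormal: under that model log(Q_p) = mu + sigma * z_p, where z_p is the standard normal quantile, so regressing the log quantiles on z_p gives mu and sigma from all seven points. Least squares is only one way to do the fit, and the function name and example quantiles are made up for illustration.

```python
import math
from statistics import NormalDist

PROBS = [0.05, 0.15, 0.25, 0.50, 0.75, 0.85, 0.95]


def fit_lognormal(quantiles, probs=PROBS):
    """Least-squares lognormal fit to quantiles; returns (mu, sigma, mean, sd).

    Assumes all quantiles are positive.
    """
    zs = [NormalDist().inv_cdf(p) for p in probs]  # standard normal quantiles
    logs = [math.log(q) for q in quantiles]
    n = len(zs)
    z_bar = sum(zs) / n
    l_bar = sum(logs) / n
    szz = sum((z - z_bar) ** 2 for z in zs)
    szl = sum((z - z_bar) * (l - l_bar) for z, l in zip(zs, logs))
    sigma = szl / szz                 # slope
    mu = l_bar - sigma * z_bar        # intercept
    mean = math.exp(mu + sigma ** 2 / 2.0)
    sd = mean * math.sqrt(math.exp(sigma ** 2) - 1.0)
    return mu, sigma, mean, sd


# Hypothetical elicited quantiles for one stock (illustrative values only).
example_q = [70.0, 82.0, 90.0, 100.0, 110.0, 118.0, 130.0]
mu, sigma, mean, sd = fit_lognormal(example_q)
print(f"mu = {mu:.3f}, sigma = {sigma:.3f}, mean = {mean:.2f}, sd = {sd:.2f}")
```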

OK, here's a Manski piece specifically on equity expectations. He asks people to specify the quantile at specific cash amounts rather than give the cash amount at a quantile, but the fitting problem is going to be more or less the same.