From 1992. It’s a discussion of a paper by Donoho, Johnstone, Hoch, and Stern. As I summarize:

Under the “nearly black” model, the normal prior is terrible, the entropy prior is better and the exponential prior is slightly better still. (An even better prior distribution for the nearly black model would combine the threshold and regularization ideas by mixing a point mass at 0 with a proper distribution on [0, infinity].) Knowledge that an image is nearly black is strong prior information that is not included in the basic maximum entropy estimate.
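The mixture prior suggested there can be sketched numerically: a point mass at 0 mixed with an exponential slab on [0, infinity), with the posterior mean computed by grid integration. This is a minimal illustration of the idea, not an estimator from the paper; the mixing weight, exponential rate, and noise level below are made-up values.

```python
import numpy as np

def posterior_mean(y, w=0.9, lam=1.0, sigma=1.0, grid=None):
    """Posterior mean of theta given y ~ N(theta, sigma^2) under a
    spike-and-slab prior: theta = 0 with probability w, otherwise
    theta ~ Exponential(lam). The slab integral is done numerically
    on a grid (illustrative sketch only)."""
    if grid is None:
        grid = np.linspace(1e-6, 20.0, 20000)
    dz = grid[1] - grid[0]
    lik0 = np.exp(-0.5 * (y / sigma) ** 2)            # likelihood at theta = 0
    lik = np.exp(-0.5 * ((y - grid) / sigma) ** 2)    # likelihood on the slab grid
    slab = lam * np.exp(-lam * grid)                  # Exponential(lam) density
    m0 = w * lik0                                     # marginal mass from the spike
    m1 = (1 - w) * np.sum(lik * slab) * dz            # marginal mass from the slab
    return (1 - w) * np.sum(grid * lik * slab) * dz / (m0 + m1)
```

Small observations get shrunk almost all the way to zero (the threshold behavior), while large observations are left nearly alone, which is the combination of threshold and regularization ideas mentioned above.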

Overall I liked the Donoho et al. paper, but I was a bit disappointed in their response to me. To be fair, the paper drew lots of comments and I guess the authors didn’t have much time to read each one, but I still don’t think they got my main point, which was that the Bayesian approach was a pretty direct way to get most of the way to their findings. To put it another way, that paper had a lot to offer (and of course those authors followed it up with lots of other hugely influential work), but I think there was value right away in thinking about the different estimates in terms of prior distributions, rather than treating the Bayesian approach as a sort of sidebar.

Some will no doubt think this post is about Astrophysics from the title.

I love that Jaynes [1987] paper you reference; see here: http://bayes.wustl.edu/etj/articles/cchirp.pdf. The derivation of the periodogram (and a generalization) from first principles, rather than as an intuitive ad hoc device, is great.
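The derivation can be illustrated numerically: for a single sinusoid in white noise, the Schuster periodogram turns out to be the key statistic in the posterior for the frequency, so picking the periodogram peak is (approximately) the Bayesian point estimate. Here is a minimal sketch; the signal frequency, sample size, and noise level are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 512
t = np.arange(n)
f_true = 0.1                        # cycles per sample (illustrative value)
y = np.cos(2 * np.pi * f_true * t) + 0.5 * rng.standard_normal(n)

# Schuster periodogram: I(f) = |sum_t y_t exp(-2*pi*i*f*t)|^2 / n.
# In Jaynes's analysis this appears in the posterior for the frequency
# of a single sinusoid in white noise, rather than being postulated ad hoc.
freqs = np.fft.rfftfreq(n)
I = np.abs(np.fft.rfft(y)) ** 2 / n
f_hat = freqs[np.argmax(I[1:]) + 1]  # skip the DC bin
```

With this much signal the periodogram peak lands within one frequency bin of the true value.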

The paper is pregnant with ideas, only some of which have been followed up and published. I know for a fact that some were followed up, used heavily, and never published for one reason or another.

It’s also a very philosophical paper, which clearly distinguishes the frequentist from the Bayesian understanding of probability and shows explicitly what the consequences are for the problems being attacked.

Finally, it’s possible to get problems out of this sort of thing where the noise (likelihood) is about as important as the prior (or maxent construction, in some cases). Leaving either one out gives demonstrably bad/unusable answers, so you’re forced to consider both carefully.

Gelman might like that footnote at the bottom of page 10:

“Likewise, in calculating within a given problem, the orthodoxian does not question the correctness of his sampling distribution, and we do not criticize him for this. But it is for him, as for us, only a tentative working hypothesis; having finished the calculation, his unhappiness at the result may lead him to consider a different sampling distribution. The Bayesian has an equal right to do this.”

That footnote was in reference to a paragraph that Gelman might also have liked:

“Of course, to answer a recent criticism (Tukey, 1984), in setting up a model the Bayesian – like any other theoretician – is only formulating a working hypothesis, to find out what its consequences would be. He has not thereby taken a vow of theological commitment to believe it forever in the face of all new evidence.”

That discussion on pages 2–3 about the relation of AR models to the Mittag-Leffler theorem from complex analysis was a real eye-opener as well when I was a young student of time series analysis.

That historical discussion of Oceanographic Chirp was great as well, correcting some insinuations of Tukey’s (Appendix A).

Oceanographic Chirp involves using measured chirped signals to detect storms thousands of miles away.

There is always scope for better priors; in some models of this type, like those considered in the Johnstone and Silverman Needles in Haystacks paper, an attractive option is empirical Bayes using the Kiefer-Wolfowitz MLE for the mixture model. There is a very brief sketch of this approach here: http://onlinelibrary.wiley.com/doi/10.1002/sta4.38/abstract, including some simulation comparisons with a variety of Bayesian and non-Bayesian alternatives.
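A common way to approximate the Kiefer-Wolfowitz MLE is to restrict the mixing distribution to a fixed grid of support points and fit the weights, here by EM. This is a rough sketch of that discretization, not the method in the linked paper (which uses convex optimization machinery); the grid size and iteration count are arbitrary choices.

```python
import numpy as np

def npmle_grid(y, sigma=1.0, grid=None, iters=200):
    """Approximate Kiefer-Wolfowitz nonparametric MLE of the mixing
    distribution G in the model y_i ~ N(theta_i, sigma^2), theta_i ~ G,
    with G restricted to a fixed grid and weights fit by EM (a sketch).
    Returns empirical Bayes posterior means, the grid, and the weights."""
    y = np.asarray(y, float)
    if grid is None:
        grid = np.linspace(y.min(), y.max(), 100)
    w = np.full(len(grid), 1.0 / len(grid))            # initial mixing weights
    L = np.exp(-0.5 * ((y[:, None] - grid[None, :]) / sigma) ** 2)
    for _ in range(iters):
        post = L * w                                   # n x m responsibilities
        post /= post.sum(axis=1, keepdims=True)
        w = post.mean(axis=0)                          # EM update for the weights
    post = L * w
    means = (post * grid).sum(axis=1) / post.sum(axis=1)
    return means, grid, w
```

On sparse "needles in haystacks" data the fitted mixture concentrates most of its mass near zero, so the resulting posterior means shrink the null observations hard while leaving the large ones roughly alone.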