Bill Harris has a fun little calculation of a conditional probability using three different data sources. Could be a good example for teaching intro probability or basic Bayesian inference.

Posted by Andrew on 18 December 2006, 1:03 am

## Comments

I was quite impressed with Eliezer Yudkowsky's mini-course at http://yudkowsky.net/bayes/bayes.html

I'd prefer it if Yudkowsky's page were called "Bayesian inference for binary inference." It's fine, but it doesn't capture most of what I see as applied Bayesian inference; see here.

I posted something about this earlier, re: Validation of Software by Cook et al.

But if you “encode” the joint distribution from Yudkowsky's first example as a “data set”

R code:

> datajoint
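The R snippet above is truncated, so here is a hedged sketch of what "encoding the joint distribution as a data set" might look like. It assumes Yudkowsky's first example is his well-known mammography problem (1% prevalence, 80% sensitivity, 9.6% false-positive rate); the names `joint` and `posterior` are my own, not from the original code. With only two binary variables, the joint distribution fits in a four-row table, and Bayes' theorem is just row selection plus renormalization:

```python
# A joint distribution encoded as a "data set": one row per (cancer, test)
# cell, with a probability weight. Numbers follow Yudkowsky's mammography
# example (1% prevalence, 80% sensitivity, 9.6% false-positive rate).
joint = [
    # (has_cancer, test_positive, probability)
    (True,  True,  0.01 * 0.80),   # cancer, positive test
    (True,  False, 0.01 * 0.20),   # cancer, negative test
    (False, True,  0.99 * 0.096),  # no cancer, positive test
    (False, False, 0.99 * 0.904),  # no cancer, negative test
]

def posterior(joint, test_positive=True):
    """Condition the joint on the observed test result: keep the matching
    rows and renormalize their weights."""
    matched = [(c, p) for c, t, p in joint if t == test_positive]
    total = sum(p for _, p in matched)
    return {c: p / total for c, p in matched}

post = posterior(joint, test_positive=True)
print(round(post[True], 4))  # P(cancer | positive test) = 0.0776
```

The punchline is the usual one: even after a positive test, the probability of cancer is only about 7.8%, because the 9.6% false-positive rate applied to the 99% healthy population swamps the true positives.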

Keith,

Could you expand on that "Bayes theorem (a.k.a. Nearest Neighbors)" comment? That's a spin on Bayes theorem I've not seen before.

Thanks

Bill

It's just a "rose by another name does not smell as sweet" comment.

Some would argue that Bayesian analysis simply involves a joint distribution of parameters θ (the unknowns) and observations x (random variables drawn from a probability distribution given the unknown values of θ). With the joint distribution of (x, θ), a basic "axiom of inference" then says that probability statements about θ should be based on the conditional distribution of θ given the observed data xO, otherwise known as the posterior distribution of θ and here denoted π(θ | xO). This is a particular application of conditional probability in a two-stage system: we observe the outcome from the second stage (the data xO) and want to make statements about the concealed outcome from the first stage (the unknown parameters θ). This application is commonly referred to as Bayes theorem, and it could be viewed as the key step that distinguishes a Bayesian analysis. For instance, see "Optimality and computations for relative surprise inferences," M. Evans, I. Guttman and T. Swartz, Canadian Journal of Statistics, Vol. 34, No. 1, 2006, pp. 113-129.

Now, if the joint probability is specified, exactly or approximately, as a data set, this conditioning step is the same as doing Nearest Neighbors. Will this renaming make Bayes theorem less mysterious for some? Would it be worthwhile to introduce students to Bayesian statistics with some minimalistic examples where joint distributions could be coded as data sets? (As Andrew pointed out, this is only one step of an applied Bayesian analysis.)

Keith
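Keith's nearest-neighbors reading can be made concrete with a minimal sketch. The coin-flip model below is my own illustration, not from the comment: the joint distribution of (θ, x) is encoded approximately as a data set of simulated draws, and conditioning on the observed x is literally just keeping the matching rows.

```python
import random

random.seed(1)

# Encode the joint distribution of (theta, x) approximately as a data set of
# draws: theta from the prior, x from p(x | theta). Toy model (an assumption
# for illustration): theta ~ Uniform(0, 1) is a coin's bias, and x is the
# number of heads in 10 flips.
N, flips = 100_000, 10
rows = []
for _ in range(N):
    theta = random.random()
    x = sum(random.random() < theta for _ in range(flips))
    rows.append((theta, x))

# Conditioning on the observed data is then "nearest neighbors": keep the
# rows of the data set whose x matches what we actually saw.
x_obs = 8
kept = [theta for theta, x in rows if x == x_obs]

# The retained thetas approximate the posterior distribution of theta.
print(round(sum(kept) / len(kept), 2))
```

With a Uniform(0, 1) prior and 8 heads in 10 flips, the exact posterior is Beta(9, 3) with mean 9/12 = 0.75, so the mean of the retained thetas should land close to 0.75. For continuous x one would keep rows with x near the observed value rather than exactly equal, which is where the nearest-neighbors analogy (and, in modern terms, approximate Bayesian computation) comes in.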