Stan Down Under

I (Bob, not Andrew) am in Australia until April 30. I’ll be giving some Stan-related talks and some data annotation talks, several of which have yet to be concretely scheduled. I’ll keep this page updated with what I’ll be up to. All of the talks other than the summer school sessions will be open to the public (the meetups will probably require registration).

Sydney (now — 31 March)
  • 19 February: Bayesian Inference and MCMC. Machine Learning Summer School (NICTA).
  • 20 February: A Practical Introduction to Stan. Machine Learning Summer School (NICTA).
  • 4 March, 11 AM–Noon: The Benefits of a Probabilistic Model of Data Annotation. Macquarie Uni Computer Science Dept.
  • 5 March: Stan: A Probabilistic Programming Language. NICTA Sydney (broadcast to other NICTA offices).
  • 10 March, 2–3 PM. Stan: A Probabilistic Programming Language. Macquarie Uni Statistics Dept. Building E4A, room 523.
  • 11 March, 6 PM, Mitchell Theatre, Level 1 at SMSA (Sydney Mechanics’ School of Arts): RStan: Bayesian Inference Made Easy. Register now via the Sydney Users of R Forum (SURF) Meetup.
  • 24 March, 1–5 PM: RStan Hands-on Workshop. Statistical Society of Australia, NSW Branch.
  • 30 March, 11 AM: The Benefits of a Probabilistic Model of Data Annotation. U. Sydney Computer Science.
  • 30 March, 2 PM: Stan: A Probabilistic Programming Language. U. Sydney Statistics.
Melbourne (1 April — 30 April)
  • 9 April, 1 PM: Stan: Bayesian Inference Made Easy. Burwood Corporate Centre, Level 2 Building BC, Deakin University, 221 Burwood Highway, Burwood. Room details available on the day by asking at reception.
  • 14 April, 12 Noon: A Probabilistic Model of Data Annotation. Uni of Melbourne, Computer Science.
  • 28 April, TBD: Probabilistic Modelling and Inference with Stan. Statistical Society of Australia, Victorian Branch.

I’m also happy to meet one-on-one to discuss computation or modeling. If you’d like to get together, please e-mail me at [email protected].

Data Annotation Talk

The data annotation talks will be based largely on a paper with Becky Passonneau, in which we use Mechanical-Turk-generated labels for the dictionary meaning intended by thousands of word instances in context; the data is all open access, as is the code to reproduce the paper.

I’ll extend the paper by talking about Bayesian approaches to hierarchical modeling (pushback from earlier referees led us to use penalized MLE in the paper), jointly estimating a model and a gold standard, modeling item annotation difficulty, and a few other things, including a philosophical discussion of whether the truth is really out there. This is the modeling problem that made me realize I needed to learn Bayesian stats properly, and it led to my working with Andrew in the first place. He and Jennifer Hill helped me develop a basic latent gold-standard multinomial model to adjust for annotator inaccuracy and bias, though it turns out I was scooped by Dawid and Skene (1979).
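The core of the Dawid–Skene-style model is simple: each item has a latent true label, and each annotator has a confusion matrix giving the probability of reporting each label given the true one. Here's a minimal illustrative sketch fit by plain EM — not the paper's actual code, and simpler than both the penalized MLE in the paper and the Bayesian extensions I'll discuss; the function name, toy smoothing, and data layout are all my own choices.

```python
# Sketch of a Dawid & Skene (1979) style annotation model, fit by EM.
# Each item i has a latent true class; each annotator a has a confusion
# matrix conf[a][k][l] = P(annotator a reports l | true class is k).
from collections import defaultdict

def dawid_skene(labels, n_items, n_annotators, n_classes, n_iter=50):
    """labels: list of (item, annotator, observed_class) triples.
    Returns (post, conf): post[i][k] is P(true class of item i is k);
    conf[a][k][l] is annotator a's estimated confusion matrix."""
    # Initialize item posteriors from majority vote.
    post = [[1.0 / n_classes] * n_classes for _ in range(n_items)]
    votes = defaultdict(lambda: [0] * n_classes)
    for i, a, l in labels:
        votes[i][l] += 1
    for i in range(n_items):
        total = sum(votes[i])
        if total:
            post[i] = [c / total for c in votes[i]]
    for _ in range(n_iter):
        # M-step: class prevalence and per-annotator confusion matrices,
        # with add-one smoothing to keep estimates away from zero.
        prev = [sum(post[i][k] for i in range(n_items))
                for k in range(n_classes)]
        z = sum(prev)
        prev = [p / z for p in prev]
        conf = [[[1.0] * n_classes for _ in range(n_classes)]
                for _ in range(n_annotators)]
        for i, a, l in labels:
            for k in range(n_classes):
                conf[a][k][l] += post[i][k]
        for a in range(n_annotators):
            for k in range(n_classes):
                row = sum(conf[a][k])
                conf[a][k] = [c / row for c in conf[a][k]]
        # E-step: posterior over each item's true class given the
        # current prevalence and confusion estimates.
        for i in range(n_items):
            post[i] = list(prev)
        for i, a, l in labels:
            for k in range(n_classes):
                post[i][k] *= conf[a][k][l]
        for i in range(n_items):
            z = sum(post[i]) or 1
            post[i] = [p / z for p in post[i]]
    return post, conf
```

The inferred posteriors play the role of an estimated gold standard, and the confusion matrices directly quantify each annotator's accuracy and bias — which is exactly what majority voting throws away.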

Stan Talks

The Stan talks are, of course, based on Stan and RStan.

Stan SEO Down Under

I just searched [stan] on google.com.au from a fresh SIM card in Sydney.

Good news: We’re ahead of Eminem.

Bad news: We’re behind the streaming media service (the Netflix of Australia). So far behind, we’re not even on the first page of hits (we were the first hit on page 2).

What happened? I thought Google had added results diversity so that, for example, you don’t get 1000 hits for Michael Jordan the basketball player before the first hit for Michael Jordan the computer scientist (or maybe vice versa these days?). Here’s the seminal paper on results diversity from my old colleague Jaime Carbonell and crew, from way back in 1998. The basic idea is that each new result should balance similarity to the query against difference from earlier results.
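That balancing act — Carbonell and Goldstein's maximal marginal relevance (MMR) — can be sketched in a few lines. This is a toy illustration on hand-rolled cosine similarity over term vectors, not code from the paper; the function names and the example weights are my own.

```python
# Maximal marginal relevance (MMR), sketched: greedily pick the next
# result to maximize lam * sim(doc, query) minus (1 - lam) times its
# similarity to the most similar already-selected result.
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def mmr(query, docs, k, lam=0.5):
    """Return indices of k docs chosen by maximal marginal relevance.
    lam = 1 is pure relevance ranking; smaller lam rewards diversity."""
    selected = []
    candidates = list(range(len(docs)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(docs[i], query)
            # Redundancy: similarity to the closest result already chosen.
            redundancy = max((cosine(docs[i], docs[j]) for j in selected),
                             default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected
```

With term-count vectors where two documents are near-duplicates and a third is less relevant but different, MMR picks the most relevant document first and then jumps to the diverse one — which is presumably why a search for [stan] shouldn't return a full page about a single streaming service.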

Of course, we could’ve followed Hadley’s advice and called our system McMcWalla2 or something like that.

You can help boost our rankings — just link to mc-stan.org from your .au web page!