Over the years I’ve written a dozen or so journal articles that have appeared with discussions, and I’ve participated in many published discussions of others’ articles as well. I get a lot out of these article-discussion-rejoinder packages, in all three of my roles as reader, writer, and discussant. Part 1: The story of an unsuccessful [...]
From an email I received the other day: Things are going much better now — it’s interesting, it feels like with both of my models, parameters are slow to converge or get “stuck” and have trouble mixing when the model is somehow misspecified. See here for a statement of the folk theorem.
Upon reading this note by John Cook on continued fractions, I wrote: If you like continued fractions, I recommend you read the relevant parts of the classic Numerical Methods That Work. The details are probably obsolete but it’s fun reading (at least, if you think that sort of thing is fun to read). I then [...]
Zhiqiang Tan writes: I have created an R package to implement the full likelihood method in Kong et al. (2003). The method can be seen as a binless extension of so-called Weighted Histogram Analysis Method (UWHAM) widely used in physics and chemistry. The method has also been introduced to the physics literature and called the [...]
In response to the latest controversy, a statistics professor writes: It’s somewhat surprising to see Very Serious Researchers (apologies to Paul Krugman) using Excel. Some years ago, I was consulting on a trademark infringement case and was trying (unsuccessfully) to replicate another expert’s regression analysis. It wasn’t until I had the brainstorm to use Excel [...]
Xi’an’s Og (aka Christian Robert’s blog) is featuring a very nice presentation of NUTS by Marco Banterle, with discussion and some suggestions. I’m not even sure how they found Michael Betancourt’s paper on geometric NUTS — I don’t see it on the arXiv yet, or I’d provide a link.
This post is by Phil A recent post on this blog discusses a prominent case of an Excel error leading to substantially wrong results from a statistical analysis. Excel is notorious for this because it is easy to add a row or column of data (or intermediate results) but forget to update equations so that [...]
The Stan Development Team is happy to announce that Stan 1.3.0 and RStan 1.3.0 are available for download. Follow the links on: Stan home page: http://mc-stan.org/ Please let us know if you have problems updating. Here’s the full set of release notes. v1.3.0 (12 April 2013) ====================================================================== Enhancements ———————————- Modeling Language * forward sampling (random [...]
Michael Betancourt will be speaking at Google and at the University of California, Berkeley. The Google talk is closed to outsiders (but if you work at Google, you should go!); the Berkeley talk is open to all: Friday March 22, 12:10 pm, Evans Hall 1011. Title of talk: Stan: Practical Bayesian Inference with Hamiltonian Monte [...]
Someone who wishes to remain anonymous writes:
Felipe Osorio made the above video to help people use the General Social Survey and R to answer research questions in social science. Go for it! Meanwhile, Tom Smith reports: The initial release of the General Social Survey (GSS), cumulative file for 1972-2012 is now on our website. Codebooks and copies of questionnaires will be [...]
Stan 1.2.0 and RStan 1.2.0 are now available for download. See: http://mc-stan.org/ Here are the highlights. Full Mass Matrix Estimation during Warmup Yuanjun Gao, a first-year grad student here at Columbia (!), built a regularized mass-matrix estimator. This helps for posteriors with high correlation among parameters and varying scales. We’re still testing this ourselves, so [...]
Michael Betancourt will be speaking at UCLA: The location for refreshment is in room 51-254 CHS at 3:00 PM. The place for the seminar is at CHS 33-105A at 3:30pm – 4:30pm, Wed 6 Mar. ["CHS" stands for Center for Health Sciences, the building of the UCLA schools of medicine and public health. Here's a [...]
Stan is written in C++ and can be run from the command line and from R. We’d like for Python users to be able to run Stan as well. If anyone is interested in doing this, please let us know and we’d be happy to work with you on it. Stan, like Python, is completely [...]
Following up on our previous post, Andrew Wilson writes: I agree we are in a really exciting time for statistics and machine learning. There has been a lot of talk lately comparing machine learning with statistics. I am curious whether you think there are many fundamental differences between the fields, or just superficial differences — [...]
David Duvenaud writes: I’ve been following your recent discussions about how an AI could do statistics [see also here]. I was especially excited about your suggestion for new statistical methods using “a language-like approach to recursively creating new models from a specified list of distributions and transformations, and an automatic approach to checking model fit.” [...]
Join Dirk Eddelbuettel for six hours of detailed and hands-on instructions and discussions around Rcpp, RInside, RcppArmadillo, RcppGSL and other packages . . . Rcpp has become the most widely-used language extension for R. Currently deployed by 103 CRAN packages and a further 10 BioConductor packages, it permits users and developers to pass “whole R [...]
Tiago Fragoso writes: Suppose I fit a two stage regression model Y = a + bx + e a = cw + d + e1 I could fit it all in one step by using MCMC for example (my model is more complicated than that, so I’ll have to do it by MCMC). However, I [...]
Burak Bayramli writes: I wanted to inform you on iPython Notebook technology – allowing markup, Python code to reside in one document. Someone ported one of your examples from ARM. iPynb file is actually a live document, can be downloaded and reran locally, hence change of code on document means change of images, results. Graphs [...]
We just released Stan 1.1.1 and RStan 1.1.1 As usual, you can find download and install instructions at: http://mc-stan.org/ This is a patch release and is fully backward compatible with Stan and RStan 1.1.0. The main thing you should notice is that the multivariate models should be much faster and all the bugs reported for [...]
Too many MC’s not enough MIC’s, or What principles should govern attempts to summarize bivariate associations in large multivariate datasets?
Justin Kinney writes: Since your blog has discussed the “maximal information coefficient” (MIC) of Reshef et al., I figured you might want to see the critique that Gurinder Atwal and I have posted. In short, Reshef et al.’s central claim that MIC is “equitable” is incorrect. We [Kinney and Atwal] offer mathematical proof that the [...]
Sharad Goel, Jake Hofman, and Sergei Vassilvitskii are teaching this awesome class on computational social science this semester in the applied math department at Columbia. Here’s the course info. You should take this course. These guys are amazing.
Richard Morey writes: You and your blog readers may be interested to know that a we’ve released a major new version of the BayesFactor package to CRAN. The package computes Bayes factors for linear mixed models and regression models. Of course, I’m aware you don’t like point-null model comparisons, but the package does more than [...]
We had a recent discussion about statistics packages where people talked about the structure and capabilities of different computer languages. One thing I wanted to add to this discussion is some sociology. To me, a statistics package is not just its code, it’s also its community, it’s what people do with it. R, for example, [...]
Tyler Cowen links to a post by Sean Taylor, who writes the following about users of R: You are willing to invest in learning something difficult. You do not care about aesthetics, only availability of packages and getting results quickly. To me, R is easy and Sas is difficult. I once worked with some students [...]