Showdown in Vegas: When the numbers differ in the third decimal place

Posted on December 30, 2015 9:22 AM by Andrew

From the Stan users list:

I have just started to look into the output of the optimizing function and it seems to give estimates slightly different than the ones that I had previously obtained through maximum likelihood estimation (using MATLAB). Can you please tell me what is the penalty that the LBFGS algorithm imposes?

In addition, is there any way to perform “optimizing” for maximum likelihood estimation and not penalized maximum likelihood estimation?

The second question was easy to answer: When set to optimize, Stan maximizes whatever function is given. If you give it the log-likelihood, Stan will give you the maximum likelihood estimate (or, more precisely, a local mode, or perhaps multiple modes if you run many chains). If you give it a penalized log-likelihood, Stan will give you the penalized maximum likelihood estimate.
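
To make the distinction concrete, here is a minimal Python sketch. It is not Stan code: the data `y`, the penalty scale `tau`, and the helper `maximize` are all made up for illustration. The same generic maximizer is handed two objectives, a normal log-likelihood for a mean and that same log-likelihood plus a normal(0, tau) penalty, and it simply returns the maximizer of whichever one it gets.

```python
import math

def maximize(f, lo, hi, tol=1e-10):
    """Golden-section search for the maximizer of a unimodal f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) > f(d):
            b = d   # the maximum lies in [a, d]
        else:
            a = c   # the maximum lies in [c, b]
    return 0.5 * (a + b)

y = [1.2, 0.8, 1.9, 1.4, 0.7]   # made-up data
tau = 1.0                        # made-up penalty scale

def log_lik(mu):
    # Normal(mu, 1) log-likelihood, dropping additive constants
    return -0.5 * sum((yi - mu) ** 2 for yi in y)

def penalized_log_lik(mu):
    # Same log-likelihood plus a normal(0, tau) log-density penalty on mu
    return log_lik(mu) - 0.5 * (mu / tau) ** 2

mle = maximize(log_lik, -10.0, 10.0)
pmle = maximize(penalized_log_lik, -10.0, 10.0)
print(mle, pmle)
```

For these numbers the closed-form answers are the sample mean, 1.2, for the unpenalized objective and 6/(5 + 1) = 1.0 for the penalized one. The search should reproduce both up to its tolerance. The penalty lives entirely in the objective, not in the optimizer.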

The first question was more interesting and provoked some discussion of the possibility that the difference between the two sets of estimates came from issues of machine precision.

Later the person posted an update:

The error was very small, but bigger than what the tolerance could account for, but I have just noticed that I had indeed a small difference between models.

I am getting exactly the same values now.

This sort of thing has happened to me many times. I do some calculation in two different ways and the answers differ in the third decimal place. It’s never machine precision, it’s always some coding error or data issue.
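
For a rough sense of scale, here is a small Python sketch (the data and the helper `naive_sum` are invented for illustration) that adds the same 100,000 numbers in two different orders. Rounding makes the two totals differ, but the relative difference is bounded by roughly the number of terms times machine epsilon, about 1e-11 here, which is nowhere near the third decimal place.

```python
import math
import random

def naive_sum(values):
    """Plain left-to-right floating-point accumulation."""
    total = 0.0
    for v in values:
        total += v
    return total

random.seed(0)  # arbitrary seed so the run is repeatable
xs = [random.uniform(0.0, 1.0) for _ in range(100_000)]

forward = naive_sum(xs)              # one order of operations
backward = naive_sum(reversed(xs))   # same numbers, opposite order
reference = math.fsum(xs)            # accurately rounded sum for comparison

rel_diff = abs(forward - backward) / reference
print(f"forward   = {forward!r}")
print(f"backward  = {backward!r}")
print(f"rel. diff = {rel_diff:.3e}")
```

Badly conditioned calculations can amplify rounding error far beyond this, so the sketch is a sanity check on magnitudes rather than a guarantee. For a well-behaved calculation, though, a discrepancy in the third decimal place is a reason to go looking for a difference in the model, the code, or the data.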

4 thoughts on “Showdown in Vegas: When the numbers differ in the third decimal place”

Craig M on December 30, 2015 at 1:45 pm said:

Sometimes it’s a coding error, just not the user’s. Long ago, I was using a (pricey) commercial software package to verify some of the defendant’s parameter estimates for a given model and dataset as part of a multi-million dollar environmental damage lawsuit. The package was a newer version of the product which gave the original estimates, and the hardware platform was different. I found a discrepancy which was obviously bigger than machine precision, and after an embarrassing amount of time looking for any mistake I might have made, I tested the results using a different software package (duh). I contacted the tech support people, and less than a day later I received an email with a new software module, and suddenly the estimates matched. When I asked the tech support people what the issue was, I was told essentially “nevermind, nothing to see here, your problem is fixed”.

He was right. I stopped using their product for other work, because I couldn’t trust results after that appalling lack of transparency. My problem was solved, but not in the way they thought it was.

Bob Carpenter on December 30, 2015 at 2:19 pm said:

The answer to the first question is that it’s not L-BFGS, but the Stan model that imposes the penalty (if any).

Bob Carpenter on December 30, 2015 at 2:25 pm said:

As The Pragmatic Programmer puts it, ‘select’ isn’t broken.

But as Craig M. points out, sometimes it is. I wrote a blog post years ago detailing some counterexamples I found with various Java releases.

Andrew probably didn’t follow the ongoing discussion on stan-dev about arithmetic precision. It turns out the Intel compiler generates sloppy code that falls short of full double precision and doesn’t quite implement all the edge cases in the usual way. This causes a whole bunch of unit tests in Stan to break. In the end, we loosened the tolerance on unit tests and if-def-ed our way around the problem on the testing side, because we wanted the tests to pass in the Intel compiler (arguably that was the wrong thing to do: the tests were correct as written, it’s just that the Intel compiler didn’t live up to the precision of the other compilers like g++ and clang++ and MSVC).

The other time we had this problem was on Windows — we found just slightly different behavior on an HMC run that gradually drifted, then finally drifted far enough that a draw was rejected, and then the results got really different. Again, not much to do here other than modify the tests — floating point isn’t an exact science in C++ code.
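
A toy illustration of that kind of drift, offered as an analogy only: this is not Stan's HMC, and the logistic map, the starting value, and the helper names below are stand-ins chosen for illustration. Two runs of a chaotic iteration start 1e-15 apart, stay essentially identical for a while, and then separate completely, much as a run can go its own way once a single accept/reject decision flips.

```python
def logistic_step(x):
    """One step of the logistic map x -> 4x(1 - x), a standard chaotic iteration."""
    return 4.0 * x * (1.0 - x)

def first_divergence(x0, eps, threshold=0.1, max_steps=200):
    """Return the first step at which two runs started eps apart differ by
    more than threshold, or None if that never happens within max_steps."""
    a, b = x0, x0 + eps
    for step in range(1, max_steps + 1):
        a, b = logistic_step(a), logistic_step(b)
        if abs(a - b) > threshold:
            return step
    return None

step = first_divergence(0.2, 1e-15)
print(step)
```

The gap at most quadruples per step, so the two runs agree closely for the first several iterations before the difference becomes visible.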

Chris G on December 30, 2015 at 10:06 pm said:

> This sort of thing has happened to me many times. I do some calculation in two different ways and the answers differ in the third decimal place. It’s never machine precision, it’s always some coding error or data issue.

Ditto.


Statistical Modeling, Causal Inference, and Social Science
