From the Stan users list:
I have just started to look into the output of the optimizing function and it seems to give estimates slightly different from the ones that I had previously obtained through maximum likelihood estimation (using MATLAB). Can you please tell me what penalty the L-BFGS algorithm imposes?
In addition, is there any way to perform “optimizing” for maximum likelihood estimation and not penalized maximum likelihood estimation?
The second question was easy to answer: when set to optimize, Stan maximizes whatever function you give it. If you give it the log-likelihood, Stan will return the maximum likelihood estimate (or, more precisely, a local mode, and perhaps different modes if you optimize from different starting points). If you give it a penalized log-likelihood, Stan will return the penalized maximum likelihood estimate. The optimizer imposes no penalty of its own.
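To make the distinction concrete, here is a minimal sketch, not using Stan itself, of how a penalty term shifts the mode. The data values and the penalty weight `lam` are made up for illustration; the penalty `-0.5 * lam * mu**2` corresponds to a normal prior on `mu` centered at zero, and for this simple normal model both modes have closed forms:

```python
# Hypothetical example: MLE vs. penalized MLE for the mean of a normal
# with known unit variance. All numbers here are invented for illustration.

def log_lik(mu, x):
    # Log-likelihood of N(mu, 1) data, up to an additive constant.
    return -0.5 * sum((xi - mu) ** 2 for xi in x)

def log_penalty(mu, lam):
    # Quadratic penalty, i.e. a normal(0, 1/sqrt(lam)) prior up to a constant.
    return -0.5 * lam * mu ** 2

x = [2.1, 1.9, 2.4, 2.0, 1.6]
n, lam = len(x), 1.0

mle = sum(x) / n           # maximizes log_lik alone
pmle = sum(x) / (n + lam)  # maximizes log_lik + log_penalty

print(mle, pmle)  # the penalized estimate is shrunk toward zero
```

If the two estimates you are comparing differ, one natural first check is whether one objective function includes a term like `log_penalty` that the other does not.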
The first question was more interesting and provoked some discussion of the possibility that the differences in the algorithms came from issues of machine precision.
Later the person posted an update:
The error was very small but bigger than the tolerance could account for. I have just noticed that I did indeed have a small difference between the models.
I am getting exactly the same values now.
This sort of thing has happened to me many times. I do some calculation in two different ways and the answers differ in the third decimal place. It's never machine precision; it's always some coding error or data issue.
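A quick back-of-the-envelope check, as a sketch, shows why: genuine double-precision round-off lives near machine epsilon (about 2.2e-16), roughly thirteen orders of magnitude below a third-decimal-place discrepancy.

```python
import sys

# A typical floating-point round-off error vs. a third-decimal discrepancy.
roundoff = abs((0.1 + 0.2) - 0.3)  # ~5.6e-17, the scale of double round-off
third_decimal = 1e-3               # the scale of discrepancy described above

print(roundoff)
print(third_decimal / roundoff)    # ~1e13: machine precision is not the culprit
print(sys.float_info.epsilon)      # 2.220446049250313e-16
```

So when two implementations of the same calculation disagree at the third decimal, the gap is far too large to blame on floating-point arithmetic alone; look for a modeling or data difference first.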