BridgeStan: Stan model log densities, gradients, Hessians, and transforms in R, Python, and Julia

We’re happy to announce the official 1.0.0 release of BridgeStan.

What is BridgeStan?

From the documentation:

BridgeStan provides efficient in-memory access through Python, Julia, and R to the methods of a Stan model, including log densities, gradients, Hessians, and constraining and unconstraining transforms.

BridgeStan should be useful for developing algorithms and deploying applications. It connects to R and Python through low-level foreign function interfaces (.C and ctypes, respectively) and is thus very performant and portable. It is also easy to install and keep up to date with Stan releases. BridgeStan adds the model-level functionality from RStan/PyStan that is not implemented in CmdStanR/CmdStanPy.
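To give a feel for the interface, here's a minimal Python sketch. The method names follow the BridgeStan documentation; the model library and data paths are hypothetical, and constructor details may vary by version.

```python
import numpy as np
import bridgestan as bs

# Load a compiled Stan model (shared library) with its JSON data.
# Paths here are hypothetical.
model = bs.StanModel("./bernoulli_model.so", "./bernoulli.data.json")

# A random point on the unconstrained scale.
theta_unc = np.random.default_rng(42).normal(size=model.param_unc_num())

# Log density and gradient, computed in one call.
lp, grad = model.log_density_gradient(theta_unc)

# Log density, gradient, and Hessian.
lp, grad, hess = model.log_density_hessian(theta_unc)

# Constraining and unconstraining transforms.
theta = model.param_constrain(theta_unc)
theta_unc_again = model.param_unconstrain(theta)
```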

Documentation and source code

Detailed forum post

Here’s a post on the Stan forums with much more detail:

Among other things, it describes the project's history and its relation to other Stan-related projects. Edward Roualdes started the project in order to access Stan models through Julia; then Brian Ward and I (mostly Brian!) helped Edward finish it, with some contributions from Nicholas Siccha and Mitzi Morris.

Stan downtown intern posters: scikit-stan & constraining transforms

It’s been a happening summer here at Stan’s downtown branch at the Flatiron Institute. Brian Ward and I advised a couple of great interns. A couple of weeks before the end of their internships, the interns present posters. Here are the ones from Brian’s intern Alexey and my intern Meenal.

Alexey Izmailov: scikit-stan

Alexey built a version of the scikit-learn API backed by Stan’s sampling, optimization, and variational inference. It’s plug-and-play with scikit-learn.
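Here's a hedged sketch of what that workflow looks like. The GLM estimator name follows the scikit_stan package, but treat the signature details as assumptions and check the package docs; the data here is synthetic, not from the poster.

```python
import numpy as np
from scikit_stan import GLM  # estimator name from the scikit_stan package

# Simulated regression data (hypothetical example).
rng = np.random.default_rng(1234)
X = rng.normal(size=(100, 3))
y = X @ np.array([0.5, -1.0, 2.0]) + rng.normal(scale=0.3, size=100)

# The familiar scikit-learn fit/predict pattern, with Stan underneath.
model = GLM(family="gaussian")
model.fit(X, y)
y_pred = model.predict(X)
```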

Meenal Jhajharia: unconstraining transforms

Meenal spent the summer exploring constraining transforms and how to evaluate them, with the goal of refining the performance of Stan’s transforms and adding new constrained data structures. This involved figuring out both what to evaluate and how: for a given target distribution, the convexity of the transformed density, its conditioning when convex, and sampling behavior in the tails, the body, and near the mode. The results are turning out to be more interesting than we suspected, in that different transforms seem to work better under different conditions. We’re also working with Seth Axen (Tübingen) and Stan devs Adam Haber and Sean Pinkney.
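For concreteness, a constraining transform maps an unconstrained real to a constrained parameter, and the change of variables requires a Jacobian adjustment to the log density. Stan’s scalar positivity transform, as described in the Stan reference manual, is a standard example of the kind of map being compared:

```latex
x = \exp(y) \in (0, \infty),
\qquad
\log p_Y(y)
  = \log p_X\!\bigl(\exp(y)\bigr)
    + \log\left|\frac{d}{dy}\exp(y)\right|
  = \log p_X\!\bigl(\exp(y)\bigr) + y.
```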

They don’t make undergrads like they used to

Did I mention they were undergrads? Meenal’s heading back to the University of Delhi to finish her senior year, and Alexey’s heading back to Brown to start his junior year! The other interns at the Center for Computational Mathematics, many of whom were undergraduates, have also done impressive work on everything from using normalizing flows to improve sampler proposals for molecular dynamics, to building scalable 2D surface PDE solvers, to HPC for large N-body problems. In this case, not making undergrads like they used to is a good thing!

Hiring for next summer

If you’re interested in working on statistical computing as an intern next summer, drop me a line at [email protected]. I’ll announce when applications are open here on the blog.

 

Summer internships at Flatiron Institute’s Center for Computational Mathematics

[Edit: Sorry to say this to everyone, but we’ve selected interns for this summer and are no longer taking applications. We’ll be taking applications again at the end of 2022 for positions in summer 2023.]

We’re hiring a crew of summer interns again this year. We’re looking for both undergraduates and graduate students. Here’s the ad.

I’m afraid the pay is low, but to make up for it, we cover travel, room, and most board (3 meals/day, 5 days/week). Also, there’s a large cohort of interns every summer across the five institutes at Flatiron (biology, astrophysics, neuroscience, quantum physics, and math), so there are plenty of peers with whom to socialize. Another plus is that we’re in a great location, on Fifth Avenue just south of the Flatiron Building (in the Flatiron neighborhood, which is a short walk to NYU in Greenwich Village and Google in Chelsea as well as to Times Square and the Hudson River Park).

If you’re interested in working on stats, especially applied Bayesian stats, Bayesian methodology, or Stan, please let me know via email at [email protected] so that I don't miss your application. We have two other Stan devs here, Yuling Yao (postdoc) and Brian Ward (software engineer).

We're also hiring full-time permanent research scientists at both the junior and senior levels, as well as postdocs and software engineers. For more on those positions, see my previous post on jobs at Flatiron. That post has lots of nice photos of the office, which is really great. Or check out Google's album of photos.

Naming conventions for variables, functions, etc.

The golden rule of code layout is that code should be written to be readable. And that means readable by others, including you in the future.

Three principles of naming follow:

1. Names should mean something.

2. Names should be as short as possible.

3. Use your judgement to balance (1) and (2).

The third one’s where all the fun arises. Do we use “i” or “n” for integer loop variables by convention? Yes, we do. Do we choose “inv_logit” or “inverse_logit”? Stan chose “inv_logit”. Do we choose “complex” or “complex_number”? C++ chose “complex”, and it also chose “imag” over “imaginary” for the method that pulls out the imaginary component.

Do we use names like “run_helper_function”, which is both long and provides zero clue as to what it does? We don’t if we want to do unto others as we’d have them do unto us.
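As a quick illustration of the trade-off (hypothetical functions, not from any real codebase):

```python
import math

# Violates principle (1): long *and* tells the reader nothing
# about what it computes.
def run_helper_function(x):
    return 1 / (1 + math.exp(-x))

# Balances (1) and (2): short, and the name says what it is.
# Stan's convention, as noted above, is "inv_logit" rather than
# "inverse_logit".
def inv_logit(x):
    return 1 / (1 + math.exp(-x))
```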

P.S. If the producers of Silicon Valley had asked me, Winnie would’ve dumped Richard after a fight about Hungarian notation, not tabs vs. spaces.

A Primer on Bayesian Multilevel Modeling using PyStan

Chris Fonnesbeck contributed our first PyStan case study (I wrote the abstract), in the form of a very nice Jupyter notebook. Daniel Lee and I had the pleasure of seeing him present it live as part of a course we were teaching at Vanderbilt last week.

A Primer on Bayesian Multilevel Modeling using PyStan

This case study replicates the analysis of home radon levels using the hierarchical models of Lin, Gelman, Price, and Krantz (1999). It illustrates how to generalize linear regression to hierarchical models with group-level predictors, and how to compare predictive inferences and evaluate model fit. Along the way, it shows how to get data into Stan using pandas, how to sample using PyStan, and how to visualize the results using seaborn.
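For a flavor of the kind of model the notebook builds up to, here's a condensed sketch (not Chris's exact code) of a varying-intercept model with a county-level uranium predictor, run through PyStan's 2.x interface. The data dictionary here is synthetic; the notebook shows how to assemble the real one with pandas.

```python
import numpy as np
import pystan  # PyStan 2.x interface

model_code = """
data {
  int<lower=1> N;                   // number of homes
  int<lower=1> J;                   // number of counties
  int<lower=1, upper=J> county[N];  // county of home n
  vector[N] floor_ind;              // 0 = basement, 1 = first floor
  vector[J] log_uranium;            // county-level (group-level) predictor
  vector[N] log_radon;              // outcome
}
parameters {
  vector[J] alpha;                  // varying county intercepts
  real beta;                        // floor effect
  real gamma0;                      // group-level intercept
  real gamma1;                      // group-level uranium slope
  real<lower=0> sigma;
  real<lower=0> sigma_alpha;
}
model {
  alpha ~ normal(gamma0 + gamma1 * log_uranium, sigma_alpha);
  log_radon ~ normal(alpha[county] + beta * floor_ind, sigma);
}
"""

# Synthetic stand-in for the radon data (the real data comes from pandas).
rng = np.random.default_rng(0)
N, J = 50, 5
radon_data = {
    "N": N, "J": J,
    "county": rng.integers(1, J + 1, size=N),
    "floor_ind": rng.integers(0, 2, size=N).astype(float),
    "log_uranium": rng.normal(size=J),
    "log_radon": rng.normal(size=N),
}

sm = pystan.StanModel(model_code=model_code)
fit = sm.sampling(data=radon_data, iter=2000, chains=4)
print(fit)
```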

As an added bonus, if you follow the link to the source repo on GitHub, you’ll find a Gaussian process case study. I haven’t even had time to look at it yet, but if it’s as nice as this radon study, it’ll be well worth checking out.


P.S. If you’re wondering what one of the core PyMC developers was doing writing PyStan examples, it’s because he invited us to teach a course on RStan at Vanderbilt to his biostatistics colleagues, who didn’t want to learn Python. It was extremely generous of him to put promoting good science ahead of promoting his own software! Part of our class was on Bayesian methods and how to code models in Stan, and Chris offered to do some case studies, which is what Andrew usually does when he’s the third instructor. Chris said he tried RStan, but then bailed and went back to Python, where he could use familiar and powerful tools like pandas, NumPy, and seaborn. It’s hard to motivate learning a whole new language and toolchain just to write one example. The benefit to us is that we now have a great PyStan example. Thanks, Chris!