
One simple trick to make Stan run faster

Did you know that Stan automatically runs in parallel (and caches compiled models) from R if you do this:

source("http://mc-stan.org/rstan/stan.R")

P.S. This capability is now built into the current version of rstan, which you can install from CRAN.

19 Comments

  1. Rahul says:

    Can someone elaborate on why the trick works? Sounds like an R idiosyncrasy.

  2. John Hall says:

    @Rahul the source command loads in the R code. It’s used when you keep functions in separate documents. In this case, the function is located at an email address. From what I can tell, the function is basically a version of the stan function set up to always run in parallel. Not sure how well it would work on Windows machines.

    • Rahul says:

      Thx. So then source sounds like the equivalent of the standard Linux shell’s source command?

    • Corey says:

      This version of the stan function uses “mclapply”, and on Windows, mclapply just calls (the standard non-parallelized function) “lapply” (unless you try to set the mc.cores argument to a value greater than one, in which case it throws an error).

      (That address is a URL, not an email address…)
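The Windows behavior Corey describes can be handled with a small guard. The following is a minimal sketch, not the code from stan.R: the helper names `safe_cores` and `run_chains` are hypothetical, and the squaring function stands in for whatever work each chain would do.

```r
library(parallel)

# Hypothetical helper: mclapply() throws an error on Windows when
# mc.cores > 1, so fall back to a single core there.
safe_cores <- function(requested) {
  if (.Platform$OS.type == "windows") 1L else as.integer(requested)
}

# Hypothetical helper: run f once per chain id, in parallel where possible.
run_chains <- function(n_chains, f, cores = 2L) {
  mclapply(seq_len(n_chains), f, mc.cores = safe_cores(cores))
}

# Stand-in workload: each "chain" just squares its id.
results <- run_chains(4, function(chain_id) chain_id^2)
```

On Windows this runs serially, since mclapply with one core behaves like lapply; real parallelism there would need a different mechanism, such as the cluster functions in the same package.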

  3. gwern says:

    If it’s so useful, why isn’t Stan doing this by default? Could use https://stat.ethz.ch/R-manual/R-devel/library/parallel/html/detectCores.html to detect at runtime the amount of parallelism available.

    • Andrew says:

      Gwern:

      Stan is doing it by default! You just source that code to set up the stan() function. The only issue is it can’t be inside the rstan package because of CRAN restrictions.

      • David J. Harris says:

        What aspect of this would violate CRAN policies? I’ve seen lots of packages that “suggest” the `parallel` package and only use it if available.

        Are the CRAN violations Windows-specific? If not, could the parallel functionality still be added for Mac and Linux users?

        Thanks
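Gwern's suggestion of detecting the available cores at runtime could look roughly like this. It is a sketch under assumptions: leaving one core free is a choice made for this example, not something the post or stan.R specifies.

```r
library(parallel)

# detectCores() can return NA when the core count cannot be determined.
n_cores <- detectCores()
if (is.na(n_cores)) n_cores <- 1L

# Assumption for this sketch: leave one core free, but never go below one.
n_workers <- max(1L, n_cores - 1L)

# mc.cores is the option that mclapply() reads for its default core count.
options(mc.cores = n_workers)
```

Setting the option only changes the default. On Windows, calling mclapply with more than one core still fails, as noted in the earlier comment.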

  4. I’d like to point out that this is a TERRIBLE way to get the functionality. Specifically, there could be anything in that .R file on the server, so for example it might contain code to maliciously delete everything in your home directory, or whatever.

    Even if today there’s nothing wrong with the code, tomorrow some script-kiddie could find a vulnerability and replace that code on the server with their own malicious code that deletes files or collects personal data.

    Never source something from a URL. Go and get the contents, put them in your own directory, verify that they seem reasonable, and then source your local copy!

    • Rahul says:

      It is interesting that R actually allows a source cmd to execute with a remote file over the net.

      I wonder if a Linux shell’s source command will source a remote script.sh over the net. I’ve never tried it.

      • martin says:

        Yes, the shell will allow one to do all sorts of things. I’ve seen people using curl to fetch some script and pipe it directly to sh as a means of running installers. Not recommended.
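The download, inspect, then source workflow recommended in this thread can be sketched in R as follows. The helper `source_verified` is hypothetical, and the idea of recording a checksum after reviewing the file is an addition for illustration: it guards against the local copy changing between review and later use.

```r
# Hypothetical helper: source a local file only if its MD5 checksum
# matches the one recorded when the file was reviewed.
source_verified <- function(path, expected_md5) {
  actual <- unname(tools::md5sum(path))  # NA if the file does not exist
  if (!identical(actual, expected_md5)) {
    stop("Checksum mismatch; refusing to source ", path)
  }
  source(path)
  invisible(TRUE)
}

# Typical use (not run here, since it needs network access):
#   download.file("http://mc-stan.org/rstan/stan.R", destfile = "stan.R")
#   file.show("stan.R")                  # read the code before trusting it
#   reviewed <- unname(tools::md5sum("stan.R"))
#   source_verified("stan.R", reviewed)
```

MD5 is used here only because it ships with base R. It detects accidental or casual changes to the reviewed copy, but it is not a strong defense against a determined attacker.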

  5. Ben Goodrich says:

    The parallel thing doesn’t violate CRAN restrictions; the other part of that function which silently writes the compiled model to the disk does. However, we have been given permission to write if the user specifies a non-default value for one of the options(). We might implement some options this week to facilitate doing chains in parallel too, but the logic of that function is essentially only right for a person working on a multicore laptop with lots of RAM relative to the model, which happens to be the only way Andrew uses Stan. To force it to always run that way would make it not work on clusters or for people with little RAM relative to the model.
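For reference, later rstan releases expose both behaviors as opt-in settings, which matches the approach Ben describes. The sketch below assumes a two-core machine, so the value 2 is illustrative; the rstan call is guarded because rstan may not be installed.

```r
# Run chains in parallel by default. The value 2 is an assumption for this
# sketch; choose it to suit the machine's cores and available RAM.
options(mc.cores = 2)

# Guard the rstan-specific call so the snippet also runs without rstan.
if (requireNamespace("rstan", quietly = TRUE)) {
  # Opt in to saving compiled models to disk, so an unchanged model
  # is not recompiled on the next run.
  rstan::rstan_options(auto_write = TRUE)
}
```

Because both settings are explicit, users on clusters or with little RAM relative to the model can simply leave them off.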

  6. Maciej says:

    Did you remove the code from the server? The link redirects to the Stan webpage.

  7. Chris says:

    The link appears to have been removed since the website got a makeover?
    Any idea where I can find this again?
