Exposure to Stan has changed my defaults: a non-haiku

Posted on February 24, 2017 10:24 AM by Andrew

Now when I look at my old R code, it looks really weird because there are no semicolons
Each line of code just looks incomplete
As if I were writing my sentences like this
Whassup with that, huh
Also can I please no longer do <-
I much prefer =
Please

28 thoughts on “Exposure to Stan has changed my defaults: a non-haiku”

Gregor Thomas on February 24, 2017 12:01 PM at 12:01 pm said:

Using = for assignment in R works just fine! I switched to = a few years ago and I love it.

See [Assignment Operators in R on Stack Overflow](http://stackoverflow.com/q/1741820/903061) for some details.

- Andrew on February 24, 2017 12:17 PM at 12:17 pm said:
  
  Gregor:
  
  Using = does work, but not always. I’ve been told that it occasionally fails, so I’m loath to use = in the code I use in our books, because I fear that it might mess up if people try to include such code in functions or whatever.
  
  - Gregor Thomas on February 24, 2017 1:15 PM at 1:15 pm said:
    
    The only case where it fails is when you’re trying to do assignment *inside* a function call – which is generally not done. For example, with <-, it's possible to take the mean of 1:10 and assign 1:10 to x all at once:
    
    z <- mean(x <- 1:10) ## this will create both z and x in the global environment, z is 5.5, x is 1:10
    
    z = mean(x = 11:20) ## this will create (or modify) z in the global environment, but not do anything to x
    
    I've never wanted to assign a variable inside a function call, so I don't miss that functionality. I actually see it as a benefit of using =. If I'm turning a script into a function and my assignments in the script were done with =, I can copy/paste them into the function arguments. If I used <- in the script and did the same copy/paste, the function might still work, but it would have the unintended consequence of making all those assignments externally.
    
    There can also be issues of precedence (associativity) if you mix and match = with <- in a compound assignment… `x <- y = 5` is different from `x <- y <- 5`, but `x <- y <- 5` is the same as `x = y = 5`. As long as you don't mix and match there is no issue. (And how often do you use compound assignment?)
    
    If you follow two easy good practices: (1) don't assign things inside function calls, (2) don't do compound assignment (at least not with different assignment operators), there won't be any other issues.
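
    A minimal sketch of the cases above, for anyone who wants to check them. It assumes a hypothetical helper, `names_created`, that is not part of base R; it runs a snippet in a fresh environment and lists the variables the snippet left behind.

    ```r
    # Evaluate a snippet in a fresh environment and report which names it created.
    # `names_created` is a hypothetical helper for this illustration only.
    names_created <- function(code) {
      env <- new.env()
      eval(parse(text = code), envir = env)
      sort(ls(env))
    }

    # <- inside the call is a real assignment, so both x and z are created.
    arrow_names <- names_created("z <- mean(x <- 1:10)")

    # = inside the call is argument matching, so only z is created.
    equals_names <- names_created("z = mean(x = 11:20)")

    # Mixing operators: x <- y = 5 is read as (x <- y) = 5, which fails.
    mixed_fails <- inherits(
      tryCatch(names_created("x <- y = 5"), error = function(e) e),
      "error"
    )

    print(arrow_names)
    print(equals_names)
    print(mixed_fails)
    ```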
    
    - Jonah on February 24, 2017 6:10 PM at 6:10 pm said:
      
      It’s true that there are not so many cases in which it matters whether you use = or <- for assignment. I guess the biggest problem with = for assignment is not really a problem with using = for assignment but just that it results in R code that really does look foreign to most other R users, especially R users who aren't already well versed in other programming languages. For better or worse (probably worse) most R users are going to be using <- for the foreseeable future. So in a textbook, at least, I think it makes sense to go with the standard.
      
      I suppose another reason some people prefer <- is that = is used for other things like case statements and argument binding, so some people might prefer to not overload it further by using it for assignment too. Personally I like = (and I'm glad we made the switch from <- to = for Stan), but I continue to use <- in R code, at least for now.
    - Gregor Thomas on February 25, 2017 2:31 AM at 2:31 am said:
      
      I agree that in a textbook setting going with the far more common <- makes sense. But on blogs or on Stack Overflow I use = and hope that some users will see it and like it and follow suit.
    - Wayne Folta on February 25, 2017 8:29 AM at 8:29 am said:
      
      Actually, it’s fairly common to use assignment in an R function call in a perfectly legitimate (and unavoidable) way. For example, say I have
      
      `foo <- brm (y ~ x1 + x2 + (1 | x3) + (1 | x4), data=bar)`
      
      and I decide I want to time it. I simply do:
      
      `system.time (foo <- brm (y ~ x1 + x2 + (1 | x3) + (1 | x4), data=bar))`
      
      and R is not confused by what I'm trying to do. If you use "=" instead, system.time could think you're trying to pass the parameter "foo", which it hopefully does not have.
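
      A small sketch of that failure mode. `slow_fit()` is a hypothetical stand-in for a slow call such as `brm()`, so the example runs without brms installed.

      ```r
      # slow_fit() is a hypothetical stand-in for a slow model fit such as brm().
      slow_fit <- function() {
        Sys.sleep(0.05)
        42
      }

      # With <-, the whole assignment is the expression being timed,
      # and foo is created in the calling environment.
      timing <- system.time(foo <- slow_fit())

      # With =, R reads bar as an argument name. system.time() has no
      # such argument, so the call fails; the error message is captured here.
      equals_result <- tryCatch(
        system.time(bar = slow_fit()),
        error = function(e) conditionMessage(e)
      )

      print(foo)
      print(equals_result)
      ```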
      
      There are three distinct concepts here: equality testing, named parameters, and assignment. R chooses to use "==", "=", and "<-", while other languages make other choices. For example, languages in the Algol family used ":=" for assignment.
      
      R is very powerful in that functions, formulas, etc, are first-class objects so distinctions have to be made where lesser languages don't have to. (Or the language may have made other choices.)
      
      This argument sounds a lot like: "This whole mean, median, mode thing is tedious. Since I hate to use two syllables where one will do, from now on I'm going to use 'mean' in all three cases and let context distinguish."
    - Paweł Piątkowski on February 25, 2017 9:00 AM at 9:00 am said:
      
      Simply enclose the expression in (curly or round) brackets, and you’ll be perfectly fine:
      
      system.time({foo = brm (y ~ x1 + x2 + (1 | x3) + (1 | x4), data = bar)})
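
      A quick way to try both bracket styles without fitting a model. `slow_fit()` is again a hypothetical stand-in for `brm()`, and the variable names are made up for the illustration.

      ```r
      # slow_fit() is a hypothetical stand-in for a slow model fit such as brm().
      slow_fit <- function() {
        Sys.sleep(0.05)
        42
      }

      # Curly brackets: the = inside the block is an ordinary assignment.
      t_curly <- system.time({ foo_curly = slow_fit() })

      # Round brackets: the parenthesised expression is also an assignment.
      t_round <- system.time((foo_round = slow_fit()))

      print(foo_curly)
      print(foo_round)
      ```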
      
      Other than that, there are *no* differences between these two. Trust me :-)
      You should be consistent, though – not only because it’s encouraged by all coding styles, but because the two operators have (for some reason) different operator precedence. This will work:
      
      x = y = 5
      
      This too:
      
      x <- y <- 5
      
      And this:
      
      x = y <- 5
      
      But not this one (!):
      
      x <- y = 5
    - Paweł Piątkowski on February 25, 2017 9:11 AM at 9:11 am said:
      
      Ouch, I didn’t notice Gregor’s comment on operator precedence. Unknowingly, not only did I demonstrate the same concept, but also used the same variables and values :-)
      Sorry!
- Ian Fellows on February 24, 2017 2:44 PM at 2:44 pm said:
  
  You can use =, but I would encourage you not to. Virtually all coders use <- and all style guides recommend <-. When I see = used for assignment in R code, it usually indicates a novice coder or someone who has just transitioned from another language.
  
  That said, there is an alternate more rational universe where <- doesn't exist and we all use =.
  
  - Andrew on February 24, 2017 3:12 PM at 3:12 pm said:
    
    Ian:
    
    +1 on your second paragraph.
    
  - Paul on February 24, 2017 5:58 PM at 5:58 pm said:
    
    Well, I've used R on a daily basis for nearly ten years now, so I would guess I’m not a novice anymore. But I still prefer to use “=” as an assignment operator; the code looks much cleaner and more intuitive this way.
    
    - Ian Fellows on February 26, 2017 7:11 PM at 7:11 pm said:
      
      @paul I in no way want to imply that any particular person is not skilled based on their stylistic choices. It is just something that I tend to notice reading lots of people's code. As coders gain experience, they tend to work with many people's code in collaboration. Since almost everyone else uses <-, having one file or one subsystem use = is very jarring when you are managing a large codebase.
CuriousGeorge on February 24, 2017 2:48 PM at 2:48 pm said:

Using “=” as an assignment operator doesn’t make sense. “” are more sensible. The problem with “<-” is that you might naturally type “x<-3” with the intention of having it mean “x is less than negative three”. An assignment operator shouldn't depend on white space.
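
A short sketch of the whitespace issue described here; the variable names are made up for the illustration.

```r
x <- 10
x<-3                 # no spaces: read as x <- 3, so x is overwritten with 3

y <- 10
is_less <- y < -3    # with a space: a comparison, y is left unchanged

print(x)
print(is_less)
```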

- CuriousGeorge on February 24, 2017 2:49 PM at 2:49 pm said:
  
  That statement in the quotes was supposed to have indicated the -> and <- operators, but it looks like they got dissolved by the commenting system.
  
anon on February 25, 2017 5:21 AM at 5:21 am said:

Assignment isn’t a symmetric relation. Unlike “=”, “<-” makes that immediately clear.

- David P on February 25, 2017 7:23 AM at 7:23 am said:
  
  The unsymmetric use of “=” as assignment is no less clear in R than in any other programming language. I switched over a few years ago and haven’t had any problem with it.
  
jrkrideau on February 25, 2017 12:03 PM at 12:03 pm said:

I am a <- person. I find it makes the code clearer and much easier to read.

Elio on February 25, 2017 1:24 PM at 1:24 pm said:

I use <- basically because I learned that way and now it's just tradition. I like the way the code looks better (again, 100% because of tradition), even though it's kind of a pain in the ass to type (even worse now that I bought a new laptop with a different keyboard layout!).
I would love to switch to =, but I just find it ugly. Damn me!

- jrkrideau on February 26, 2017 5:44 PM at 5:44 pm said:
  
  I simply use Autokey and assign a code to it. At the moment I use / + aa to get <-. It's actually as easy as typing =, which is a long reach, and I find the difference between <- and = worth maintaining. I still maintain it makes for clearer code. Thousands will disagree.
  
  I started out with Fortran so I am used to = as an assignment statement. Given R syntax I prefer <- .
  
Max on February 25, 2017 9:00 PM at 9:00 pm said:

Sometimes I use “->”. Please don’t judge me.

- jrkrideau on February 26, 2017 5:59 PM at 5:59 pm said:
  
  Heretic! The Inquisition has been notified.
  
  RECANT, RECANT! It is not too late.
  
  I have never used -> but I think I have seen it. One does, sometimes, wonder about the people who wrote R.
  
  - Ian Fellows on February 26, 2017 7:59 PM at 7:59 pm said:
    
    It helps to remember how old S (err, R) is. It is a marvel how modern it is given its roots in the 1970s. Just thank the lord that the creators of R, in their infinite wisdom, abandoned S’s blasphemy of dynamic scope.
    
A saucy young trollop on February 26, 2017 5:50 AM at 5:50 am said:

for(int i = 0; i < —

ah, I'm writing R, dang.

public static double informationGain(doubl—

feck, I should be writing R!

*waiting for a non-vectorizable for-loop to be evaluated*

Yes, I'm writing R.

Bob Carpenter on February 26, 2017 4:19 PM at 4:19 pm said:

Yup. Turns out punctuation matters for ease of reading. Andrew originally mocked Stan’s syntax, calling it “BUGS with semicolons”.

The secondary motivation for semicolons is that it renders the language whitespace insensitive in the sense that wherever one whitespace can occur it can be one or more of any type of whitespace (tab, return, line feed, space). R, in contrast, is sensitive to the distinction between a newline and an ordinary space character. For example, this script
```
a <- b -
     c
```
will assign b - c to a, whereas
```
a <- b
     - c
```
will assign b to a and return -c.
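
One way to see the difference is to ask R's parser how many expressions each script contains. This is only a sketch; the values given to `b` and `c` are made up for the illustration.

```r
# Parse the two scripts without running them.
trailing <- parse(text = "a <- b -\n     c")   # operator ends the first line
leading  <- parse(text = "a <- b\n     - c")   # operator starts the second line

print(length(trailing))   # number of expressions in the first script
print(length(leading))    # number of expressions in the second script

# Made-up values so the scripts can be evaluated.
b <- 10
c <- 3

eval(trailing)
a_trailing <- a               # value of a after the first script

last_value <- eval(leading)   # eval() returns the value of the last expression
a_leading <- a                # value of a after the second script

print(a_trailing)
print(a_leading)
print(last_value)
```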

Now mathematics typesetting standards always put the operator first in a line of text when continuing a line because it makes it way easier to scan the structure of a formula (such as a sequence of sums).
- Ian Fellows on February 26, 2017 6:58 PM at 6:58 pm said:
  
  I like the focus on the fundamental principle of whitespace insensitivity. If you are going to have it, you need end-of-line markers. Being whitespace insensitive also means that writing code into string literals for execution, something done with Stan quite a lot, is less prone to error.
  
  Aesthetically, I like the other extreme of full whitespace utilization. I think the following looks nice, for instance:
  
  transformed data:
  real y[5]
  y[1] = 2.0
  y[2] = 1.0
  y[3] = -0.5
  y[4] = 3.0
  y[5] = 0.25
  
  parameters:
  real mu
  
  model:
  for n in 1:5:
  y[n] ~ normal(mu,1.0)
  
  - Ian Fellows on February 26, 2017 7:00 PM at 7:00 pm said:
    
    And of course the blog mangled my pythonic indentation, proving the point about passing code around as strings. :)
    
jrkrideau on February 26, 2017 6:05 PM at 6:05 pm said:

I had to run the code through R before I realized how obvious this was. I felt like a fool.

Steve Davenport on February 26, 2017 8:55 PM at 8:55 pm said:

Would someone like to make the fuller version of the argument that I should begin to use semicolons to stack several “independent clauses” of R code onto the same line, rather than the standard one-line-per-code-clause? I’m interested in hearing it and willing to switch, but generally I’ve worried that using semicolons would lead me to overlook certain lines of code, and that it would prevent me from easily moving code up or down as needed.


Statistical Modeling, Causal Inference, and Social Science
