Rant on R indentation: why 4 instead of 2?

Posted on January 3, 2008 12:57 AM by Andrew

Why does R indent 4 spaces with its functions? Indenting 2 is much easier to read. I find functions in R packages to be really hard to read because they space things out so far. Compare this (from R2WinBUGS):

        varpostvar <- max(0, (((n - 1)^2) * varW + (1 + 1/m)^2 *
            varB + 2 * (n - 1) * (1 + 1/m) * covWB)/n^2)

to the line from the original function before it got put into a package:

    varpostvar <- (((n-1)^2)*varW + (1+1/m)^2*varB + 2*(n-1)*(1+1/m)*covWB)/n^2

Here's another. From R2WinBUGS:

        covWB <- (n/m) * (var(s2, xdot^2) - 2 * muhat * var(s2,
            xdot))

From the original function:

    covWB <- (n/m)*(cov(s2,xdot^2) - 2*muhat*cov(s2,xdot))

Is all that spacing really necessary??? It gets kinda ridiculous when single-line statements get broken into two lines.

Which reminds me . . .

When I write if() statements or loops, I always put in the brackets {}, even if there’s only one line inside the condition. It makes the function easier to follow, and it also helps avoid errors if I change the function later. Why is the convention in R packages to not include the {}? I could see this being an issue inside nested loops where speed is a concern, but I see it everywhere.
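
Here is a minimal sketch of the kind of error the brackets guard against. The function names are made up for illustration. In the first version a second statement was added under the if() later on; the indentation suggests it belongs to the condition, but without {} only the first statement does.

```r
# Hypothetical example: reset a negative value to 0 and report whether it was reset.

# Without brackets: only `x <- 0` belongs to the if().
reset_nobrace <- function(x) {
  was_reset <- FALSE
  if (x < 0)
    x <- 0
    was_reset <- TRUE   # indented as if conditional, but it always runs
  was_reset
}

# With brackets: both statements are inside the condition.
reset_brace <- function(x) {
  was_reset <- FALSE
  if (x < 0) {
    x <- 0
    was_reset <- TRUE
  }
  was_reset
}

reset_nobrace(5)  # TRUE, even though nothing was reset
reset_brace(5)    # FALSE
```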

11 thoughts on “Rant on R indentation: why 4 instead of 2?”

Hadley on January 2, 2008 at 8:38 pm said:

You can indent R code with as many spaces as you like, and whether or not you use {} with if has no speed implications. Where are you having these problems?

Tony Rossini on January 2, 2008 at 11:10 pm said:

There are alternative approaches for getting what you want; use them. For example, work through an editor, and only use an editor. Obviously that is the ESS way, though other tools can be made to work with some help. Or rewrite the pretty-printer to provide output the way you want. It's open source!

Josh on January 3, 2008 at 2:29 am said:

Speaking for general coding style in other languages, and maybe from only my own personal experience, 4 spaces is probably the most common default for indentation. It's the default in many editors. I don't know how R decides when to break up a line, but some conventions recommend not going past 80 characters per line because of the old 80×24 terminal days. Not really necessary nowadays though.

I personally used to use 2, but switched to 4 just because "everyone" uses it and it made it easier for me to collaborate on other projects. As for the {}, I also prefer to always put them in.

Gregor Gorjanc on January 3, 2008 at 4:45 am said:

I agree with you that 2 spaces are more than fine, and I also dislike the way R prints out the code: too many spaces. Since there are people who are willing to start a fight about how many spaces is the best choice for indentation, I really do not care much, and I do try to be consistent with my code. I also use {} as much as possible, but not always. I also look at if and for as functions, therefore I use if() and for() instead of if () and for ().

Andrew on January 3, 2008 at 6:53 am said:

Hadley,

I write my code in emacs and indent 2 spaces. But when my functions get converted to R packages, they end up with 4 spaces. And existing R functions (which I sometimes end up editing) get indented with 4 spaces. I find these 4-space-indented functions much harder to read and to work with.

Tony,

I don't know what the pretty-printer is so I'm not in a position to rewrite it!

mjm on January 3, 2008 at 8:24 am said:

To follow up on Tony's point, R shouldn't be doing anything to the code when it's packaged. I have plenty of pkg source that is as the authors wrote it (which often but not always has the delicate fingerprints of emacs on it).

I'm now wondering why the pretty-printer isn't the same as that in emacs. It seems like the mac gui at least is moving in that sort of direction with its fancy/irritating overzealous paren and quote doubling.

What you can do though is check out the "print" options. In particular "keep.source" and "keep.source.pkgs" seem like useful things to turn on to keep the internal not-so-pretty printer from doing ugly things.

Peter on January 3, 2008 at 8:26 am said:

Personally, I like the 4 space indenting, and I like to spread my functions and programs over a lot of lines. I might even use more lines than the package examples that you post.

Seems like a matter of taste, and what you're used to…and maybe visual acuity and font size!

Peter

mjm on January 3, 2008 at 8:31 am said:

Whoever wrote the pretty printer also came down on the side of inline block openings (which I like but apparently half the world doesn't):

    for (iterator in sequence) {
      blah
    }

not

    for (iterator in sequence)
    {
      blah
    }

I couldn't find it quickly for ess-mode, but see the emacs cc-mode manual.

Martyn on January 3, 2008 at 11:03 am said:

You can preserve the original formatting of your functions in an R package, including comments. But you must disable lazy loading. Put the line "LazyLoad: no" in the DESCRIPTION file.

If a user sets the option "keep.source.pkgs" to TRUE before loading the package, then all the functions will have an additional attribute called "source". This is a character vector containing the original source code, which will be used when you print or edit the function.

Hadley on January 3, 2008 at 11:44 am said:

I'm not sure what you mean by "converted to R packages" – do you mean when you view the source of the function from the command line?

The reason they look different is that most packages don't store the text of the function, but the parsed R expression. This makes packages smaller and faster to load. If, while debugging your package, you want to see the original code, you need to set LazyLoad: false in DESCRIPTION and then execute options(keep.source.pkgs = TRUE) before loading your package.
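
The difference between stored text and a parsed expression can be seen outside of any package. The following is only a rough sketch: the names src, f_keep and f_drop are made up, and it assumes a reasonably recent R, where the kept source is exposed through a "srcref" attribute. It parses the same function text twice, once keeping the source and once discarding it.

```r
# Source text of a small function, written with 2-space indentation
# and no spaces around the operators.
src <- "function(x) {\n  if (x > 0) {\n    x*2\n  }\n}"

# Same text, parsed with and without keeping the original source.
f_keep <- eval(parse(text = src, keep.source = TRUE))
f_drop <- eval(parse(text = src, keep.source = FALSE))

# With the source kept, the original layout is still available.
writeLines(as.character(attr(f_keep, "srcref")))

# Without it, the text has to be regenerated from the parsed expression,
# which is where the 4-space steps and extra spacing come from.
writeLines(deparse(f_drop))
```

Both functions behave identically; only the text you get back differs.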

tony rossini on January 5, 2008 at 1:54 am said:

With respect to the points I made, a bit of clarity: the pretty-printer I'm referring to is, as others mentioned, the function that R uses to print back code stored in itself, which is where Andrew is having his problems. The only real solutions are to ignore the response (i.e. the results coming back from R), to reformat the response (there are tools in ESS to clean up the output), or to treat the source code as primary, ignoring anything fed into R as munched-up garbage not suitable for reuse. Andrew rightly points out that using other people's code means getting it not in the way they wrote it, unless you tear into source packages. But that is where a reformatter can be useful. M-x ess-MM-fix-source is one way to do that. Hadley suggested another as well.
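
For what it's worth, a reformatter along these lines can also be sketched in R itself. This is only an illustration: reindent2 is a made-up helper, not something that ships with R or ESS. It asks deparse() for long lines (width.cutoff goes up to 500) so that short statements are not split in two, then halves the leading spaces on every line to turn 4-space steps into 2-space steps. Comments from the original source are still lost, since they are not part of the parsed expression.

```r
# Hypothetical helper: print back a function with 2-space indentation.
reindent2 <- function(fun, width = 500L) {
  # Regenerate the code text, allowing long lines before wrapping.
  lines <- deparse(fun, width.cutoff = width)
  # Count the leading spaces on each line.
  lead <- attr(regexpr("^ *", lines), "match.length")
  # Rebuild each line with half as many leading spaces.
  paste0(strrep(" ", lead %/% 2L), substring(lines, lead + 1L))
}

g <- function(x) { for (i in x) { if (i > 0) print(i) } }
writeLines(reindent2(g))
```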
