Return-Path: X-Original-To: apmail-couchdb-dev-archive@www.apache.org Delivered-To: apmail-couchdb-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id BD30A103E8 for ; Fri, 4 Apr 2014 06:56:36 +0000 (UTC) Received: (qmail 86572 invoked by uid 500); 4 Apr 2014 06:56:35 -0000 Delivered-To: apmail-couchdb-dev-archive@couchdb.apache.org Received: (qmail 86151 invoked by uid 500); 4 Apr 2014 06:56:33 -0000 Mailing-List: contact dev-help@couchdb.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@couchdb.apache.org Delivered-To: mailing list dev@couchdb.apache.org Received: (qmail 86124 invoked by uid 99); 4 Apr 2014 06:56:30 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 04 Apr 2014 06:56:30 +0000 X-ASF-Spam-Status: No, hits=2.2 required=5.0 tests=HTML_MESSAGE,RCVD_IN_DNSWL_NONE,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of bchesneau@gmail.com designates 209.85.192.49 as permitted sender) Received: from [209.85.192.49] (HELO mail-qg0-f49.google.com) (209.85.192.49) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 04 Apr 2014 06:56:25 +0000 Received: by mail-qg0-f49.google.com with SMTP id j107so1521696qga.8 for ; Thu, 03 Apr 2014 23:56:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :content-type; bh=foTh0XPA8wZRS268EjpUm5udkah/f6JeVJFYgviXvuI=; b=WtDc4j7eYZv1NdVeu9HLmvswvHdfPANp5wn9Abv2oXMIy8yPa0zCu+5W+3C7Fm1fK4 xYAb3MoEyisdSvSivPcMJKhJyjbt7lGLSoQVf0my2GMuwdKJE8xZjHQKO0S2bV3GT5VW nP1VBkT8y8f+gMwcz6/a+I/elXCcRN0ONQ7t70mO9eWcuBaZdhxPt7CYFKepf5L0qAoj MuL6STfuCG7pHTHyilU3VDetj5sFGam3CgJ5yRxwWUzdTpevr20njoCbA0KFkrrqX0h4 6KoYMjUK0uvZwE/lDdjwMPK8/tTyrtdXGTwpj9erWa9th1AAWhsz7+GRDNervpbFdvCK gWKA== X-Received: by 10.140.105.118 with SMTP id b109mr440988qgf.28.1396594563086; Thu, 03 Apr 2014 23:56:03 -0700 (PDT) MIME-Version: 1.0 Received: by 10.96.30.230 with HTTP; Thu, 3 Apr 2014 23:55:43 -0700 (PDT) In-Reply-To: References: <22926798.38.1396589070168.JavaMail.Joan@RITA> From: Benoit Chesneau Date: Fri, 4 Apr 2014 08:55:43 +0200 Message-ID: Subject: Re: [DISCUSS] Erlang whitespace standards (was: [POLL]) To: "dev@couchdb.apache.org" Content-Type: multipart/alternative; boundary=001a1137d4a4d3ffae04f632025f X-Virus-Checked: Checked by ClamAV on apache.org --001a1137d4a4d3ffae04f632025f Content-Type: text/plain; charset=ISO-8859-1 +1 for a format tool like gofmt. Other than that we may want to provides vim and erlang syntax (probably others if you,re using other). But having a tool like gofmt is definitely the best, customising its software is a lot of time, not sure I want to do that job / product / language... - benoit On Fri, Apr 4, 2014 at 8:42 AM, Paul Davis wrote: > Quick glance, and my first thought was that maybe we should link to > gists for each example. Gmail at least is screwing up the examples. A > few more minor points inline. > > On Fri, Apr 4, 2014 at 12:24 AM, Joan Touzet wrote: > > Hi everyone, > > > > Time to summarize the results. You can view the results at > > > > > https://docs.google.com/forms/d/1b7KcQGgNbSCZVRwLjrUl5Z6C2TBx8X1btlU5fwrNHpg/viewanalytics > > > > but I've included them in this email for ease of review. > > > > I'm going to break this up into sections and make some PROPOSALs. I'd > > like to get general consensus on this vs. a "lazy" approach. I don't > > see this as something where need a unanimous vote but I'd like to get us > > all agree on something we can live with. > > > > As for getting this into the code base - let's not endanger the big > > merges, but once we have finished them, we should move to these > > standards piecemeal as we rework each file, as Noah and Jan suggest, > > unless someone wants to do the busy work and re-indent everything. > > Hopefully, even with the wait for the merges, this means the standard > > can be "live" before the end of 2014 ;) > > > > I don't cover all topics in here - please feel free to follow the post's > > format and add additional proposals in follow-ups. > > > > Finally, if I say something you disagree with or if I have misinterpreted > > your response, speak up - it was not intentional! > > > > -Joan > > > > ----- > > > > TERMINOLOGY USED: > > * "X space indent" means X spaces from the LEFT MARGIN. > > It is the ABSOLUTE number of columns of whitespace on a line. > > > > * "Y space standard" means indentations should be multiples > > of Y spaces. > > > > * "Z level indent" means Z*Y=X absolute spaces for the indent. > > For a 4-space standard, a 2 level indent would mean an 8 space > > indent. > > > > ----- > > > > STANDARD: Agree on a 4-space standard for horiz. indentation. Most of > > the respondents seem to be comfortable with this, likely due to the > > prevalence of the Python / Ruby / JS 4-space standard. > > > > PROPOSAL: "Indent your code blocks with 4 spaces. Never use tabs or a > > mix of tabs and spaces. When additional indentation levels are needed, > > always increment by a multiple of 4 spaces." > > > > This sets us up to be able to have the same spacing standard across JS, > > C and other languages we may someday ship. > > > > ----- > > > > LINE LENGTH: 11 votes for 80, 6 votes for 132, 1 for 76. > > > > PROPOSAL: "Maximum line length is 80 characters, with a preference for > > 76 characters or less. Exception: URLs in comments" > > > > ----- > > > > CASE STATEMENT INDENTATION: 16 in favour of this format, 3 opposed: > > > > get_ini_files(Default) -> > > case init:get_argument(couch_ini) of > > error -> > > Default; > > {ok, [[]]} -> > > Default; > > {ok, [Values]} -> > > Values > > end. > > > > This format matches Erlang documentation and is fairly canonical. > > > > PROPOSAL: "Indent case pattern clauses 1 level, and each case pattern > > body 2 levels from the initial case statement." > > > > ----- > > > > CASE STATEMENT ONE-LINERS: 11 in favour, 8 opposed: > > > > case {Name, Pass} of > > {"Jan Lehnardt", "apple"} -> ok; > > ... > > > > The only write-in for this suggested that one-liners needed to fit on a > > single line "without looking terrible." > > > > PROPOSAL: "Generally, case pattern bodies should always start on a new > > line from their corresponding case pattern clause. However, you can put > > the clause and body on the same line if the entire statement fits on one > > line." > > > > This is a tough one because it directly contradicts the previous > > proposal. If people feel strongly I am OK to be more strict and remove > > "Generally, " and the second sentence from this proposal. > > > > I know I had a write in for this one that was "only if all clauses fit > on a single line". Not sure if you paraphrased that as "without > looking terrible" or not. But if I write three clauses as one-liners > and the fourth doesn't fit (or even if I feel like its just too long) > then I go back and undo each one so that the case bodies are > consistenly indented for the given case clause. > > > ----- > > > > LONG FUNCTION CLAUSE: > > > > 7 for paren aligned > > 4 for 2-space indented > > 5 for 8-space indented > > 1 for "2 space, but no arguments on the initial line, with > > the closing } on its own line" > > 1 for "4-space indented" > > 1 for "one tab" > > > > As a reminder, here is the code, paren aligned: > > > > possibly_embed_doc(#collector{db_name=DbName, query_args=Args), > > #view_row{key=_Key, id=_Id, value=Value, > doc=_Doc}=Row) -> > > > > And 8-space aligned: > > > > possibly_embed_doc( > > #collector{db_name=DbName, query_args=Args), > > #view_row{key=_Key, id=_Id, value=Value, doc=_Doc}=Row) -> > > > > > > Ideology here and on the list is split roughly into 2 camps: > > > > * Z-level indent of a multiple of 4 spaces. As the body of the > > function will start at 4 spaces, I am going to recommend > > against 1-level and say a 2-level (8 space) indent is the > > option here. > > > > * Emacs/paren indentation mode. I believe the big arguments for > > this mode is "it's what my editor does" and "it's common in > > strongly typed languages." If you feel differently, please > > speak up. On the other side, Paul feels strongly about not > > adopting this model; Wendall supports it and Bob N. says he > > can 'retrain himself' to use it. Notice also that, in this > > example, the second line ends on col. 78. Even if the -> was > > wrapped to the next line, the line still ends on col. 75. > > > > That's an interesting observation in the strongly typed vs. not > languages. I can't say I've ever noticed a preference based on the > type system. > > I will say that I don't put much stock into the "that's what my editor > does" argument. If you're editor is so fancy that it does syntatical > parsing to adjust indentation widths dynamically then surely its fancy > enough to have an off switch for that behavior. One of the best > reasons for using a Z*Y=X approach is that it fits most common > languages these days with at most a modification of Y based on file > extension. This coupled with file mode detection for when when indent > level changes leads to a fairly easy to use editor without making any > particularly language ridiculously different than another which helps > for people like me that read and write a number of different languages > with decent regularity. > > > Tough call here. Based on similarity with other popular languages of our > > day I'm going to initially propose the first option and let anyone who > > strongly opposes speak up now. There was no strong statement > > about whether the ) or -> should be on its own line, so I'll leave > > that part of the proposal vague for now. > > > > PROPOSAL: "Function definitions should align wrapped elements using a > > 2-level hanging indent. There should be no arguments on the first line. > > The closing parenthesis or arrow may be placed on its own line if > > desired, but if so, it should be indented the same number of spaces as > > the function definition itself." **but see below** > > > > ----- > > > > LONG FUNCTION CALL: > > > > 7 for paren-aligned > > 7 for 4-space indent > > 3 for 8-space indent > > 1 for "rework the code, or 4-space indent" > > 1 for "2 space, but no arguments on the initial line, with > > the closing } on its own line" > > > > As a reminder, here is the code, paren-aligned: > > > > [_A, _B, _Cs] = re:split(?b2l(AuthSession), ":", > > [{return, list}, {parts, 3}]), > > > > And 8-space aligned: > > > > [_A, _B, _Cs] = re:split(?b2l(AuthSession), ":", > > [{return, list}, {parts, 3}]), > > > > The more I looked at this topic, the more it looked like the last one, > > but even more space constrained because of the existing indent of the > > call itself. As such I'm going to roll it into the previous proposal: > > > > REVISED PROPOSAL: "Function definitions *and calls* should align wrapped > > elements using a 2-level hanging indent. There should be no arguments on > > the first line. The closing parenthesis or arrow may be placed on its > > own line if desired, but if so, it should be indented the same number of > > spaces as the function definition or call itself." > > > > That means these would be acceptable: > > > > [_A, _B, _Cs] = re:split(?b2l(AuthSession), ":", > > [{return, list}, {parts, 3}]), > > > > [_A, _B, _Cs] = re:split(?b2l(AuthSession), ":", > > [{return, list}, {parts, 3}] > > ), > > > > ----- > > > > LONG LIST WRAPPING: > > > > 4 for 8-space indent > > 3 for "aligned with nested structure in previous line" > > 5 for "single character indent" > > 9 for "indented to match correct nesting block" > > 3 for "4-space indent" > > 1 for "2 with indented case" > > > > Reminder: You could vote for multiple options for this question. > > > > Here is the code block formatted with single-character indent: > > > > case lists:member(revs, Options) of > > false -> > > []; > > true -> > > [{<<"revisions">>, {[{<<"start">>, Start}, > > {<<"ids">>, [revid_to_str(R) ||R ,_ RevIds]}]}}] > > end. > > > > And indented to match correct nesting block: > > > > case lists:member(revs, Options) of > > false -> > > []; > > true -> > > [ > > {<<"revisions">>, > > {[{<<"start">>, Start}, > > {<<"ids">>, [revid_to_str(R) ||R ,_ RevIds]} > > ]} > > } > > ] > > end. > > > > This was intended to be a question to which there really was no good > > answer. ;) As expected, results are across the board, except for > > "indented to match correct nesting block," which appears to be popular > > because it was probably the only layout one could glance at and have a > > hope of understanding. > > > > I don't think there is a good proposal to be made here. It is a judgment > > call, and I think any of "4-space indent," "8-space indent" or "indented > > to match correct nesting blocks" can be made to work. > > > > This was an good example of a question that was missing my least hated > way of indenting these sorts of things: > > https://gist.github.com/davisp/3da4d4e3a05c60e3a1c8 > > And I really do mean least hated. That method sticks to the Z*Y=X law > of indentation but it can really end up in bunched up list/tuple > closing tokens. But at least you can see the general structure of the > term without going cross-eyed. > > > ----- > > > > LIST COMPREHENSION WRAP: > > > > 9 for "lined up for first term until || is reached > > 3 for "indented 4 spaces from {ok above" > > 2 for "everything indented 8 spaces" > > 1 for "4 spaces from expression start, e.g. after Docs" > > 1 for "Don't use multi-line list comprehensions! 4-space indent" > > 1 for "no idea" :D > > > > Code for "lined up for first term until || is reached": > > > > Docs = [Doc || {ok, Doc} <- [ > > couch_db:open_doc(Db2, DocInfo2, [deleted, > conflicts]) > > || Docinfo2 <- DocInfos]], > > > > This was also a very ugly example that I found in our code that I wanted > > to use to highlight how difficult it can be to come up with a standard. > > The good news is that most people were in the 4- or 8-space camp, i.e. > > 1 or 2 level indents, and that perhaps the code needs refactoring. In > > the case of refactoring, I definitely agree with Bob: PRs with refactors > > should not be combined with PRs for whitespace, or at the very least > > should be 2 separate checkins within the same PR. > > > > There is no unique proposal for this other than to reference the initial > > proposal in this post: "Indent your code blocks with 4 spaces. Never use > > tabs or a mix of tabs and spaces. When additional indentation levels are > > needed, always increment by a multiple of 4 spaces." > > > > This one is particularly insidious. I really hate multi-line list > comprehensions and I especially dislike nested list comphrensions > because of how easy it is to miss them when skimming code. This gist > shows my least hated way for seeing these though I invariably ask for > a refactor: > > https://gist.github.com/davisp/84ac929e3c9bb1422003 > > The insidious part though is that refactor that I added in the gist is > not behavior identical. The rules on silent exception swallowing in > list comprehensions are fairly subtle. I've seen real bugs in > production from undoing list comprehensions exactly like that one. > > > ----- > > > > VERTICAL SPACING: > > > > There was no poll question on this but it was brought up a few times on > > the list. Going from code and proposals, there are 2 options: > > > > 0 blank lines between function declarations differing only in guards > > 1 blank line between different function declarations, imports, etc. > > > > and > > > > 1 blank line between function declarations differing only in guards > > 2 blank lines between different function declarations, imports, etc. > > > > I can see arguments for both. By inspection most of our code follows > > the 0/1 approach, not the 1/2 approach favoured by Paul. > > > > ----- > > Yeah, most of the code does follow the 0/1 approach. But after enough > time working with code like this: > > > https://github.com/apache/couchdb/blob/master/src/couchdb/couch_db_updater.erl#L58-L215 > > I found it really easy to hate on 0 spaces when the whole function > doesn't even fit on my 24" ACD at 10pt font. I won't point out that it > also has 0, 1, and 2 empty lines between different clauses of the same > function. Of course it's easy to say that we should just refactor that > code as well. But I have seen some functions where it doesn't make > sense that still end up going very vertical. > > > One of the best things I think could come out of this would be a tool > that would re-indent source files similar to gofmt or something that > would be the "canonical" indentation format whatever it is. That way > if people really hate whatever is community consensus it doesn't > matter as they can configure Git to run pre-commit hooks for > reformatting or whatever. There's quite a bit of parsing/serializing > stuff in Erlang so I don't think it'd be *too* terribly hard to > implement. > --001a1137d4a4d3ffae04f632025f--