Mailing-List: contact dev-help@couchdb.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@couchdb.apache.org
Received-SPF: pass (nike.apache.org: domain of bchesneau@gmail.com designates
 209.85.192.49 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CAJ_m3YAZvH7-jMN=_hS8Nuw4Wkm_8099WT8qg6VV7wTQz=7fGg@mail.gmail.com>
References: 
 <CAJNb-9qyv2n_dqNXHZxuf46xj7QSQ4vyQ6GU2YwaTOow0NKheQ@mail.gmail.com>
 <22926798.38.1396589070168.JavaMail.Joan@RITA>
 <CAJ_m3YAZvH7-jMN=_hS8Nuw4Wkm_8099WT8qg6VV7wTQz=7fGg@mail.gmail.com>
From: Benoit Chesneau <bchesneau@gmail.com>
Date: Fri, 4 Apr 2014 08:55:43 +0200
Message-ID: 
 <CAJNb-9ryvTVm0AAMmVPVDXobSXKMaNJQxNB7j0Eq-GPTmd0YHw@mail.gmail.com>
Subject: Re: [DISCUSS] Erlang whitespace standards (was: [POLL])
To: "dev@couchdb.apache.org" <dev@couchdb.apache.org>
Content-Type: multipart/alternative; boundary=001a1137d4a4d3ffae04f632025f

--001a1137d4a4d3ffae04f632025f
Content-Type: text/plain; charset=ISO-8859-1

+1 for a format tool like gofmt. Other than that we may want to provides
vim and erlang syntax (probably others if you,re using other).

But having a tool like gofmt is definitely the best, customising its
software is a lot of time,  not sure I want to do that job / product /
language...

- benoit


On Fri, Apr 4, 2014 at 8:42 AM, Paul Davis <paul.joseph.davis@gmail.com>wrote:

> Quick glance, and my first thought was that  maybe we should link to
> gists for each example. Gmail at least is screwing up the examples. A
> few more minor points inline.
>
> On Fri, Apr 4, 2014 at 12:24 AM, Joan Touzet <joant@atypical.net> wrote:
> > Hi everyone,
> >
> > Time to summarize the results. You can view the results at
> >
> >
> https://docs.google.com/forms/d/1b7KcQGgNbSCZVRwLjrUl5Z6C2TBx8X1btlU5fwrNHpg/viewanalytics
> >
> > but I've included them in this email for ease of review.
> >
> > I'm going to break this up into sections and make some PROPOSALs.  I'd
> > like to get general consensus on this vs. a "lazy" approach.  I don't
> > see this as something where need a unanimous vote but I'd like to get us
> > all agree on something we can live with.
> >
> > As for getting this into the code base - let's not endanger the big
> > merges, but once we have finished them, we should move to these
> > standards piecemeal as we rework each file, as Noah and Jan suggest,
> > unless someone wants to do the busy work and re-indent everything.
> > Hopefully, even with the wait for the merges, this means the standard
> > can be "live" before the end of 2014 ;)
> >
> > I don't cover all topics in here - please feel free to follow the post's
> > format and add additional proposals in follow-ups.
> >
> > Finally, if I say something you disagree with or if I have misinterpreted
> > your response, speak up - it was not intentional!
> >
> > -Joan
> >
> > -----
> >
> > TERMINOLOGY USED:
> >   * "X space indent" means X spaces from the LEFT MARGIN.
> >     It is the ABSOLUTE number of columns of whitespace on a line.
> >
> >   * "Y space standard" means indentations should be multiples
> >     of Y spaces.
> >
> >   * "Z level indent" means Z*Y=X absolute spaces for the indent.
> >     For a 4-space standard, a 2 level indent would mean an 8 space
> >     indent.
> >
> > -----
> >
> > STANDARD: Agree on a 4-space standard for horiz. indentation. Most of
> > the respondents seem to be comfortable with this, likely due to the
> > prevalence of the Python / Ruby / JS 4-space standard.
> >
> > PROPOSAL: "Indent your code blocks with 4 spaces. Never use tabs or a
> > mix of tabs and spaces. When additional indentation levels are needed,
> > always increment by a multiple of 4 spaces."
> >
> > This sets us up to be able to have the same spacing standard across JS,
> > C and other languages we may someday ship.
> >
> > -----
> >
> > LINE LENGTH: 11 votes for 80, 6 votes for 132, 1 for 76.
> >
> > PROPOSAL: "Maximum line length is 80 characters, with a preference for
> > 76 characters or less.  Exception: URLs in comments"
> >
> > -----
> >
> > CASE STATEMENT INDENTATION: 16 in favour of this format, 3 opposed:
> >
> > get_ini_files(Default) ->
> >     case init:get_argument(couch_ini) of
> >         error ->
> >             Default;
> >         {ok, [[]]} ->
> >             Default;
> >         {ok, [Values]} ->
> >             Values
> >     end.
> >
> > This format matches Erlang documentation and is fairly canonical.
> >
> > PROPOSAL: "Indent case pattern clauses 1 level, and each case pattern
> > body 2 levels from the initial case statement."
> >
> > -----
> >
> > CASE STATEMENT ONE-LINERS: 11 in favour, 8 opposed:
> >
> >     case {Name, Pass} of
> >         {"Jan Lehnardt", "apple"} -> ok;
> >         ...
> >
> > The only write-in for this suggested that one-liners needed to fit on a
> > single line "without looking terrible."
> >
> > PROPOSAL: "Generally, case pattern bodies should always start on a new
> > line from their corresponding case pattern clause. However, you can put
> > the clause and body on the same line if the entire statement fits on one
> > line."
> >
> > This is a tough one because it directly contradicts the previous
> > proposal. If people feel strongly I am OK to be more strict and remove
> > "Generally, " and the second sentence from this proposal.
> >
>
> I know I had a write in for this one that was "only if all clauses fit
> on a single line". Not sure if you paraphrased that as "without
> looking terrible" or not. But if I write three clauses as one-liners
> and the fourth doesn't fit (or even if I feel like its just too long)
> then I go back and undo each one so that the case bodies are
> consistenly indented for the given case clause.
>
> > -----
> >
> > LONG FUNCTION CLAUSE:
> >
> > 7 for paren aligned
> > 4 for 2-space indented
> > 5 for 8-space indented
> > 1 for "2 space, but no arguments on the initial line, with
> >        the closing } on its own line"
> > 1 for "4-space indented"
> > 1 for "one tab"
> >
> > As a reminder, here is the code, paren aligned:
> >
> > possibly_embed_doc(#collector{db_name=DbName, query_args=Args),
> >                    #view_row{key=_Key, id=_Id, value=Value,
> doc=_Doc}=Row) ->
> >
> > And 8-space aligned:
> >
> > possibly_embed_doc(
> >         #collector{db_name=DbName, query_args=Args),
> >         #view_row{key=_Key, id=_Id, value=Value, doc=_Doc}=Row) ->
> >
> >
> > Ideology here and on the list is split roughly into 2 camps:
> >
> >   * Z-level indent of a multiple of 4 spaces. As the body of the
> >     function will start at 4 spaces, I am going to recommend
> >     against 1-level and say a 2-level (8 space) indent is the
> >     option here.
> >
> >   * Emacs/paren indentation mode. I believe the big arguments for
> >     this mode is "it's what my editor does" and "it's common in
> >     strongly typed languages." If you feel differently, please
> >     speak up. On the other side, Paul feels strongly about not
> >     adopting this model; Wendall supports it and Bob N. says he
> >     can 'retrain himself' to use it. Notice also that, in this
> >     example, the second line ends on col. 78. Even if the -> was
> >     wrapped to the next line, the line still ends on col. 75.
> >
>
> That's an interesting observation in the strongly typed vs. not
> languages. I can't say I've ever noticed a preference based on the
> type system.
>
> I will say that I don't put much stock into the "that's what my editor
> does" argument. If you're editor is so fancy that it does syntatical
> parsing to adjust indentation widths dynamically then surely its fancy
> enough to have an off switch for that behavior. One of the best
> reasons for using a Z*Y=X approach is that it fits most common
> languages these days with at most a modification of Y based on file
> extension. This coupled with file mode detection for when when indent
> level changes leads to a fairly easy to use editor without making any
> particularly language ridiculously different than another which helps
> for people like me that read and write a number of different languages
> with decent regularity.
>
> > Tough call here. Based on similarity with other popular languages of our
> > day I'm going to initially propose the first option and let anyone who
> > strongly opposes speak up now. There was no strong statement
> > about whether the ) or -> should be on its own line, so I'll leave
> > that part of the proposal vague for now.
> >
> > PROPOSAL: "Function definitions should align wrapped elements using a
> > 2-level hanging indent. There should be no arguments on the first line.
> > The closing parenthesis or arrow may be placed on its own line if
> > desired, but if so, it should be indented the same number of spaces as
> > the function definition itself."  **but see below**
> >
> > -----
> >
> > LONG FUNCTION CALL:
> >
> > 7 for paren-aligned
> > 7 for 4-space indent
> > 3 for 8-space indent
> > 1 for "rework the code, or 4-space indent"
> > 1 for "2 space, but no arguments on the initial line, with
> >        the closing } on its own line"
> >
> > As a reminder, here is the code, paren-aligned:
> >
> >             [_A, _B, _Cs] = re:split(?b2l(AuthSession), ":",
> >                                      [{return, list}, {parts, 3}]),
> >
> > And 8-space aligned:
> >
> >             [_A, _B, _Cs] = re:split(?b2l(AuthSession), ":",
> >                     [{return, list}, {parts, 3}]),
> >
> > The more I looked at this topic, the more it looked like the last one,
> > but even more space constrained because of the existing indent of the
> > call itself. As such I'm going to roll it into the previous proposal:
> >
> > REVISED PROPOSAL: "Function definitions *and calls* should align wrapped
> > elements using a 2-level hanging indent. There should be no arguments on
> > the first line. The closing parenthesis or arrow may be placed on its
> > own line if desired, but if so, it should be indented the same number of
> > spaces as the function definition or call itself."
> >
> > That means these would be acceptable:
> >
> >             [_A, _B, _Cs] = re:split(?b2l(AuthSession), ":",
> >                     [{return, list}, {parts, 3}]),
> >
> >             [_A, _B, _Cs] = re:split(?b2l(AuthSession), ":",
> >                     [{return, list}, {parts, 3}]
> >             ),
> >
> > -----
> >
> > LONG LIST WRAPPING:
> >
> > 4 for 8-space indent
> > 3 for "aligned with nested structure in previous line"
> > 5 for "single character indent"
> > 9 for "indented to match correct nesting block"
> > 3 for "4-space indent"
> > 1 for "2 with indented case"
> >
> > Reminder: You could vote for multiple options for this question.
> >
> > Here is the code block formatted with single-character indent:
> >
> >     case lists:member(revs, Options) of
> >         false ->
> >             [];
> >         true ->
> >             [{<<"revisions">>, {[{<<"start">>, Start},
> >              {<<"ids">>, [revid_to_str(R) ||R ,_ RevIds]}]}}]
> >     end.
> >
> > And indented to match correct nesting block:
> >
> >     case lists:member(revs, Options) of
> >         false ->
> >             [];
> >         true ->
> >             [
> >              {<<"revisions">>,
> >               {[{<<"start">>, Start},
> >                 {<<"ids">>, [revid_to_str(R) ||R ,_ RevIds]}
> >                ]}
> >              }
> >             ]
> >     end.
> >
> > This was intended to be a question to which there really was no good
> > answer. ;) As expected, results are across the board, except for
> > "indented to match correct nesting block," which appears to be popular
> > because it was probably the only layout one could glance at and have a
> > hope of understanding.
> >
> > I don't think there is a good proposal to be made here. It is a judgment
> > call, and I think any of "4-space indent," "8-space indent" or "indented
> > to match correct nesting blocks" can be made to work.
> >
>
> This was an good example of a question that was missing my least hated
> way of indenting these sorts of things:
>
> https://gist.github.com/davisp/3da4d4e3a05c60e3a1c8
>
> And I really do mean least hated. That method sticks to the Z*Y=X law
> of indentation but it can really end up in bunched up list/tuple
> closing tokens. But at least you can see the general structure of the
> term without going cross-eyed.
>
> > -----
> >
> > LIST COMPREHENSION WRAP:
> >
> > 9 for "lined up for first term until || is reached
> > 3 for "indented 4 spaces from {ok above"
> > 2 for "everything indented 8 spaces"
> > 1 for "4 spaces from expression start, e.g. after Docs"
> > 1 for "Don't use multi-line list comprehensions! 4-space indent"
> > 1 for "no idea" :D
> >
> > Code for "lined up for first term until || is reached":
> >
> >             Docs = [Doc || {ok, Doc} <- [
> >                     couch_db:open_doc(Db2, DocInfo2, [deleted,
> conflicts])
> >                         || Docinfo2 <- DocInfos]],
> >
> > This was also a very ugly example that I found in our code that I wanted
> > to use to highlight how difficult it can be to come up with a standard.
> > The good news is that most people were in the 4- or 8-space camp, i.e.
> > 1 or 2 level indents, and that perhaps the code needs refactoring. In
> > the case of refactoring, I definitely agree with Bob: PRs with refactors
> > should not be combined with PRs for whitespace, or at the very least
> > should be 2 separate checkins within the same PR.
> >
> > There is no unique proposal for this other than to reference the initial
> > proposal in this post: "Indent your code blocks with 4 spaces. Never use
> > tabs or a mix of tabs and spaces. When additional indentation levels are
> > needed, always increment by a multiple of 4 spaces."
> >
>
> This one is particularly insidious. I really hate multi-line list
> comprehensions and I especially dislike nested list comphrensions
> because of how easy it is to miss them when skimming code. This gist
> shows my least hated way for seeing these though I invariably ask for
> a refactor:
>
> https://gist.github.com/davisp/84ac929e3c9bb1422003
>
> The insidious part though is that refactor that I added in the gist is
> not behavior identical. The rules on silent exception swallowing in
> list comprehensions are fairly subtle. I've seen real bugs in
> production from undoing list comprehensions exactly like that one.
>
> > -----
> >
> > VERTICAL SPACING:
> >
> > There was no poll question on this but it was brought up a few times on
> > the list. Going from code and proposals, there are 2 options:
> >
> > 0 blank lines between function declarations differing only in guards
> > 1 blank line between different function declarations, imports, etc.
> >
> > and
> >
> > 1 blank line between function declarations differing only in guards
> > 2 blank lines between different function declarations, imports, etc.
> >
> > I can see arguments for both. By inspection most of our code follows
> > the 0/1 approach, not the 1/2 approach favoured by Paul.
> >
> > -----
>
> Yeah, most of the code does follow the 0/1 approach. But after enough
> time working with code like this:
>
>
> https://github.com/apache/couchdb/blob/master/src/couchdb/couch_db_updater.erl#L58-L215
>
> I found it really easy to hate on 0 spaces when the whole function
> doesn't even fit on my 24" ACD at 10pt font. I won't point out that it
> also has 0, 1, and 2 empty lines between different clauses of the same
> function. Of course it's easy to say that we should just refactor that
> code as well. But I have seen some functions where it doesn't make
> sense that still end up going very vertical.
>
>
> One of the best things I think could come out of this would be a tool
> that would re-indent source files similar to gofmt or something that
> would be the "canonical" indentation format whatever it is. That way
> if people really hate whatever is community consensus it doesn't
> matter as they can configure Git to run pre-commit hooks for
> reformatting or whatever. There's quite a bit of parsing/serializing
> stuff in Erlang so I don't think it'd be *too* terribly hard to
> implement.
>

--001a1137d4a4d3ffae04f632025f--