Mailing-List: contact dev-help@couchdb.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@couchdb.apache.org
Received-SPF: pass (nike.apache.org: domain of paul.joseph.davis@gmail.com
 designates 209.85.214.173 as permitted sender)
MIME-Version: 1.0
In-Reply-To: <485EF20C-42F8-4D0F-9861-770C181DD6CE@apache.org>
References: 
 <CAJNb-9qyv2n_dqNXHZxuf46xj7QSQ4vyQ6GU2YwaTOow0NKheQ@mail.gmail.com>
 <22926798.38.1396589070168.JavaMail.Joan@RITA>
 <CAHdjipJQ-fe+eXSQiE8KU1td60mc3Tcr56hTzhna94ea=p6mtA@mail.gmail.com>
 <485EF20C-42F8-4D0F-9861-770C181DD6CE@apache.org>
From: Paul Davis <paul.joseph.davis@gmail.com>
Date: Fri, 4 Apr 2014 16:08:15 -0500
Message-ID: 
 <CAJ_m3YBhL2uLv2pic1aZV7MrLf+A91gVLL7JA6o6ScFhv-2Q=g@mail.gmail.com>
Subject: Re: [DISCUSS] Erlang whitespace standards (was: [POLL])
To: "dev@couchdb.apache.org" <dev@couchdb.apache.org>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

I definitely agree we should re-format the whole code base any time
soon. Though at some point it'd be a good idea. Hopefully we can find
a lull after the two big forks are merged where we can just have a
commit on each Erlang repo to do the deed while there's no large
outstanding work that'd be super difficult to merge.

On Fri, Apr 4, 2014 at 9:33 AM, Robert Samuel Newson <rnewson@apache.org> w=
rote:
> I appreciate firming up a consensus on indentation styles but I want to b=
e clearly -1 on a codebase-wide reformatting for the foreseeable future. Be=
yond the merges, we have active branches for older releases, the more refor=
matting we do, the harder back- and forward-porting becomes. I like the ide=
a of being more consistent for future work and, where code is particularly =
crufty, refactoring before making a change. The "worst" formatted code in c=
ouchdb is generally the oldest, and much of that needs a refactor/rewrite a=
s we get to it.
>
> B.
>
> On 4 Apr 2014, at 14:07, Alexander Shorin <kxepal@gmail.com> wrote:
>
>> Hi Joan and all,
>>
>> I just faced another indention case which left out of scope of the vote:
>> https://gist.github.com/kxepal/2c09fb5348ead90bea04
>>
>> Personally, I'm for 1) variant there.
>>
>> Another interesting case is anonymous function:
>> https://gist.github.com/kxepal/c5480209af9e93a14155
>>
>> I prefer 3) one.
>>
>> What would be your recommendations there about?
>>
>> --
>> ,,,^..^,,,
>>
>>
>> On Fri, Apr 4, 2014 at 9:24 AM, Joan Touzet <joant@atypical.net> wrote:
>>> Hi everyone,
>>>
>>> Time to summarize the results. You can view the results at
>>>
>>> https://docs.google.com/forms/d/1b7KcQGgNbSCZVRwLjrUl5Z6C2TBx8X1btlU5fw=
rNHpg/viewanalytics
>>>
>>> but I've included them in this email for ease of review.
>>>
>>> I'm going to break this up into sections and make some PROPOSALs.  I'd
>>> like to get general consensus on this vs. a "lazy" approach.  I don't
>>> see this as something where need a unanimous vote but I'd like to get u=
s
>>> all agree on something we can live with.
>>>
>>> As for getting this into the code base - let's not endanger the big
>>> merges, but once we have finished them, we should move to these
>>> standards piecemeal as we rework each file, as Noah and Jan suggest,
>>> unless someone wants to do the busy work and re-indent everything.
>>> Hopefully, even with the wait for the merges, this means the standard
>>> can be "live" before the end of 2014 ;)
>>>
>>> I don't cover all topics in here - please feel free to follow the post'=
s
>>> format and add additional proposals in follow-ups.
>>>
>>> Finally, if I say something you disagree with or if I have misinterpret=
ed
>>> your response, speak up - it was not intentional!
>>>
>>> -Joan
>>>
>>> -----
>>>
>>> TERMINOLOGY USED:
>>>  * "X space indent" means X spaces from the LEFT MARGIN.
>>>    It is the ABSOLUTE number of columns of whitespace on a line.
>>>
>>>  * "Y space standard" means indentations should be multiples
>>>    of Y spaces.
>>>
>>>  * "Z level indent" means Z*Y=3DX absolute spaces for the indent.
>>>    For a 4-space standard, a 2 level indent would mean an 8 space
>>>    indent.
>>>
>>> -----
>>>
>>> STANDARD: Agree on a 4-space standard for horiz. indentation. Most of
>>> the respondents seem to be comfortable with this, likely due to the
>>> prevalence of the Python / Ruby / JS 4-space standard.
>>>
>>> PROPOSAL: "Indent your code blocks with 4 spaces. Never use tabs or a
>>> mix of tabs and spaces. When additional indentation levels are needed,
>>> always increment by a multiple of 4 spaces."
>>>
>>> This sets us up to be able to have the same spacing standard across JS,
>>> C and other languages we may someday ship.
>>>
>>> -----
>>>
>>> LINE LENGTH: 11 votes for 80, 6 votes for 132, 1 for 76.
>>>
>>> PROPOSAL: "Maximum line length is 80 characters, with a preference for
>>> 76 characters or less.  Exception: URLs in comments"
>>>
>>> -----
>>>
>>> CASE STATEMENT INDENTATION: 16 in favour of this format, 3 opposed:
>>>
>>> get_ini_files(Default) ->
>>>    case init:get_argument(couch_ini) of
>>>        error ->
>>>            Default;
>>>        {ok, [[]]} ->
>>>            Default;
>>>        {ok, [Values]} ->
>>>            Values
>>>    end.
>>>
>>> This format matches Erlang documentation and is fairly canonical.
>>>
>>> PROPOSAL: "Indent case pattern clauses 1 level, and each case pattern
>>> body 2 levels from the initial case statement."
>>>
>>> -----
>>>
>>> CASE STATEMENT ONE-LINERS: 11 in favour, 8 opposed:
>>>
>>>    case {Name, Pass} of
>>>        {"Jan Lehnardt", "apple"} -> ok;
>>>        ...
>>>
>>> The only write-in for this suggested that one-liners needed to fit on a
>>> single line "without looking terrible."
>>>
>>> PROPOSAL: "Generally, case pattern bodies should always start on a new
>>> line from their corresponding case pattern clause. However, you can put
>>> the clause and body on the same line if the entire statement fits on on=
e
>>> line."
>>>
>>> This is a tough one because it directly contradicts the previous
>>> proposal. If people feel strongly I am OK to be more strict and remove
>>> "Generally, " and the second sentence from this proposal.
>>>
>>> -----
>>>
>>> LONG FUNCTION CLAUSE:
>>>
>>> 7 for paren aligned
>>> 4 for 2-space indented
>>> 5 for 8-space indented
>>> 1 for "2 space, but no arguments on the initial line, with
>>>       the closing } on its own line"
>>> 1 for "4-space indented"
>>> 1 for "one tab"
>>>
>>> As a reminder, here is the code, paren aligned:
>>>
>>> possibly_embed_doc(#collector{db_name=3DDbName, query_args=3DArgs),
>>>                   #view_row{key=3D_Key, id=3D_Id, value=3DValue, doc=3D=
_Doc}=3DRow) ->
>>>
>>> And 8-space aligned:
>>>
>>> possibly_embed_doc(
>>>        #collector{db_name=3DDbName, query_args=3DArgs),
>>>        #view_row{key=3D_Key, id=3D_Id, value=3DValue, doc=3D_Doc}=3DRow=
) ->
>>>
>>>
>>> Ideology here and on the list is split roughly into 2 camps:
>>>
>>>  * Z-level indent of a multiple of 4 spaces. As the body of the
>>>    function will start at 4 spaces, I am going to recommend
>>>    against 1-level and say a 2-level (8 space) indent is the
>>>    option here.
>>>
>>>  * Emacs/paren indentation mode. I believe the big arguments for
>>>    this mode is "it's what my editor does" and "it's common in
>>>    strongly typed languages." If you feel differently, please
>>>    speak up. On the other side, Paul feels strongly about not
>>>    adopting this model; Wendall supports it and Bob N. says he
>>>    can 'retrain himself' to use it. Notice also that, in this
>>>    example, the second line ends on col. 78. Even if the -> was
>>>    wrapped to the next line, the line still ends on col. 75.
>>>
>>> Tough call here. Based on similarity with other popular languages of ou=
r
>>> day I'm going to initially propose the first option and let anyone who
>>> strongly opposes speak up now. There was no strong statement
>>> about whether the ) or -> should be on its own line, so I'll leave
>>> that part of the proposal vague for now.
>>>
>>> PROPOSAL: "Function definitions should align wrapped elements using a
>>> 2-level hanging indent. There should be no arguments on the first line.
>>> The closing parenthesis or arrow may be placed on its own line if
>>> desired, but if so, it should be indented the same number of spaces as
>>> the function definition itself."  **but see below**
>>>
>>> -----
>>>
>>> LONG FUNCTION CALL:
>>>
>>> 7 for paren-aligned
>>> 7 for 4-space indent
>>> 3 for 8-space indent
>>> 1 for "rework the code, or 4-space indent"
>>> 1 for "2 space, but no arguments on the initial line, with
>>>       the closing } on its own line"
>>>
>>> As a reminder, here is the code, paren-aligned:
>>>
>>>            [_A, _B, _Cs] =3D re:split(?b2l(AuthSession), ":",
>>>                                     [{return, list}, {parts, 3}]),
>>>
>>> And 8-space aligned:
>>>
>>>            [_A, _B, _Cs] =3D re:split(?b2l(AuthSession), ":",
>>>                    [{return, list}, {parts, 3}]),
>>>
>>> The more I looked at this topic, the more it looked like the last one,
>>> but even more space constrained because of the existing indent of the
>>> call itself. As such I'm going to roll it into the previous proposal:
>>>
>>> REVISED PROPOSAL: "Function definitions *and calls* should align wrappe=
d
>>> elements using a 2-level hanging indent. There should be no arguments o=
n
>>> the first line. The closing parenthesis or arrow may be placed on its
>>> own line if desired, but if so, it should be indented the same number o=
f
>>> spaces as the function definition or call itself."
>>>
>>> That means these would be acceptable:
>>>
>>>            [_A, _B, _Cs] =3D re:split(?b2l(AuthSession), ":",
>>>                    [{return, list}, {parts, 3}]),
>>>
>>>            [_A, _B, _Cs] =3D re:split(?b2l(AuthSession), ":",
>>>                    [{return, list}, {parts, 3}]
>>>            ),
>>>
>>> -----
>>>
>>> LONG LIST WRAPPING:
>>>
>>> 4 for 8-space indent
>>> 3 for "aligned with nested structure in previous line"
>>> 5 for "single character indent"
>>> 9 for "indented to match correct nesting block"
>>> 3 for "4-space indent"
>>> 1 for "2 with indented case"
>>>
>>> Reminder: You could vote for multiple options for this question.
>>>
>>> Here is the code block formatted with single-character indent:
>>>
>>>    case lists:member(revs, Options) of
>>>        false ->
>>>            [];
>>>        true ->
>>>            [{<<"revisions">>, {[{<<"start">>, Start},
>>>             {<<"ids">>, [revid_to_str(R) ||R ,_ RevIds]}]}}]
>>>    end.
>>>
>>> And indented to match correct nesting block:
>>>
>>>    case lists:member(revs, Options) of
>>>        false ->
>>>            [];
>>>        true ->
>>>            [
>>>             {<<"revisions">>,
>>>              {[{<<"start">>, Start},
>>>                {<<"ids">>, [revid_to_str(R) ||R ,_ RevIds]}
>>>               ]}
>>>             }
>>>            ]
>>>    end.
>>>
>>> This was intended to be a question to which there really was no good
>>> answer. ;) As expected, results are across the board, except for
>>> "indented to match correct nesting block," which appears to be popular
>>> because it was probably the only layout one could glance at and have a
>>> hope of understanding.
>>>
>>> I don't think there is a good proposal to be made here. It is a judgmen=
t
>>> call, and I think any of "4-space indent," "8-space indent" or "indente=
d
>>> to match correct nesting blocks" can be made to work.
>>>
>>> -----
>>>
>>> LIST COMPREHENSION WRAP:
>>>
>>> 9 for "lined up for first term until || is reached
>>> 3 for "indented 4 spaces from {ok above"
>>> 2 for "everything indented 8 spaces"
>>> 1 for "4 spaces from expression start, e.g. after Docs"
>>> 1 for "Don't use multi-line list comprehensions! 4-space indent"
>>> 1 for "no idea" :D
>>>
>>> Code for "lined up for first term until || is reached":
>>>
>>>            Docs =3D [Doc || {ok, Doc} <- [
>>>                    couch_db:open_doc(Db2, DocInfo2, [deleted, conflicts=
])
>>>                        || Docinfo2 <- DocInfos]],
>>>
>>> This was also a very ugly example that I found in our code that I wante=
d
>>> to use to highlight how difficult it can be to come up with a standard.
>>> The good news is that most people were in the 4- or 8-space camp, i.e.
>>> 1 or 2 level indents, and that perhaps the code needs refactoring. In
>>> the case of refactoring, I definitely agree with Bob: PRs with refactor=
s
>>> should not be combined with PRs for whitespace, or at the very least
>>> should be 2 separate checkins within the same PR.
>>>
>>> There is no unique proposal for this other than to reference the initia=
l
>>> proposal in this post: "Indent your code blocks with 4 spaces. Never us=
e
>>> tabs or a mix of tabs and spaces. When additional indentation levels ar=
e
>>> needed, always increment by a multiple of 4 spaces."
>>>
>>> -----
>>>
>>> VERTICAL SPACING:
>>>
>>> There was no poll question on this but it was brought up a few times on
>>> the list. Going from code and proposals, there are 2 options:
>>>
>>> 0 blank lines between function declarations differing only in guards
>>> 1 blank line between different function declarations, imports, etc.
>>>
>>> and
>>>
>>> 1 blank line between function declarations differing only in guards
>>> 2 blank lines between different function declarations, imports, etc.
>>>
>>> I can see arguments for both. By inspection most of our code follows
>>> the 0/1 approach, not the 1/2 approach favoured by Paul.
>>>
>>> -----
>