couchdb-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Joan Touzet <jo...@atypical.net>
Subject [DISCUSS] Erlang whitespace standards (was: [POLL])
Date Fri, 04 Apr 2014 05:24:32 GMT
Hi everyone,

Time to summarize the results. You can view the results at

https://docs.google.com/forms/d/1b7KcQGgNbSCZVRwLjrUl5Z6C2TBx8X1btlU5fwrNHpg/viewanalytics

but I've included them in this email for ease of review.

I'm going to break this up into sections and make some PROPOSALs.  I'd
like to get general consensus on this vs. a "lazy" approach.  I don't
see this as something where need a unanimous vote but I'd like to get us
all agree on something we can live with.

As for getting this into the code base - let's not endanger the big
merges, but once we have finished them, we should move to these
standards piecemeal as we rework each file, as Noah and Jan suggest,
unless someone wants to do the busy work and re-indent everything.
Hopefully, even with the wait for the merges, this means the standard
can be "live" before the end of 2014 ;)

I don't cover all topics in here - please feel free to follow the post's
format and add additional proposals in follow-ups.

Finally, if I say something you disagree with or if I have misinterpreted
your response, speak up - it was not intentional!

-Joan

-----

TERMINOLOGY USED:
  * "X space indent" means X spaces from the LEFT MARGIN.
    It is the ABSOLUTE number of columns of whitespace on a line.

  * "Y space standard" means indentations should be multiples
    of Y spaces.

  * "Z level indent" means Z*Y=X absolute spaces for the indent.
    For a 4-space standard, a 2 level indent would mean an 8 space
    indent.

-----

STANDARD: Agree on a 4-space standard for horiz. indentation. Most of
the respondents seem to be comfortable with this, likely due to the
prevalence of the Python / Ruby / JS 4-space standard.

PROPOSAL: "Indent your code blocks with 4 spaces. Never use tabs or a
mix of tabs and spaces. When additional indentation levels are needed,
always increment by a multiple of 4 spaces."

This sets us up to be able to have the same spacing standard across JS,
C and other languages we may someday ship.

-----

LINE LENGTH: 11 votes for 80, 6 votes for 132, 1 for 76.

PROPOSAL: "Maximum line length is 80 characters, with a preference for
76 characters or less.  Exception: URLs in comments"

-----

CASE STATEMENT INDENTATION: 16 in favour of this format, 3 opposed:

get_ini_files(Default) ->
    case init:get_argument(couch_ini) of
        error ->
            Default;
        {ok, [[]]} ->
            Default;
        {ok, [Values]} ->
            Values
    end.

This format matches Erlang documentation and is fairly canonical.

PROPOSAL: "Indent case pattern clauses 1 level, and each case pattern
body 2 levels from the initial case statement." 

-----

CASE STATEMENT ONE-LINERS: 11 in favour, 8 opposed:

    case {Name, Pass} of
        {"Jan Lehnardt", "apple"} -> ok;
        ...

The only write-in for this suggested that one-liners needed to fit on a
single line "without looking terrible."

PROPOSAL: "Generally, case pattern bodies should always start on a new
line from their corresponding case pattern clause. However, you can put
the clause and body on the same line if the entire statement fits on one
line."

This is a tough one because it directly contradicts the previous
proposal. If people feel strongly I am OK to be more strict and remove
"Generally, " and the second sentence from this proposal.

-----

LONG FUNCTION CLAUSE: 

7 for paren aligned
4 for 2-space indented
5 for 8-space indented
1 for "2 space, but no arguments on the initial line, with
       the closing } on its own line"
1 for "4-space indented"
1 for "one tab"

As a reminder, here is the code, paren aligned:

possibly_embed_doc(#collector{db_name=DbName, query_args=Args),
                   #view_row{key=_Key, id=_Id, value=Value, doc=_Doc}=Row) ->

And 8-space aligned:

possibly_embed_doc(
        #collector{db_name=DbName, query_args=Args),
        #view_row{key=_Key, id=_Id, value=Value, doc=_Doc}=Row) ->


Ideology here and on the list is split roughly into 2 camps:

  * Z-level indent of a multiple of 4 spaces. As the body of the
    function will start at 4 spaces, I am going to recommend
    against 1-level and say a 2-level (8 space) indent is the
    option here.

  * Emacs/paren indentation mode. I believe the big arguments for
    this mode is "it's what my editor does" and "it's common in
    strongly typed languages." If you feel differently, please
    speak up. On the other side, Paul feels strongly about not
    adopting this model; Wendall supports it and Bob N. says he
    can 'retrain himself' to use it. Notice also that, in this
    example, the second line ends on col. 78. Even if the -> was
    wrapped to the next line, the line still ends on col. 75.

Tough call here. Based on similarity with other popular languages of our
day I'm going to initially propose the first option and let anyone who
strongly opposes speak up now. There was no strong statement
about whether the ) or -> should be on its own line, so I'll leave
that part of the proposal vague for now.

PROPOSAL: "Function definitions should align wrapped elements using a
2-level hanging indent. There should be no arguments on the first line.
The closing parenthesis or arrow may be placed on its own line if
desired, but if so, it should be indented the same number of spaces as
the function definition itself."  **but see below**

-----

LONG FUNCTION CALL:

7 for paren-aligned
7 for 4-space indent
3 for 8-space indent
1 for "rework the code, or 4-space indent"
1 for "2 space, but no arguments on the initial line, with
       the closing } on its own line"

As a reminder, here is the code, paren-aligned:

            [_A, _B, _Cs] = re:split(?b2l(AuthSession), ":",
                                     [{return, list}, {parts, 3}]),

And 8-space aligned:

            [_A, _B, _Cs] = re:split(?b2l(AuthSession), ":",
                    [{return, list}, {parts, 3}]),

The more I looked at this topic, the more it looked like the last one,
but even more space constrained because of the existing indent of the
call itself. As such I'm going to roll it into the previous proposal:

REVISED PROPOSAL: "Function definitions *and calls* should align wrapped
elements using a 2-level hanging indent. There should be no arguments on
the first line. The closing parenthesis or arrow may be placed on its
own line if desired, but if so, it should be indented the same number of
spaces as the function definition or call itself."

That means these would be acceptable:

            [_A, _B, _Cs] = re:split(?b2l(AuthSession), ":",
                    [{return, list}, {parts, 3}]),

            [_A, _B, _Cs] = re:split(?b2l(AuthSession), ":",
                    [{return, list}, {parts, 3}]
            ),

-----

LONG LIST WRAPPING:

4 for 8-space indent
3 for "aligned with nested structure in previous line"
5 for "single character indent"
9 for "indented to match correct nesting block"
3 for "4-space indent"
1 for "2 with indented case"

Reminder: You could vote for multiple options for this question.

Here is the code block formatted with single-character indent:

    case lists:member(revs, Options) of
        false ->
            [];
        true ->
            [{<<"revisions">>, {[{<<"start">>, Start},
             {<<"ids">>, [revid_to_str(R) ||R ,_ RevIds]}]}}]
    end.

And indented to match correct nesting block:

    case lists:member(revs, Options) of
        false ->
            [];
        true ->
            [
             {<<"revisions">>,
              {[{<<"start">>, Start},
                {<<"ids">>, [revid_to_str(R) ||R ,_ RevIds]}
               ]}
             }
            ]
    end.

This was intended to be a question to which there really was no good
answer. ;) As expected, results are across the board, except for
"indented to match correct nesting block," which appears to be popular
because it was probably the only layout one could glance at and have a
hope of understanding.

I don't think there is a good proposal to be made here. It is a judgment
call, and I think any of "4-space indent," "8-space indent" or "indented
to match correct nesting blocks" can be made to work.

-----

LIST COMPREHENSION WRAP:

9 for "lined up for first term until || is reached
3 for "indented 4 spaces from {ok above"
2 for "everything indented 8 spaces"
1 for "4 spaces from expression start, e.g. after Docs"
1 for "Don't use multi-line list comprehensions! 4-space indent"
1 for "no idea" :D

Code for "lined up for first term until || is reached":

            Docs = [Doc || {ok, Doc} <- [
                    couch_db:open_doc(Db2, DocInfo2, [deleted, conflicts])
                        || Docinfo2 <- DocInfos]],

This was also a very ugly example that I found in our code that I wanted
to use to highlight how difficult it can be to come up with a standard.
The good news is that most people were in the 4- or 8-space camp, i.e.
1 or 2 level indents, and that perhaps the code needs refactoring. In
the case of refactoring, I definitely agree with Bob: PRs with refactors
should not be combined with PRs for whitespace, or at the very least
should be 2 separate checkins within the same PR.

There is no unique proposal for this other than to reference the initial
proposal in this post: "Indent your code blocks with 4 spaces. Never use
tabs or a mix of tabs and spaces. When additional indentation levels are
needed, always increment by a multiple of 4 spaces."

-----

VERTICAL SPACING:

There was no poll question on this but it was brought up a few times on
the list. Going from code and proposals, there are 2 options:

0 blank lines between function declarations differing only in guards
1 blank line between different function declarations, imports, etc.

and

1 blank line between function declarations differing only in guards
2 blank lines between different function declarations, imports, etc.

I can see arguments for both. By inspection most of our code follows
the 0/1 approach, not the 1/2 approach favoured by Paul.

-----

Mime
View raw message