flink-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Aljoscha Krettek <aljos...@apache.org>
Subject Re: [DISCUSS] Code style / checkstyle
Date Sun, 19 Mar 2017 16:09:45 GMT
I played around with this over the week end and it turns out that the required changes in flink-streaming-java
are not that big. I created a PR with a proposed checkstyle.xml and the required code changes:
https://github.com/apache/flink/pull/3567 <https://github.com/apache/flink/pull/3567>.
There’s a longer description of the style in the PR. The commits add continuously more invasive
changes so we can start with the more lightweight changes if we want to.

If we want to go forward with this I would also encourage other people to use this for different
modules and see how it turns out.

Best,
Aljoscha

> On 18 Mar 2017, at 08:00, Aljoscha Krettek <aljoscha@apache.org> wrote:
> 
> I added an issue for adding a custom checkstyle.xml for flink-streaming-java so that
we can gradually add checks: https://issues.apache.org/jira/browse/FLINK-6107. I outlined
the procedure in the Jira. We can use this as a pilot project and see how it goes and then
gradually also apply to other modules.
> 
> What do you think?
> 
>> On 6 Mar 2017, at 12:42, Stephan Ewen <sewen@apache.org> wrote:
>> 
>> A singular "all reformat in one instant" will cause immense damage to the
>> project, in my opinion.
>> 
>> - There are so many pull requests that we are having a hard time keeping
>> up, and merging will a lot more time intensive.
>> - I personally have many forked branches with WIP features that will
>> probably never go in if the branches become unmergeable. I expect that to
>> be true for many other committers and contributors.
>> - Some companies have Flink forks and are rebasing patches onto master
>> regularly. They will be completely screwed by a full reformat.
>> 
>> If we do something, the only thing that really is possible is:
>> 
>> (1) Define a style. Ideally not too far away from Flink's style.
>> (2) Apply it to new projects/modules
>> (3) Coordinate carefully to pull it into existing modules, module by
>> module. Leaving time to adopt pull requests bit by bit, and allowing forks
>> to go through minor merges, rather than total conflict.
>> 
>> 
>> 
>> On Wed, Mar 1, 2017 at 5:57 PM, Henry Saputra <henry.saputra@gmail.com>
>> wrote:
>> 
>>> It is actually Databricks Scala guide to help contributions to Apache Spark
>>> so not really official Spark Scala guide.
>>> The style guide feels bit more like asking people to write Scala in Java
>>> mode so I am -1 to follow the style for Apache Flink Scala if that what you
>>> are recommending.
>>> 
>>> If the "unification" means ONE style guide for both Java and Scala I would
>>> vote -1 to it because both languages have different semantics and styles to
>>> make them readable and understandable.
>>> 
>>> We could start with improving the Scala maven style guide to follow more
>>> Scala official style guide [1] and add IntelliJ Idea or Eclipse plugin
>>> style to follow suit.
>>> 
>>> Java side has bit more strict style check but we could make it tighter but
>>> embracing more Google Java guide closely with minor exceptions
>>> 
>>> - Henry
>>> 
>>> [1] http://docs.scala-lang.org/style/
>>> 
>>> On Mon, Feb 27, 2017 at 11:54 AM, Stavros Kontopoulos <
>>> st.kontopoulos@gmail.com> wrote:
>>> 
>>>> +1 to provide and enforcing a unified code style for both java and scala.
>>>> Unification should apply when it makes sense like comments though.
>>>> 
>>>> Eventually code base should be re-factored. I would vote for the one at a
>>>> time module fix apporoach.
>>>> Style guide should be part of any PR review.
>>>> 
>>>> We could also have a look at the spark style guide:
>>>> https://github.com/databricks/scala-style-guide
>>>> 
>>>> The style code and general guidelines help keep code more readable and
>>> keep
>>>> things simple
>>>> with many contributors and different styles of code writing + language
>>>> features.
>>>> 
>>>> 
>>>> On Mon, Feb 27, 2017 at 8:01 PM, Stephan Ewen <sewen@apache.org> wrote:
>>>> 
>>>>> I agree, reformatting 90% of the code base is tough.
>>>>> 
>>>>> There are two main issues:
>>>>> (1) Incompatible merges. This is hard, especially for the folks that
>>>> have
>>>>> to merge the pull requests ;-)
>>>>> 
>>>>> (2) Author history: This is less of an issue, I think. "git log
>>>>> <filename>" and "git show <revision> -- <filename>"
will still work and
>>>> one
>>>>> may have to go one commit back to find out why something was changed
>>>>> 
>>>>> 
>>>>> What I could image is to do this incrementally. Define the code style
>>> in
>>>>> "flink-parent" but do not activate it.
>>>>> Then start with some projects (new projects, plus some others):
>>>>> merge/reject PRs, reformat, activate code style.
>>>>> 
>>>>> Piece by piece. This is realistically going to take a long time until
>>> it
>>>> is
>>>>> pulled through all components, but that's okay, I guess.
>>>>> 
>>>>> Stephan
>>>>> 
>>>>> 
>>>>> On Mon, Feb 27, 2017 at 1:53 PM, Aljoscha Krettek <aljoscha@apache.org
>>>> 
>>>>> wrote:
>>>>> 
>>>>>> Just for a bit of context, this is the output of running cloc on
the
>>>>> Flink
>>>>>> codebase:
>>>>>> ------------------------------------------------------------
>>>>>> -----------------------
>>>>>> Language                         files          blank        comment
>>>>>>   code
>>>>>> ------------------------------------------------------------
>>>>>> -----------------------
>>>>>> Java                              4609         126825         185428
>>>>>> 519096
>>>>>> 
>>>>>> => 704,524 lines of code + comments/javadoc
>>>>>> 
>>>>>> When I apply the google style to the Flink code base using
>>>>>> https://github.com/google/google-java-format I get these commit
>>>>>> statistics:
>>>>>> 
>>>>>> 4577 files changed, 647645 insertions(+), 622663 deletions(-)
>>>>>> 
>>>>>> That is, a change to the Google Code Style would touch roughly over
>>> 90%
>>>>> of
>>>>>> all code/comment lines.
>>>>>> 
>>>>>> I would like to have a well defined code style, such as the Google
>>> Code
>>>>>> style, that has nice tooling and support but I don't think we will
>>> ever
>>>>>> convince enough people to do this kind of massive change. Even I
>>> think
>>>>> it's
>>>>>> a bit crazy to change 90% of the code base in one commit.
>>>>>> 
>>>>>> On Mon, 27 Feb 2017 at 11:10 Till Rohrmann <trohrmann@apache.org>
>>>> wrote:
>>>>>> 
>>>>>>> No, I think that's exactly what people mean when saying "losing
the
>>>>>> commit
>>>>>>> history". With the reformatting you would have to go manually
>>> through
>>>>> all
>>>>>>> past commits until you find the commit which changed a given
line
>>>>> before
>>>>>>> the reformatting.
>>>>>>> 
>>>>>>> Cheers,
>>>>>>> Till
>>>>>>> 
>>>>>>> On Sun, Feb 26, 2017 at 6:32 PM, Alexander Alexandrov <
>>>>>>> alexander.s.alexandrov@gmail.com> wrote:
>>>>>>> 
>>>>>>>> Just to clarify - by "losing the commit history" you actually
>>> mean
>>>>>>> "losing
>>>>>>>> the ability to annotate each line in a file with its last
>>> commit",
>>>>>> right?
>>>>>>>> 
>>>>>>>> Or is there some other sense in which something is lost after
>>>>> applying
>>>>>>> bulk
>>>>>>>> re-format?
>>>>>>>> 
>>>>>>>> Cheers,
>>>>>>>> A.
>>>>>>>> 
>>>>>>>> On Sat, Feb 25, 2017 at 7:10 AM Henry Saputra <
>>>>> henry.saputra@gmail.com
>>>>>>> 
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Just want to clarify what unify code style here.
>>>>>>>>> 
>>>>>>>>> Is the intention to have IDE and Maven plugins to have
the same
>>>>> check
>>>>>>>> style
>>>>>>>>> rules?
>>>>>>>>> 
>>>>>>>>> Or are we talking about having ONE code style for both
Java and
>>>>>> Scala?
>>>>>>>>> 
>>>>>>>>> - Henry
>>>>>>>>> 
>>>>>>>>> On Fri, Feb 24, 2017 at 8:08 AM, Greg Hogan <
>>> code@greghogan.com>
>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> I agree wholeheartedly with Ufuk. We cannot reformat
the
>>>>> codebase,
>>>>>>>> cannot
>>>>>>>>>> pause while flushing the PR queue, and won't find
a consensus
>>>>> code
>>>>>>>> style.
>>>>>>>>>> 
>>>>>>>>>> I think we can create a baseline code style for new
and
>>>> existing
>>>>>>>>>> contributors for which reformatting on changed files
will be
>>>>>>> acceptable
>>>>>>>>> for
>>>>>>>>>> PR reviews.
>>>>>>>>>> 
>>>>>>>>>> On Fri, Feb 24, 2017 at 5:01 AM, Dawid Wysakowicz
<
>>>>>>>>>> wysakowicz.dawid@gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>>> The problem with code style when it is not enforced
is that
>>>> it
>>>>>> will
>>>>>>>> be
>>>>>>>>> a
>>>>>>>>>>> matter of luck to what parts of files / new files
will it
>>> be
>>>>>>> applied.
>>>>>>>>>> When
>>>>>>>>>>> the code style is not applied to whole file,
it is pretty
>>>> much
>>>>>>>> useless
>>>>>>>>>>> anyway. You would need to manually select just
the
>>> fragments
>>>>> one
>>>>>> is
>>>>>>>>>>> changing. The benefits of having code style and
enforcing
>>> it
>>>> I
>>>>>> see
>>>>>>>> are:
>>>>>>>>>>> - being able to apply autoformatter, which speeds
up
>>> writing
>>>>>> code
>>>>>>>>>>> - it would make reviewing PRs easier as e.g.
there would
>>> be
>>>>> line
>>>>>>>>> length
>>>>>>>>>>> limit applied etc. which will make line breaking
more
>>> reader
>>>>>>>> friendly.
>>>>>>>>>>> 
>>>>>>>>>>> Though I think if a consensus is not reachable
it would be
>>>> good
>>>>>> to
>>>>>>>> once
>>>>>>>>>> and
>>>>>>>>>>> forever decide that we don't want a code style
and
>>>> checkstyle.
>>>>>>>>>>> 
>>>>>>>>>>> 2017-02-24 10:51 GMT+01:00 Ufuk Celebi <uce@apache.org>:
>>>>>>>>>>> 
>>>>>>>>>>>> On Fri, Feb 24, 2017 at 10:46 AM, Fabian
Hueske <
>>>>>>> fhueske@gmail.com
>>>>>>>>> 
>>>>>>>>>>> wrote:
>>>>>>>>>>>>> I agree with Till that encouraging a
code style without
>>>>>>> enforcing
>>>>>>>>> it
>>>>>>>>>>> does
>>>>>>>>>>>>> not make a lot of sense.
>>>>>>>>>>>>> If we enforce it, we need to touch all
files and PRs.
>>>>>>>>>>>> 
>>>>>>>>>>>> I think it makes sense for new contributors
to have a
>>>>> starting
>>>>>>>> point
>>>>>>>>>>>> without enforcing anything (I do agree that
we are past
>>> the
>>>>>> point
>>>>>>>> to
>>>>>>>>>>>> reach consensus on a style and enforcement
;-)).
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
> 


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message