flink-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ufuk Celebi <...@apache.org>
Subject Re: [DISCUSS] A more thorough Pull Request check list and template
Date Tue, 25 Jul 2017 07:13:37 GMT
Great! :-) If Fabian is also OK with trying it out, I would ask
Stephan to open a PR for this against Flink.


On Tue, Jul 25, 2017 at 8:50 AM, Chesnay Schepler <chesnay@apache.org> wrote:
> I'm still apprehensive about it, but don't mind trying it out. It's not like
> it can break something.
>
>
> On 24.07.2017 18:52, Ufuk Celebi wrote:
>>
>> What's the conclusion of last weeks discussion here?
>>
>> Fabian and Chesnay raised concerns about the introductory text. Are
>> you still concerned?
>>
>> On Wed, Jul 19, 2017 at 10:04 AM, Stephan Ewen <sewen@apache.org> wrote:
>>>
>>> @Chesnay:
>>>
>>> Put text into template => contributor will have to read it
>>> Put link to text into template => most contributors will ignore the link
>>>
>>> Yes, that is pretty much what my observation from the past is.
>>>
>>>
>>>
>>> On Tue, Jul 18, 2017 at 11:03 PM, Chesnay Schepler <chesnay@apache.org>
>>> wrote:
>>>
>>>> I'm sorry but i can't follow your logic.
>>>>
>>>> Put text into template => contributor will definitely read it
>>>> Put link to text into template => contributor will completely ignore the
>>>> link
>>>>
>>>> The advantage of the link is we don't duplicate the contribution guide
>>>> in
>>>> the docs and in the template.
>>>> Furthermore, you don't even need to remove something from the template,
>>>> since it's just a single line.
>>>>
>>>>
>>>> On 18.07.2017 19:25, Stephan Ewen wrote:
>>>>
>>>>> Concerning moving text to the contributors guide:
>>>>>
>>>>> I can only say it again: I believe the contribution guide is almost
>>>>> dead
>>>>> text. Very few people read it.
>>>>> Before the current template was introduced, new contributors rarely
>>>>> gave
>>>>> the pull request a name with Jira number. That is a good indicator
>>>>> about
>>>>> how many read this guide.
>>>>> Putting the test in the template is a way that every one reads it.
>>>>>
>>>>>
>>>>> I am also wondering what the concern is.
>>>>> A new contributor should clearly read through a bit of text, to learn
>>>>> what
>>>>> we look for in contributions.
>>>>> A recurring contributor will not have to read it again, simply remove
>>>>> the
>>>>> text from the pull request message and go on.
>>>>>
>>>>> Where is the disadvantage?
>>>>>
>>>>>
>>>>> On Tue, Jul 18, 2017 at 5:35 PM, Nico Kruber <nico@data-artisans.com>
>>>>> wrote:
>>>>>
>>>>> I like the new template but also agree with the text being too long and
>>>>>>
>>>>>> would
>>>>>> move the intro to the contributors guide with a link in the PR
>>>>>> template.
>>>>>>
>>>>>> Regarding the questions to fill out - I'd like the headings to be
>>>>>> short
>>>>>> and
>>>>>> have the affected components last so that documentation is not lost
>>>>>> (although
>>>>>> being more important than this checklist), e.g.:
>>>>>>
>>>>>> * Purpose of the change
>>>>>> * Brief change log
>>>>>> * Verifying the change
>>>>>> * Documentation
>>>>>> * Affected components
>>>>>>
>>>>>> The verification options in the original template look a bit too
large
>>>>>> but
>>>>>> it
>>>>>> stresses what tests should be added, especially for bigger changes.
>>>>>> Can't
>>>>>> think of a way to make it shorter though.
>>>>>>
>>>>>>
>>>>>> Nico
>>>>>>
>>>>>> On Tuesday, 18 July 2017 11:20:41 CEST Chesnay Schepler wrote:
>>>>>>
>>>>>>> I fully agree with Fabian.
>>>>>>>
>>>>>>> Multiple-choice questions provide little value to the reviewer,
since
>>>>>>> the
>>>>>>> validity has to be verified in any case. While text answers have
to
>>>>>>> be
>>>>>>> validated as well,
>>>>>>> they give some hint to the reviewer as to how it can be verified
and
>>>>>>> which steps the
>>>>>>> contributor did to do so.
>>>>>>>
>>>>>>> I also agree that it is too long; IMO this is really intimidating
to
>>>>>>> new
>>>>>>> contributors to be greeted with this.
>>>>>>>
>>>>>>> Ideally we only link to the contributors guide and ask 3 questions:
>>>>>>>
>>>>>>>     * What is the problem?
>>>>>>>     * How was it fixed?
>>>>>>>     * How can the fix be verified?
>>>>>>>
>>>>>>> On 18.07.2017 10:47, Fabian Hueske wrote:
>>>>>>>
>>>>>>>> I like the sections about purpose, change log, and verification
of
>>>>>>>> the
>>>>>>>> changes.
>>>>>>>>
>>>>>>>> However, I think the proposed template is too much text.
This is
>>>>>>>>
>>>>>>> probably
>>>>>>> the reason why the first attempt to establish a PR template failed.
>>>>>>>>
>>>>>>>> I would move most of the introduction and explanations incl.
>>>>>>>> examples
>>>>>>>>
>>>>>>> to
>>>>>>> the "Contribution Guidelines" and only pass a link.
>>>>>>>>
>>>>>>>> IMO, the template should be rather shorter than the current
one and
>>>>>>>>
>>>>>>> only
>>>>>>> have the link, the sections to fill out, and checkboxes.
>>>>>>>>
>>>>>>>> I'm also not sure how much the detailed questions will help.
>>>>>>>> For example even if the question about changed dependencies
is
>>>>>>>> answered
>>>>>>>> with "no", the reviewer still has to check that.
>>>>>>>>
>>>>>>>> I think the questions of the current template work differently.
>>>>>>>> A question "Does the PR include tests?" suggests to the contributor
>>>>>>>>
>>>>>>> that
>>>>>>> those should be included. Same for documentation.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Fabian
>>>>>>>>
>>>>>>>> 2017-07-18 10:05 GMT+02:00 Tzu-Li (Gordon) Tai
>>>>>>>> <tzulitai@apache.org>:
>>>>>>>>
>>>>>>>>> +1, I like this a lot.
>>>>>>>>> With the previous template, it doesn’t really resonate
with what we
>>>>>>>>> should
>>>>>>>>> care about, and therefore most of the time I think contributors
>>>>>>>>> just
>>>>>>>>> delete
>>>>>>>>> that template and write down something on their own.
>>>>>>>>>
>>>>>>>>> I would also like to add: “Savepoint / checkpoint binary
formats”
>>>>>>>>> to
>>>>>>>>>
>>>>>>>> the
>>>>>>>
>>>>>>> potential affect scope check list.
>>>>>>>>>
>>>>>>>>> I think changes to those deserves an independent yes
/ no check of
>>>>>>>>> its
>>>>>>>>> own.
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Gordon
>>>>>>>>>
>>>>>>>>> On 18 July 2017 at 3:49:42 PM, Ufuk Celebi (uce@apache.org)
wrote:
>>>>>>>>>
>>>>>>>>> I really like this and vote to change our current template.
>>>>>>>>>
>>>>>>>>> The simple yes/no/... options are a really good idea.
I would add
>>>>>>>>> to
>>>>>>>>> your email that the questions will equally help reviewers
to
>>>>>>>>> remember
>>>>>>>>> to look at these things, which is just as important.
>>>>>>>>>
>>>>>>>>> When we merge this, we should make sure to strictly follow
the
>>>>>>>>> guide.
>>>>>>>>> Ideally, in the long term we can even automate some of
the
>>>>>>>>> yes/no/...
>>>>>>>>> questions via a bot... but let's not get ahead of ourselves
here
>>>>>>>>> ;-)
>>>>>>>>>
>>>>>>>>> On Mon, Jul 17, 2017 at 6:36 PM, Stephan Ewen <sewen@apache.org>
>>>>>>>>>
>>>>>>>> wrote:
>>>>>>>
>>>>>>> Hi all!
>>>>>>>>>>
>>>>>>>>>> I have reflected a bit on the pull requests and on
some of the
>>>>>>>>>> recent
>>>>>>>>>> changes to Flink and some of the introduced bugs
/ regressions
>>>>>>>>>> that
>>>>>>>>>>
>>>>>>>>> we
>>>>>>>
>>>>>>> have
>>>>>>>>>
>>>>>>>>> fixed.
>>>>>>>>>>
>>>>>>>>>> One thing that I think would have helped is to have
more explicit
>>>>>>>>>> information about what the pull request does and
how the
>>>>>>>>>> contributor
>>>>>>>>>>
>>>>>>>>> would
>>>>>>>>>
>>>>>>>>> suggest to verify it. I have seen this when contributing
to some
>>>>>>>>> other
>>>>>>>
>>>>>>> project and really liked the approach.
>>>>>>>>>>
>>>>>>>>>> It requires that a contributor takes a minute to
reflect on what
>>>>>>>>>> was
>>>>>>>>>> touched, and what would be ways to verify that the
changes work
>>>>>>>>>> properly.
>>>>>>>>>> Besides being a help to the reviewer, it also makes
contributors
>>>>>>>>>>
>>>>>>>>> aware
>>>>>>>
>>>>>>> of
>>>>>>>>>>
>>>>>>>>>> what is important during the review process.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I suggest a new pull request template, as attached
below, with a
>>>>>>>>>>
>>>>>>>>> preview
>>>>>>>
>>>>>>> here:
>>>>>>>>>>
>>>>>>>>>> https://github.com/StephanEwen/incubator-flink/
>>>>>>>>>>
>>>>>>>>> blob/pr_template/.github/PULL_REQUEST_TEMPLATE.md
>>>>>>>>>
>>>>>>>>> Don't be scared, it looks long, but a big part is the
introductory
>>>>>>>>> text
>>>>>>>
>>>>>>> (only relevant for new contributors) and the examples contents
for
>>>>>>>>>
>>>>>>>>> the
>>>>>>>
>>>>>>> description.
>>>>>>>>>>
>>>>>>>>>> Filling this out for code that is in shape should
be a quick
>>>>>>>>>> thing:
>>>>>>>>>>
>>>>>>>>> Remove
>>>>>>>>>
>>>>>>>>> the into and checklist, write a few sentences on what
the PR does
>>>>>>>>> (one
>>>>>>>
>>>>>>> should do that anyways) and then pick some yes/no in the
>>>>>>>>>
>>>>>>>>> classification
>>>>>>>
>>>>>>> section.
>>>>>>>>>>
>>>>>>>>>> Curious to hear what you think!
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Stephan
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ============================
>>>>>>>>>>
>>>>>>>>>> Full suggested pull request template:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> *Thank you very much for contributing to Apache Flink
- we are
>>>>>>>>>> happy
>>>>>>>>>> that
>>>>>>>>>> you want to help us improve Flink. To help the community
review
>>>>>>>>>> you
>>>>>>>>>> contribution in the best possible way, please go
through the
>>>>>>>>>>
>>>>>>>>> checklist
>>>>>>>
>>>>>>> below, which will get the contribution into a shape in which
it can
>>>>>>>>>
>>>>>>>>> be
>>>>>>>
>>>>>>> best
>>>>>>>>>
>>>>>>>>> reviewed.*
>>>>>>>>>>
>>>>>>>>>> *Please understand that we do not do this to make
contributions to
>>>>>>>>>>
>>>>>>>>> Flink
>>>>>>>
>>>>>>> a
>>>>>>>>>
>>>>>>>>> hassle. In order to uphold a high standard of quality
for code
>>>>>>>>>>
>>>>>>>>>> contributions, while at the same time managing a
large number of
>>>>>>>>>> contributions, we need contributors to prepare the
contributions
>>>>>>>>>>
>>>>>>>>> well,
>>>>>>>
>>>>>>> and
>>>>>>>>>
>>>>>>>>> give reviewers enough contextual information for the
review. Please
>>>>>>>>> also
>>>>>>>
>>>>>>> understand that contributions that do not follow this guide will
take
>>>>>>>>>>
>>>>>>>>>> longer to review and thus typically be picked up
with lower
>>>>>>>>>> priority
>>>>>>>>>>
>>>>>>>>> by
>>>>>>>
>>>>>>> the
>>>>>>>>>
>>>>>>>>> community.*
>>>>>>>>>>
>>>>>>>>>> ## Contribution Checklist
>>>>>>>>>>
>>>>>>>>>> - Make sure that the pull request corresponds to
a [JIRA issue](
>>>>>>>>>> https://issues.apache.org/jira/projects/FLINK/issues).
Exceptions
>>>>>>>>>>
>>>>>>>>> are
>>>>>>>
>>>>>>> made
>>>>>>>>>
>>>>>>>>> for typos in JavaDoc or documentation files, which need
no JIRA
>>>>>>>>> issue.
>>>>>>>
>>>>>>> - Name the pull request in the form "[FLINK-1234] [component]
Title
>>>>>>>>>
>>>>>>>>> of
>>>>>>>
>>>>>>> the pull request", where *FLINK-1234* should be replaced by the
>>>>>>>>>
>>>>>>>>> actual
>>>>>>>
>>>>>>> issue number. Skip *component* if you are unsure about which
is the
>>>>>>>>>
>>>>>>>>> best
>>>>>>>
>>>>>>> component.
>>>>>>>>>>
>>>>>>>>>> Typo fixes that have no associated JIRA issue should
be named
>>>>>>>>>>
>>>>>>>>> following
>>>>>>>
>>>>>>> this pattern: `[hotfix] [docs] Fix typo in event time introduction`
>>>>>>>>>
>>>>>>>>> or
>>>>>>>
>>>>>>> `[hotfix] [javadocs] Expand JavaDoc for PuncuatedWatermarkGenerator`.
>>>>>>>>>>
>>>>>>>>>> - Fill out the template below to describe the changes
contributed
>>>>>>>>>> by
>>>>>>>>>>
>>>>>>>>> the
>>>>>>>
>>>>>>> pull request. That will give reviewers the context they need
to do
>>>>>>>>>
>>>>>>>>> the
>>>>>>>
>>>>>>> review.
>>>>>>>>>>
>>>>>>>>>> - Make sure that the change passes the automated
tests, i.e., `mvn
>>>>>>>>>>
>>>>>>>>> clean
>>>>>>>
>>>>>>> verify`
>>>>>>>>>>
>>>>>>>>>> - Each pull request should address only one issue,
not mix up code
>>>>>>>>>>
>>>>>>>>> from
>>>>>>>
>>>>>>> multiple issues.
>>>>>>>>>>
>>>>>>>>>> - Each commit in the pull request has a meaningful
commit message
>>>>>>>>>> (including the JIRA id)
>>>>>>>>>>
>>>>>>>>>> - Once all items of the checklist are addressed,
remove the above
>>>>>>>>>>
>>>>>>>>> text
>>>>>>>
>>>>>>> and this checklist, leaving only the filled out template below.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> **(The sections below can be removed for hotfixes
of typos)**
>>>>>>>>>>
>>>>>>>>>> ## What is the purpose of the change
>>>>>>>>>>
>>>>>>>>>> *(For example: This pull request makes task deployment
go through
>>>>>>>>>> the
>>>>>>>>>>
>>>>>>>>> blob
>>>>>>>>>
>>>>>>>>> server, rather than through RPC. That way we avoid re-transferring
>>>>>>>>> them
>>>>>>>
>>>>>>> on
>>>>>>>>>
>>>>>>>>> each deployment (during recovery).)*
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ## Brief change log
>>>>>>>>>>
>>>>>>>>>> *(for example:)*
>>>>>>>>>> - *The TaskInfo is stored in the blob store on job
creation time
>>>>>>>>>> as a
>>>>>>>>>> persistent artifact*
>>>>>>>>>> - *Deployments RPC transmits only the blob storage
reference*
>>>>>>>>>> - *TaskManagers retrieve the TaskInfo from the blob
cache*
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> ## Verifying this change
>>>>>>>>>>
>>>>>>>>>> *(Please pick either of the following options)*
>>>>>>>>>>
>>>>>>>>>> This change is a trivial rework / code cleanup without
any test
>>>>>>>>>> coverage.
>>>>>>>>>>
>>>>>>>>>> *(or)*
>>>>>>>>>>
>>>>>>>>>> This change is already covered by existing tests,
such as *(please
>>>>>>>>>>
>>>>>>>>> describe
>>>>>>>>>
>>>>>>>>> tests)*.
>>>>>>>>>>
>>>>>>>>>> *(or)*
>>>>>>>>>>
>>>>>>>>>> This change added tests and can be verified as follows:
>>>>>>>>>>
>>>>>>>>>> *(example:)*
>>>>>>>>>> - *Added integration tests for end-to-end deployment
with large
>>>>>>>>>>
>>>>>>>>> payloads
>>>>>>>
>>>>>>> (100MB)*
>>>>>>>>>>
>>>>>>>>>> - *Extended integration test for recovery after master
>>>>>>>>>> (JobManager)
>>>>>>>>>> failure*
>>>>>>>>>> - *Added test that validates that TaskInfo is transferred
only
>>>>>>>>>> once
>>>>>>>>>> across recoveries*
>>>>>>>>>> - *Manually verified the change by running a 4 node
cluser with 2
>>>>>>>>>> JobManagers and 4 TaskManagers, a stateful streaming
program, and
>>>>>>>>>> killing
>>>>>>>>>> one JobManager and to TaskManagers during the execution,
verifying
>>>>>>>>>>
>>>>>>>>> that
>>>>>>>
>>>>>>> recovery happens correctly.*
>>>>>>>>>>
>>>>>>>>>> ## Does this pull request potentially affect one
of the following
>>>>>>>>>>
>>>>>>>>> parts:
>>>>>>>
>>>>>>> - Dependencies (does it add or upgrade a dependency): **(yes
/ no)**
>>>>>>>>>>
>>>>>>>>>> - The public API, i.e., is any changed class annotated
with
>>>>>>>>>> `@Public(Evolving)`: **(yes / no)**
>>>>>>>>>> - The serializers: **(yes / no / don't know)**
>>>>>>>>>> - The runtime per-record code paths (performance
sensitive):
>>>>>>>>>> **(yes
>>>>>>>>>>
>>>>>>>>> / no
>>>>>>>
>>>>>>> / don't know)**
>>>>>>>>>>
>>>>>>>>>> - Anything that affects deployment or recovery: JobManager
(and
>>>>>>>>>> its
>>>>>>>>>> components), Checkpointing, Yarn/Mesos, ZooKeeper:
**(yes / no /
>>>>>>>>>>
>>>>>>>>> don't
>>>>>>>
>>>>>>> know)**:
>>>>>>>>>>
>>>>>>>>>> ## Documentation
>>>>>>>>>>
>>>>>>>>>> - Does this pull request introduce a new feature?
**(yes / no)**
>>>>>>>>>> - If yes, how is the feature documented? **(not applicable
/ docs
>>>>>>>>>> /
>>>>>>>>>> JavaDocs / not documented)**
>>>>>>>>>>
>

Mime
View raw message