flink-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stephan Ewen <se...@apache.org>
Subject Re: [DISCUSS] A more thorough Pull Request check list and template
Date Wed, 19 Jul 2017 08:04:59 GMT
@Chesnay:

Put text into template => contributor will have to read it
Put link to text into template => most contributors will ignore the link

Yes, that is pretty much what my observation from the past is.



On Tue, Jul 18, 2017 at 11:03 PM, Chesnay Schepler <chesnay@apache.org>
wrote:

> I'm sorry but i can't follow your logic.
>
> Put text into template => contributor will definitely read it
> Put link to text into template => contributor will completely ignore the
> link
>
> The advantage of the link is we don't duplicate the contribution guide in
> the docs and in the template.
> Furthermore, you don't even need to remove something from the template,
> since it's just a single line.
>
>
> On 18.07.2017 19:25, Stephan Ewen wrote:
>
>> Concerning moving text to the contributors guide:
>>
>> I can only say it again: I believe the contribution guide is almost dead
>> text. Very few people read it.
>> Before the current template was introduced, new contributors rarely gave
>> the pull request a name with Jira number. That is a good indicator about
>> how many read this guide.
>> Putting the test in the template is a way that every one reads it.
>>
>>
>> I am also wondering what the concern is.
>> A new contributor should clearly read through a bit of text, to learn what
>> we look for in contributions.
>> A recurring contributor will not have to read it again, simply remove the
>> text from the pull request message and go on.
>>
>> Where is the disadvantage?
>>
>>
>> On Tue, Jul 18, 2017 at 5:35 PM, Nico Kruber <nico@data-artisans.com>
>> wrote:
>>
>> I like the new template but also agree with the text being too long and
>>> would
>>> move the intro to the contributors guide with a link in the PR template.
>>>
>>> Regarding the questions to fill out - I'd like the headings to be short
>>> and
>>> have the affected components last so that documentation is not lost
>>> (although
>>> being more important than this checklist), e.g.:
>>>
>>> * Purpose of the change
>>> * Brief change log
>>> * Verifying the change
>>> * Documentation
>>> * Affected components
>>>
>>> The verification options in the original template look a bit too large
>>> but
>>> it
>>> stresses what tests should be added, especially for bigger changes. Can't
>>> think of a way to make it shorter though.
>>>
>>>
>>> Nico
>>>
>>> On Tuesday, 18 July 2017 11:20:41 CEST Chesnay Schepler wrote:
>>>
>>>> I fully agree with Fabian.
>>>>
>>>> Multiple-choice questions provide little value to the reviewer, since
>>>> the
>>>> validity has to be verified in any case. While text answers have to be
>>>> validated as well,
>>>> they give some hint to the reviewer as to how it can be verified and
>>>> which steps the
>>>> contributor did to do so.
>>>>
>>>> I also agree that it is too long; IMO this is really intimidating to new
>>>> contributors to be greeted with this.
>>>>
>>>> Ideally we only link to the contributors guide and ask 3 questions:
>>>>
>>>>    * What is the problem?
>>>>    * How was it fixed?
>>>>    * How can the fix be verified?
>>>>
>>>> On 18.07.2017 10:47, Fabian Hueske wrote:
>>>>
>>>>> I like the sections about purpose, change log, and verification of the
>>>>> changes.
>>>>>
>>>>> However, I think the proposed template is too much text. This is
>>>>>
>>>> probably
>>>
>>>> the reason why the first attempt to establish a PR template failed.
>>>>> I would move most of the introduction and explanations incl. examples
>>>>>
>>>> to
>>>
>>>> the "Contribution Guidelines" and only pass a link.
>>>>> IMO, the template should be rather shorter than the current one and
>>>>>
>>>> only
>>>
>>>> have the link, the sections to fill out, and checkboxes.
>>>>>
>>>>> I'm also not sure how much the detailed questions will help.
>>>>> For example even if the question about changed dependencies is answered
>>>>> with "no", the reviewer still has to check that.
>>>>>
>>>>> I think the questions of the current template work differently.
>>>>> A question "Does the PR include tests?" suggests to the contributor
>>>>>
>>>> that
>>>
>>>> those should be included. Same for documentation.
>>>>>
>>>>> Cheers,
>>>>> Fabian
>>>>>
>>>>> 2017-07-18 10:05 GMT+02:00 Tzu-Li (Gordon) Tai <tzulitai@apache.org>:
>>>>>
>>>>>> +1, I like this a lot.
>>>>>> With the previous template, it doesn’t really resonate with what
we
>>>>>> should
>>>>>> care about, and therefore most of the time I think contributors just
>>>>>> delete
>>>>>> that template and write down something on their own.
>>>>>>
>>>>>> I would also like to add: “Savepoint / checkpoint binary formats”
to
>>>>>>
>>>>> the
>>>
>>>> potential affect scope check list.
>>>>>> I think changes to those deserves an independent yes / no check of
its
>>>>>> own.
>>>>>>
>>>>>> Cheers,
>>>>>> Gordon
>>>>>>
>>>>>> On 18 July 2017 at 3:49:42 PM, Ufuk Celebi (uce@apache.org) wrote:
>>>>>>
>>>>>> I really like this and vote to change our current template.
>>>>>>
>>>>>> The simple yes/no/... options are a really good idea. I would add
to
>>>>>> your email that the questions will equally help reviewers to remember
>>>>>> to look at these things, which is just as important.
>>>>>>
>>>>>> When we merge this, we should make sure to strictly follow the guide.
>>>>>> Ideally, in the long term we can even automate some of the yes/no/...
>>>>>> questions via a bot... but let's not get ahead of ourselves here
;-)
>>>>>>
>>>>>> On Mon, Jul 17, 2017 at 6:36 PM, Stephan Ewen <sewen@apache.org>
>>>>>>
>>>>> wrote:
>>>
>>>> Hi all!
>>>>>>>
>>>>>>> I have reflected a bit on the pull requests and on some of the
recent
>>>>>>> changes to Flink and some of the introduced bugs / regressions
that
>>>>>>>
>>>>>> we
>>>
>>>> have
>>>>>>
>>>>>> fixed.
>>>>>>>
>>>>>>> One thing that I think would have helped is to have more explicit
>>>>>>> information about what the pull request does and how the contributor
>>>>>>>
>>>>>> would
>>>>>>
>>>>>> suggest to verify it. I have seen this when contributing to some
>>>>>>>
>>>>>> other
>>>
>>>> project and really liked the approach.
>>>>>>>
>>>>>>> It requires that a contributor takes a minute to reflect on what
was
>>>>>>> touched, and what would be ways to verify that the changes work
>>>>>>> properly.
>>>>>>> Besides being a help to the reviewer, it also makes contributors
>>>>>>>
>>>>>> aware
>>>
>>>> of
>>>>>>> what is important during the review process.
>>>>>>>
>>>>>>>
>>>>>>> I suggest a new pull request template, as attached below, with
a
>>>>>>>
>>>>>> preview
>>>
>>>> here:
>>>>>>> https://github.com/StephanEwen/incubator-flink/
>>>>>>>
>>>>>> blob/pr_template/.github/PULL_REQUEST_TEMPLATE.md
>>>>>>
>>>>>> Don't be scared, it looks long, but a big part is the introductory
>>>>>>>
>>>>>> text
>>>
>>>> (only relevant for new contributors) and the examples contents for
>>>>>>>
>>>>>> the
>>>
>>>> description.
>>>>>>> Filling this out for code that is in shape should be a quick
thing:
>>>>>>>
>>>>>> Remove
>>>>>>
>>>>>> the into and checklist, write a few sentences on what the PR does
>>>>>>>
>>>>>> (one
>>>
>>>> should do that anyways) and then pick some yes/no in the
>>>>>>>
>>>>>> classification
>>>
>>>> section.
>>>>>>>
>>>>>>> Curious to hear what you think!
>>>>>>>
>>>>>>> Best,
>>>>>>> Stephan
>>>>>>>
>>>>>>>
>>>>>>> ============================
>>>>>>>
>>>>>>> Full suggested pull request template:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> *Thank you very much for contributing to Apache Flink - we are
happy
>>>>>>> that
>>>>>>> you want to help us improve Flink. To help the community review
you
>>>>>>> contribution in the best possible way, please go through the
>>>>>>>
>>>>>> checklist
>>>
>>>> below, which will get the contribution into a shape in which it can
>>>>>>>
>>>>>> be
>>>
>>>> best
>>>>>>
>>>>>> reviewed.*
>>>>>>>
>>>>>>> *Please understand that we do not do this to make contributions
to
>>>>>>>
>>>>>> Flink
>>>
>>>> a
>>>>>>
>>>>>> hassle. In order to uphold a high standard of quality for code
>>>>>>> contributions, while at the same time managing a large number
of
>>>>>>> contributions, we need contributors to prepare the contributions
>>>>>>>
>>>>>> well,
>>>
>>>> and
>>>>>>
>>>>>> give reviewers enough contextual information for the review. Please
>>>>>>>
>>>>>> also
>>>
>>>> understand that contributions that do not follow this guide will take
>>>>>>> longer to review and thus typically be picked up with lower priority
>>>>>>>
>>>>>> by
>>>
>>>> the
>>>>>>
>>>>>> community.*
>>>>>>>
>>>>>>> ## Contribution Checklist
>>>>>>>
>>>>>>> - Make sure that the pull request corresponds to a [JIRA issue](
>>>>>>> https://issues.apache.org/jira/projects/FLINK/issues). Exceptions
>>>>>>>
>>>>>> are
>>>
>>>> made
>>>>>>
>>>>>> for typos in JavaDoc or documentation files, which need no JIRA
>>>>>>>
>>>>>> issue.
>>>
>>>> - Name the pull request in the form "[FLINK-1234] [component] Title
>>>>>>>
>>>>>> of
>>>
>>>> the pull request", where *FLINK-1234* should be replaced by the
>>>>>>>
>>>>>> actual
>>>
>>>> issue number. Skip *component* if you are unsure about which is the
>>>>>>>
>>>>>> best
>>>
>>>> component.
>>>>>>> Typo fixes that have no associated JIRA issue should be named
>>>>>>>
>>>>>> following
>>>
>>>> this pattern: `[hotfix] [docs] Fix typo in event time introduction`
>>>>>>>
>>>>>> or
>>>
>>>> `[hotfix] [javadocs] Expand JavaDoc for PuncuatedWatermarkGenerator`.
>>>>>>>
>>>>>>> - Fill out the template below to describe the changes contributed
by
>>>>>>>
>>>>>> the
>>>
>>>> pull request. That will give reviewers the context they need to do
>>>>>>>
>>>>>> the
>>>
>>>> review.
>>>>>>>
>>>>>>> - Make sure that the change passes the automated tests, i.e.,
`mvn
>>>>>>>
>>>>>> clean
>>>
>>>> verify`
>>>>>>>
>>>>>>> - Each pull request should address only one issue, not mix up
code
>>>>>>>
>>>>>> from
>>>
>>>> multiple issues.
>>>>>>>
>>>>>>> - Each commit in the pull request has a meaningful commit message
>>>>>>> (including the JIRA id)
>>>>>>>
>>>>>>> - Once all items of the checklist are addressed, remove the above
>>>>>>>
>>>>>> text
>>>
>>>> and this checklist, leaving only the filled out template below.
>>>>>>>
>>>>>>>
>>>>>>> **(The sections below can be removed for hotfixes of typos)**
>>>>>>>
>>>>>>> ## What is the purpose of the change
>>>>>>>
>>>>>>> *(For example: This pull request makes task deployment go through
the
>>>>>>>
>>>>>> blob
>>>>>>
>>>>>> server, rather than through RPC. That way we avoid re-transferring
>>>>>>>
>>>>>> them
>>>
>>>> on
>>>>>>
>>>>>> each deployment (during recovery).)*
>>>>>>>
>>>>>>>
>>>>>>> ## Brief change log
>>>>>>>
>>>>>>> *(for example:)*
>>>>>>> - *The TaskInfo is stored in the blob store on job creation time
as a
>>>>>>> persistent artifact*
>>>>>>> - *Deployments RPC transmits only the blob storage reference*
>>>>>>> - *TaskManagers retrieve the TaskInfo from the blob cache*
>>>>>>>
>>>>>>>
>>>>>>> ## Verifying this change
>>>>>>>
>>>>>>> *(Please pick either of the following options)*
>>>>>>>
>>>>>>> This change is a trivial rework / code cleanup without any test
>>>>>>> coverage.
>>>>>>>
>>>>>>> *(or)*
>>>>>>>
>>>>>>> This change is already covered by existing tests, such as *(please
>>>>>>>
>>>>>> describe
>>>>>>
>>>>>> tests)*.
>>>>>>>
>>>>>>> *(or)*
>>>>>>>
>>>>>>> This change added tests and can be verified as follows:
>>>>>>>
>>>>>>> *(example:)*
>>>>>>> - *Added integration tests for end-to-end deployment with large
>>>>>>>
>>>>>> payloads
>>>
>>>> (100MB)*
>>>>>>> - *Extended integration test for recovery after master (JobManager)
>>>>>>> failure*
>>>>>>> - *Added test that validates that TaskInfo is transferred only
once
>>>>>>> across recoveries*
>>>>>>> - *Manually verified the change by running a 4 node cluser with
2
>>>>>>> JobManagers and 4 TaskManagers, a stateful streaming program,
and
>>>>>>> killing
>>>>>>> one JobManager and to TaskManagers during the execution, verifying
>>>>>>>
>>>>>> that
>>>
>>>> recovery happens correctly.*
>>>>>>>
>>>>>>> ## Does this pull request potentially affect one of the following
>>>>>>>
>>>>>> parts:
>>>
>>>> - Dependencies (does it add or upgrade a dependency): **(yes / no)**
>>>>>>> - The public API, i.e., is any changed class annotated with
>>>>>>> `@Public(Evolving)`: **(yes / no)**
>>>>>>> - The serializers: **(yes / no / don't know)**
>>>>>>> - The runtime per-record code paths (performance sensitive):
**(yes
>>>>>>>
>>>>>> / no
>>>
>>>> / don't know)**
>>>>>>> - Anything that affects deployment or recovery: JobManager (and
its
>>>>>>> components), Checkpointing, Yarn/Mesos, ZooKeeper: **(yes / no
/
>>>>>>>
>>>>>> don't
>>>
>>>> know)**:
>>>>>>>
>>>>>>> ## Documentation
>>>>>>>
>>>>>>> - Does this pull request introduce a new feature? **(yes / no)**
>>>>>>> - If yes, how is the feature documented? **(not applicable /
docs /
>>>>>>> JavaDocs / not documented)**
>>>>>>>
>>>>>>
>>>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message