mxnet-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Skalicky, Sam" <sska...@amazon.com.INVALID>
Subject Re: MXNet Bot Demo
Date Mon, 16 Mar 2020 17:29:09 GMT
We probably need some way to track which CI runs ran for which commit too, that way we can
ensure that all CI runs ran on the commit that will be merged.  Maybe the bot can comment
with the commit hash when users command it to do something. Although since users can trigger
individual CI runs its possible to have some commits run some CI runs but not others. We need
some way to easily verify that the CI has executed all runs on the commit that will be merged.

Sam

> On Mar 13, 2020, at 8:28 PM, Przemysław Trędak <ptrendx@apache.org> wrote:
> 
> CAUTION: This email originated from outside of the organization. Do not click links or
open attachments unless you can confirm the sender and know the content is safe.
> 
> 
> 
> I personally like the idea of opt-in more than opt-out:
> - ultimately PR author wants the PR to be merged so they (or committer reviewing the
PR) will trigger the CI
> - if it is easy to trigger the PR via the bot command then the amount of work per PR
should be less than with opt-out (since most of the commits should then be marked as [skip
ci] or something similar
> 
> +1 to the bot making a comment on each new PR with its commands (and also explaining,
or at least giving links to the general PR process so new PR authors are not lost). Maybe
we could make the bot recognize if the PR author is new or existing contributor and offer
advice based on that?
> 
> Thanks
> Przemek
> 
> On 2020/03/13 22:06:58, Marco de Abreu <marco.g.abreu@gmail.com> wrote:
>> Hi,
>> 
>> since it's no longer necessary to push a new commit to trigger CI, it will
>> already reduce the costs. But to me, requiring an action to enable CI after
>> a PR has been created initially, is a no go. User can opt out of CI, but
>> the default has to be CI being triggered automatically for every commit
>> unless specifically disabled by a participant. I'm also fine with
>> triggering certain additional jobs (think about running a nightly job upon
>> request for a PR) to require a manual step, but the PR validation pipelines
>> have to run automatically. Every check that is marked as "Required" in
>> GitHub has to be automatically kicked off.
>> 
>> -Marco
>> 
>> On Fri, Mar 13, 2020 at 9:50 PM Chaitanya Bapat <chai.bapat@gmail.com>
>> wrote:
>> 
>>> Firstly,
>>> Sorry I missed out on attaching the mail thread that was sent on 12th
>>> February for notifying the community of the upcoming changes to the MXNet
>>> CI
>>> For reference :
>>> 
>>> https://lists.apache.org/thread.html/r09a6ab2803a996fc80e00fe39ed312fa4865e8805e08df847f1addad%40%3Cdev.mxnet.apache.org%3E
>>> 
>>> Now to the questions,
>>>> Is it possible for re-triggering a single job to be abused?
>>> @Tao In the case when a user re-triggers a single job multiple times, that
>>> will be visible in the PR conversation thread. A committer, even after he
>>> has approved the PR before, generally takes a look at the final state of
>>> the PR before merging. Would it be fair to assume the committer could take
>>> the multiple re-trigger of a single job into account before merging? The
>>> committer then has the option to invoke `@mxnet-bot run ci [all] ` to
>>> trigger the entire build pipeline one last to counter the abuse. This is
>>> aligned with what @Leonard said.
>>> 
>>> @Sandeep Thanks a lot for collecting and sharing valuable data. I'd concur
>>> with the opinion that given the existing things committers & PR Authors
>>> take care of, invoking CI shouldn't be that big of an additional burden.
>>> 
>>> @Marco With the opt-out, the onus remains on the PR Author. It doesn't help
>>> reduce the resource usage. Hence, it was suggested to switch to
>>> opt-in. @Leo's suggestion for proactive commenting on the part of bot makes
>>> sense and is doable.
>>> 
>>> Default : opt-out and User initiated opt-in (with addressing Leo's fix for
>>> the usability issue you correctly pointed out )
>>> @Marco How does this sound to you?
>>> 
>>> Again, thank you all for chiming in and voicing your opinion. Appreciate
>>> it.
>>> We can take ahead these discussions in today's demo meeting. [Design Doc
>>> <https://cwiki.apache.org/confluence/display/MXNET/MXNet+CI+Bot>] [Demo
>>> Video <https://www.youtube.com/watch?v=gfOGwZId8aU>]
>>> 
>>> Thanks,
>>> Chai
>>> 
>>> On Fri, 13 Mar 2020 at 12:34, Marco de Abreu <marco.g.abreu@gmail.com>
>>> wrote:
>>> 
>>>> I'd recommend that the bot makes an initial comment when a PR gets opened
>>>> and informs the users of its commands. It then tells the user the commend
>>>> to opt out of CI.
>>>> 
>>>> -Marco
>>>> 
>>>> Lausen, Leonard <lausen@amazon.com.invalid> schrieb am Fr., 13. März
>>> 2020,
>>>> 20:27:
>>>> 
>>>>> On opt-out: People may be unaware of opt-out would not use it. There
is
>>>> no
>>>>> incentive to use opt-out, as the PR author doesn't pay any money for
CI
>>>>> run.
>>>>> 
>>>>> I agree with Marco though that opt-in alone may cause usability issues,
>>>> as
>>>>> contributors may not be aware of how to trigger the CI.
>>>>> One solution is that the bot proactively comments on the PR and reminds
>>>> the
>>>>> author to trigger running CI once the author deems the PR ready.
>>>>> 
>>>>> But even if we choose opt-out, the bot will still add a lot of value,
>>> as
>>>> PR
>>>>> authors can retrigger single jobs that have failed due to flakiness.
>>>>> 
>>>>>> Is it possible for re-triggering a single job to be abused? For
>>>> example,
>>>>>> the author spends two days re-triggering a flaky job to make it pass.
>>>>> 
>>>>> Yes, this is possible. I suggest the committer who likes to merge a PR
>>>>> needs to
>>>>> make a good judgement here if a PR is abusing the feature, and if so,
>>>>> retrigger
>>>>> all CI runs.
>>>>> 
>>>>> Best regards
>>>>> Leonard
>>>>> 
>>>>> On Fri, 2020-03-13 at 08:07 +0100, Marco de Abreu wrote:
>>>>>> Thanks for the data Sandeep. In these cases it sounds like it would
>>>> have
>>>>>> rather been better when people explicitly turned off CI in that case.
>>>>>> What's the argument against an opt-out instead of an opt-in?
>>>>>> 
>>>>>> My intention is that I consider it quite cumbersome to make it a
>>>>> *required*
>>>>>> step to always trigger CI manually, even if just submitting a small
>>> PR.
>>>>> I'd
>>>>>> rather see people explicitly turning off CI if they wouldn't like
to
>>>> use
>>>>> it
>>>>>> - and there's also the "draft" stage for a PR which some contributors
>>>> are
>>>>>> using.
>>>>>> 
>>>>>> With regards to WIP and do not review: I think these are use cases
>>>> where
>>>>>> you want CI feedback, as otherwise you wouldn't have opened the PR.
>>> If
>>>>> you
>>>>>> don't want human feedback and neither machine feedback, why open
the
>>> PR
>>>>> at
>>>>>> all?
>>>>>> 
>>>>>> -Marco
>>>>>> 
>>>>>> 
>>>>>> sandeep krishnamurthy <sandeep.krishna98@gmail.com> schrieb
am Fr.,
>>>> 13.
>>>>>> März 2020, 05:24:
>>>>>> 
>>>>>>> I tried to gather some data for us to discuss this topic in this
>>>>> thread. I
>>>>>>> tried to count number of un-necessary builds by looking at most
>>>> recent
>>>>> (as
>>>>>>> of 12, March 9 PM PST) 50 PRs merged to master and 50 PRs.
>>>>>>> Identifying un-necessary builds is bit subjective. I tried to
be
>>> more
>>>>>>> conservative where I didn't count a build as un-necessary if
I was
>>> in
>>>>>>> doubt. Hence, I was not able to automate, but I made an effort
to
>>> go
>>>>>>> through PRs manually and use below criteria to identify
>>> un-necessary
>>>>>>> commits triggering the builds.
>>>>>>> 
>>>>>>>   1. Explicitly marked as WIP / do not review  PR
>>>>>>>   2. Incremental WIP commit and finally commenting a commit
>>> “trigger
>>>>> CI”
>>>>>>>   3. Multiple commits to address all comments from single review.
>>>>> This is
>>>>>>>   assuming we see a comment, address them, commit, next the
>>>> following
>>>>>>> comment
>>>>>>>   4. Sequence of documentation only changes
>>>>>>> 
>>>>>>> I found there were around 42 avoidable builds from most recent
50
>>>>> merged
>>>>>>> PRs and around 86 builds from recent 50 open PRs.
>>>>>>> 
>>>>>>> I synced up with other contributors (Joe Evans, Chai) from Amazon
>>> who
>>>>> is
>>>>>>> contributing to MXNet CI system. I was told that on an average
it
>>>> costs
>>>>>>> around $84 per build and on an average 6 commits per merged PR
(for
>>>>> year
>>>>>>> 2019). Going by that, it is approximately 1/6 builds are avoidable.
>>>>> [100 /
>>>>>>> 300 + 300 ]
>>>>>>> 
>>>>>>> 
>>>>>>> Usability should be top most priority. But, since either a reviewer
>>>> or
>>>>> pr
>>>>>>> author can trigger the bot, is it really a hurdle for pr author
or
>>>>> reviewer
>>>>>>> to call a bot to trigger CI? Given that PR author and reviewer
is
>>>>> already
>>>>>>> actively commenting various details such as - PR description,
>>> review
>>>>>>> comments and responses, adding labels etc.
>>>>>>> 
>>>>>>> 
>>>>>>> Me too curious to know the behavior for Tao's above use case.
>>>>>>> 
>>>>>>> 
>>>>>>> Best,
>>>>>>> 
>>>>>>> Sandeep
>>>>>>> 
>>>>>>> On Thu, Mar 12, 2020 at 7:18 PM Tao Lv <mutouorz@gmail.com>
wrote:
>>>>>>> 
>>>>>>>> Is it possible for re-triggering a single job to be abused?
For
>>>>> example,
>>>>>>>> the author spends two days re-triggering a flaky job to make
it
>>>>> pass. But
>>>>>>>> other jobs which have passed the validation may be broken
by
>>> other
>>>>>>> commits
>>>>>>>> during the two day without being noticed. And finally the
PR is
>>>>> merged
>>>>>>> with
>>>>>>>> underlying problems.
>>>>>>>> 
>>>>>>>> On Fri, Mar 13, 2020 at 6:19 AM Marco de Abreu <
>>>>> marco.g.abreu@gmail.com>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> In the end it only comes down to money, considering that
the
>>>>> system is
>>>>>>>> auto
>>>>>>>>> scaling, making the execution time constant.
>>>>>>>>> 
>>>>>>>>> If we're trading money for usability, I certainly would
prefer
>>>>>>> usability.
>>>>>>>>> I'd rather recommend to spend time on parallelizing test
>>>> execution
>>>>> or
>>>>>>>>> getting rid of integration tests in the PR stage instead
>>> reducing
>>>>> the
>>>>>>>> costs
>>>>>>>>> by making people not use it. But taking a step back to
>>> requiring
>>>>> people
>>>>>>>> to
>>>>>>>>> manually trigger CI again doesn't feel right.
>>>>>>>>> 
>>>>>>>>> I'm happy to see that bot deployed, but I do not agree
with
>>>>> removing
>>>>>>> the
>>>>>>>>> auto trigger functionality for new commits.
>>>>>>>>> 
>>>>>>>>> -Marco
>>>>>>>>> 
>>>>>>>>> Chaitanya Bapat <chai.bapat@gmail.com> schrieb
am Do., 12.
>>> März
>>>>> 2020,
>>>>>>>>> 22:47:
>>>>>>>>> 
>>>>>>>>>> @Marco Thanks for pointing that out.
>>>>>>>>>> Tomorrow i.e. Friday, March 13, 2020 at 3:00 PM -
3:30 PM in
>>>>>>>> (UTC-08:00)
>>>>>>>>>> Pacific Time (US & Canada).
>>>>>>>>>> 
>>>>>>>>>>> When do we expect this bot to be deployed?
>>>>>>>>>> @Lin If all goes well in the next week I can deploy
it to
>>>> public
>>>>>>> Apache
>>>>>>>>>> (provided I get permissions from Apache Infra)
>>>>>>>>>> 
>>>>>>>>>> @Marco Thanks for your feedback.
>>>>>>>>>>> CI system has to support the community without
requiring
>>>>> people to
>>>>>>>>>> constantly shepherd every single run
>>>>>>>>>> We have data for the number of times CI was triggered
>>>>> unnecessarily
>>>>>>>> which
>>>>>>>>>> includes
>>>>>>>>>> - Entire build triggered instead of specific build
>>>>>>>>>> - CI triggered when PR is still work in progress
or not yet
>>>> ready
>>>>>>> (say
>>>>>>>> -
>>>>>>>>>> intermediate commits)
>>>>>>>>>> At the end its a trade-off
>>>>>>>>>> Money, Resources, Time to build for each and every
commit vs
>>>>> Pain of
>>>>>>>>>> triggering builds
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> Scan trigger plugin would poll SCM. Can we use
plugin at
>>>>> scale?
>>>>>>>>>> 
>>>>>>>>>> 1. I haven't tested it on scale. But I think with
the current
>>>>> scale
>>>>>>> of
>>>>>>>>>> MXNet repo (191 open PRs i.e. checking for changes
to 191
>>>>> branches -
>>>>>>> It
>>>>>>>>>> should be manageable)
>>>>>>>>>> 2. What's the purpose of the plugin? tldr; Branch
discovery
>>> or
>>>>> branch
>>>>>>>>>> indexing.
>>>>>>>>>> Scan trigger plugin comes into the picture only once
per PR
>>> per
>>>>> job
>>>>>>>>> (i.e. 8
>>>>>>>>>> times per PR for 8 jobs). It is basically done when
a new PR
>>> is
>>>>> made
>>>>>>>> and
>>>>>>>>>> the job (say unix-cpu hasn't discovered the new PR
branch
>>> yet).
>>>>>>> That's
>>>>>>>>> it.
>>>>>>>>>> So it shouldn't be a problem for public MXNet repo.
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> Chai
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Thu, 12 Mar 2020 at 14:22, Marco de Abreu <
>>>>>>> marco.g.abreu@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Btw you forgot to set a date and time for the
metting
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Thu, Mar 12, 2020 at 10:18 PM Marco de Abreu
<
>>>>>>>>> marco.g.abreu@gmail.com
>>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> Thanks Chai, I generally like the idea of
the bot. But
>>> I'm
>>>>> not a
>>>>>>>>>>> supporter
>>>>>>>>>>>> of the idea to disable any automatic triggering
>>> (disabling
>>>>> the
>>>>>>>>> webhook
>>>>>>>>>> is
>>>>>>>>>>>> also not an option, considering that this
will disable
>>>> master
>>>>>>>>>> triggers).
>>>>>>>>>>>> The CI system has to support the community
without
>>>> requiring
>>>>>>> people
>>>>>>>>> to
>>>>>>>>>>>> constantly shepherd every single run. Disabling
automatic
>>>>>>>> triggering
>>>>>>>>>>> seems
>>>>>>>>>>>> like a step back to me.
>>>>>>>>>>>> 
>>>>>>>>>>>> Instead, I'd recommend that CI gets triggered
upon every
>>>>> commit
>>>>>>> as
>>>>>>>>>> usual,
>>>>>>>>>>>> but people have the possibility to call a
"command" (i.e.
>>>>> make a
>>>>>>>>>> message
>>>>>>>>>>>> which results in the bot setting a label)
to disable CI
>>>> until
>>>>>>> they
>>>>>>>>>> revoke
>>>>>>>>>>>> it. But the standard should still be that
a new commit
>>>>> triggers a
>>>>>>>> new
>>>>>>>>>> CI
>>>>>>>>>>>> run.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>> https://plugins.jenkins.io/multibranch-scan-webhook-trigger/
>>>>>>>> seems
>>>>>>>>>> like
>>>>>>>>>>>> this would poll SCM. This will incur high
quota
>>>>> restrictions. Are
>>>>>>>> you
>>>>>>>>>>> sure
>>>>>>>>>>>> that you can use that plugin at scale?
>>>>>>>>>>>> 
>>>>>>>>>>>> -Marco
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Thu, Mar 12, 2020 at 10:04 PM Lin Yuan
<
>>>>> apeforest@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>>>>> Chai,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Awesome work. When do we expect this
bot to be
>>> deployed?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Lin
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Thu, Mar 12, 2020 at 2:00 PM Chaitanya
Bapat <
>>>>>>>>> chai.bapat@gmail.com
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hello MXNet community,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I have built an MXNet Bot <
>>>> https://github.com/mxnet-bot>
>>>>> that
>>>>>>>>>> allows
>>>>>>>>>>> PR
>>>>>>>>>>>>>> Authors, Committers and Jenkins Admins
to trigger CI
>>>>> manually.
>>>>>>>>>>>>>> It handles 2 problems
>>>>>>>>>>>>>> 1. Manual CI trigger instead of existing
automated CI
>>>>> trigger
>>>>>>>>>>>>>> 2. Gives permissions to PR Authors
(in addition to
>>>> MXNet
>>>>>>>>> Committers
>>>>>>>>>>> and
>>>>>>>>>>>>>> Jenkins Admins)
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Design Doc :
>>>>>>>>>>>>>> 
>>>>>>> https://cwiki.apache.org/confluence/display/MXNET/MXNet+CI+Bot
>>>>>>>>>>>>>> I urge you all to attend the demonstration
meeting
>>> and
>>>>> lend
>>>>>>> your
>>>>>>>>>> views
>>>>>>>>>>>>> on
>>>>>>>>>>>>>> the same.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thank you,
>>>>>>>>>>>>>> Chai
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> *Meeting Details*:
>>>>>>>>>>>>>> ==============Conference Bridge
>>>> Information==============
>>>>>>>>>>>>>> You have been invited to an online
meeting, powered
>>> by
>>>>> Amazon
>>>>>>>>> Chime.
>>>>>>>>>>>>>> *Chime meeting ID*: *9272158344*
>>>>>>>>>>>>>> Join via Chime clients (manually):
Select 'Meetings >
>>>>> Join a
>>>>>>>>>> Meeting',
>>>>>>>>>>>>> and
>>>>>>>>>>>>>> enter 9272158344
>>>>>>>>>>>>>> Join via Chime clients (auto-call):
If you invite
>>>>> auto-call as
>>>>>>>>>>> attendee,
>>>>>>>>>>>>>> Chime will call you when the meeting
starts, select
>>>>> 'Answer'
>>>>>>>>>>>>>> *Join via browser screen share*:
>>>>> https://chime.aws/9272158344
>>>>>>>>>>>>>> *Join via phone* (US): +1-929-432-4463,,,9272158344#
>>>>>>>>>>>>>> *Join via phone (US toll-free)*:
>>>>> +1-855-552-4463,,,9272158344#
>>>>>>>>>>>>>> International dial-in:
>>>> https://chime.aws/dialinnumbers/
>>>>>>>>>>>>>> In-room video system: Ext: 62000,
Meeting PIN:
>>>>> 9272158344#
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> *Chaitanya Prakash Bapat*
>>>>>>>>>>>>>> *+1 (973) 953-6299*
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> [image: https://www.linkedin.com//in/chaibapat25]
>>>>>>>>>>>>>> <https://github.com/ChaiBapchya>[image:
>>>>>>>>>>>>> https://www.facebook.com/chaibapat
>>>>>>>>>>>>>> ]
>>>>>>>>>>>>>> <https://www.facebook.com/chaibapchya>[image:
>>>>>>>>>>>>>> https://twitter.com/ChaiBapchya]
<
>>>>>>>> https://twitter.com/ChaiBapchya
>>>>>>>>>>>>>> [image:
>>>>>>>>>>>>>> https://www.linkedin.com//in/chaibapat25]
>>>>>>>>>>>>>> <https://www.linkedin.com//in/chaibapchya/>
>>>>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> --
>>>>>>>>>> *Chaitanya Prakash Bapat*
>>>>>>>>>> *+1 (973) 953-6299*
>>>>>>>>>> 
>>>>>>>>>> [image: https://www.linkedin.com//in/chaibapat25]
>>>>>>>>>> <https://github.com/ChaiBapchya>[image:
>>>>>>>>> https://www.facebook.com/chaibapat
>>>>>>>>>> ]
>>>>>>>>>> <https://www.facebook.com/chaibapchya>[image:
>>>>>>>>>> https://twitter.com/ChaiBapchya] <
>>>>> https://twitter.com/ChaiBapchya
>>>>>>>>>> [image:
>>>>>>>>>> https://www.linkedin.com//in/chaibapat25]
>>>>>>>>>> <https://www.linkedin.com//in/chaibapchya/>
>>>>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> Sandeep Krishnamurthy
>>>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>>> --
>>> *Chaitanya Prakash Bapat*
>>> *+1 (973) 953-6299*
>>> 
>>> [image: https://www.linkedin.com//in/chaibapat25]
>>> <https://github.com/ChaiBapchya>[image: https://www.facebook.com/chaibapat
>>> ]
>>> <https://www.facebook.com/chaibapchya>[image:
>>> https://twitter.com/ChaiBapchya] <https://twitter.com/ChaiBapchya>[image:
>>> https://www.linkedin.com//in/chaibapat25]
>>> <https://www.linkedin.com//in/chaibapchya/>
>>> 
>> 

Mime
View raw message