mxnet-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Marco de Abreu <marco.g.ab...@gmail.com>
Subject Re: MXNet Bot Demo
Date Fri, 13 Mar 2020 22:06:58 GMT
Hi,

since it's no longer necessary to push a new commit to trigger CI, it will
already reduce the costs. But to me, requiring an action to enable CI after
a PR has been created initially, is a no go. User can opt out of CI, but
the default has to be CI being triggered automatically for every commit
unless specifically disabled by a participant. I'm also fine with
triggering certain additional jobs (think about running a nightly job upon
request for a PR) to require a manual step, but the PR validation pipelines
have to run automatically. Every check that is marked as "Required" in
GitHub has to be automatically kicked off.

-Marco

On Fri, Mar 13, 2020 at 9:50 PM Chaitanya Bapat <chai.bapat@gmail.com>
wrote:

> Firstly,
> Sorry I missed out on attaching the mail thread that was sent on 12th
> February for notifying the community of the upcoming changes to the MXNet
> CI
> For reference :
>
> https://lists.apache.org/thread.html/r09a6ab2803a996fc80e00fe39ed312fa4865e8805e08df847f1addad%40%3Cdev.mxnet.apache.org%3E
>
> Now to the questions,
> > Is it possible for re-triggering a single job to be abused?
> @Tao In the case when a user re-triggers a single job multiple times, that
> will be visible in the PR conversation thread. A committer, even after he
> has approved the PR before, generally takes a look at the final state of
> the PR before merging. Would it be fair to assume the committer could take
> the multiple re-trigger of a single job into account before merging? The
> committer then has the option to invoke `@mxnet-bot run ci [all] ` to
> trigger the entire build pipeline one last to counter the abuse. This is
> aligned with what @Leonard said.
>
> @Sandeep Thanks a lot for collecting and sharing valuable data. I'd concur
> with the opinion that given the existing things committers & PR Authors
> take care of, invoking CI shouldn't be that big of an additional burden.
>
> @Marco With the opt-out, the onus remains on the PR Author. It doesn't help
> reduce the resource usage. Hence, it was suggested to switch to
> opt-in. @Leo's suggestion for proactive commenting on the part of bot makes
> sense and is doable.
>
> Default : opt-out and User initiated opt-in (with addressing Leo's fix for
> the usability issue you correctly pointed out )
> @Marco How does this sound to you?
>
> Again, thank you all for chiming in and voicing your opinion. Appreciate
> it.
> We can take ahead these discussions in today's demo meeting. [Design Doc
> <https://cwiki.apache.org/confluence/display/MXNET/MXNet+CI+Bot>] [Demo
> Video <https://www.youtube.com/watch?v=gfOGwZId8aU>]
>
> Thanks,
> Chai
>
> On Fri, 13 Mar 2020 at 12:34, Marco de Abreu <marco.g.abreu@gmail.com>
> wrote:
>
> > I'd recommend that the bot makes an initial comment when a PR gets opened
> > and informs the users of its commands. It then tells the user the commend
> > to opt out of CI.
> >
> > -Marco
> >
> > Lausen, Leonard <lausen@amazon.com.invalid> schrieb am Fr., 13. März
> 2020,
> > 20:27:
> >
> > > On opt-out: People may be unaware of opt-out would not use it. There is
> > no
> > > incentive to use opt-out, as the PR author doesn't pay any money for CI
> > > run.
> > >
> > > I agree with Marco though that opt-in alone may cause usability issues,
> > as
> > > contributors may not be aware of how to trigger the CI.
> > > One solution is that the bot proactively comments on the PR and reminds
> > the
> > > author to trigger running CI once the author deems the PR ready.
> > >
> > > But even if we choose opt-out, the bot will still add a lot of value,
> as
> > PR
> > > authors can retrigger single jobs that have failed due to flakiness.
> > >
> > > > Is it possible for re-triggering a single job to be abused? For
> > example,
> > > > the author spends two days re-triggering a flaky job to make it pass.
> > >
> > > Yes, this is possible. I suggest the committer who likes to merge a PR
> > > needs to
> > > make a good judgement here if a PR is abusing the feature, and if so,
> > > retrigger
> > > all CI runs.
> > >
> > > Best regards
> > > Leonard
> > >
> > > On Fri, 2020-03-13 at 08:07 +0100, Marco de Abreu wrote:
> > > > Thanks for the data Sandeep. In these cases it sounds like it would
> > have
> > > > rather been better when people explicitly turned off CI in that case.
> > > > What's the argument against an opt-out instead of an opt-in?
> > > >
> > > > My intention is that I consider it quite cumbersome to make it a
> > > *required*
> > > > step to always trigger CI manually, even if just submitting a small
> PR.
> > > I'd
> > > > rather see people explicitly turning off CI if they wouldn't like to
> > use
> > > it
> > > > - and there's also the "draft" stage for a PR which some contributors
> > are
> > > > using.
> > > >
> > > > With regards to WIP and do not review: I think these are use cases
> > where
> > > > you want CI feedback, as otherwise you wouldn't have opened the PR.
> If
> > > you
> > > > don't want human feedback and neither machine feedback, why open the
> PR
> > > at
> > > > all?
> > > >
> > > > -Marco
> > > >
> > > >
> > > > sandeep krishnamurthy <sandeep.krishna98@gmail.com> schrieb am Fr.,
> > 13.
> > > > März 2020, 05:24:
> > > >
> > > > > I tried to gather some data for us to discuss this topic in this
> > > thread. I
> > > > > tried to count number of un-necessary builds by looking at most
> > recent
> > > (as
> > > > > of 12, March 9 PM PST) 50 PRs merged to master and 50 PRs.
> > > > > Identifying un-necessary builds is bit subjective. I tried to be
> more
> > > > > conservative where I didn't count a build as un-necessary if I was
> in
> > > > > doubt. Hence, I was not able to automate, but I made an effort to
> go
> > > > > through PRs manually and use below criteria to identify
> un-necessary
> > > > > commits triggering the builds.
> > > > >
> > > > >    1. Explicitly marked as WIP / do not review  PR
> > > > >    2. Incremental WIP commit and finally commenting a commit
> “trigger
> > > CI”
> > > > >    3. Multiple commits to address all comments from single review.
> > > This is
> > > > >    assuming we see a comment, address them, commit, next the
> > following
> > > > > comment
> > > > >    4. Sequence of documentation only changes
> > > > >
> > > > > I found there were around 42 avoidable builds from most recent 50
> > > merged
> > > > > PRs and around 86 builds from recent 50 open PRs.
> > > > >
> > > > > I synced up with other contributors (Joe Evans, Chai) from Amazon
> who
> > > is
> > > > > contributing to MXNet CI system. I was told that on an average it
> > costs
> > > > > around $84 per build and on an average 6 commits per merged PR (for
> > > year
> > > > > 2019). Going by that, it is approximately 1/6 builds are avoidable.
> > > [100 /
> > > > > 300 + 300 ]
> > > > >
> > > > >
> > > > > Usability should be top most priority. But, since either a reviewer
> > or
> > > pr
> > > > > author can trigger the bot, is it really a hurdle for pr author or
> > > reviewer
> > > > > to call a bot to trigger CI? Given that PR author and reviewer is
> > > already
> > > > > actively commenting various details such as - PR description,
> review
> > > > > comments and responses, adding labels etc.
> > > > >
> > > > >
> > > > > Me too curious to know the behavior for Tao's above use case.
> > > > >
> > > > >
> > > > > Best,
> > > > >
> > > > > Sandeep
> > > > >
> > > > > On Thu, Mar 12, 2020 at 7:18 PM Tao Lv <mutouorz@gmail.com>
wrote:
> > > > >
> > > > > > Is it possible for re-triggering a single job to be abused?
For
> > > example,
> > > > > > the author spends two days re-triggering a flaky job to make
it
> > > pass. But
> > > > > > other jobs which have passed the validation may be broken by
> other
> > > > > commits
> > > > > > during the two day without being noticed. And finally the PR
is
> > > merged
> > > > > with
> > > > > > underlying problems.
> > > > > >
> > > > > > On Fri, Mar 13, 2020 at 6:19 AM Marco de Abreu <
> > > marco.g.abreu@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > In the end it only comes down to money, considering that
the
> > > system is
> > > > > > auto
> > > > > > > scaling, making the execution time constant.
> > > > > > >
> > > > > > > If we're trading money for usability, I certainly would
prefer
> > > > > usability.
> > > > > > > I'd rather recommend to spend time on parallelizing test
> > execution
> > > or
> > > > > > > getting rid of integration tests in the PR stage instead
> reducing
> > > the
> > > > > > costs
> > > > > > > by making people not use it. But taking a step back to
> requiring
> > > people
> > > > > > to
> > > > > > > manually trigger CI again doesn't feel right.
> > > > > > >
> > > > > > > I'm happy to see that bot deployed, but I do not agree
with
> > > removing
> > > > > the
> > > > > > > auto trigger functionality for new commits.
> > > > > > >
> > > > > > > -Marco
> > > > > > >
> > > > > > > Chaitanya Bapat <chai.bapat@gmail.com> schrieb am
Do., 12.
> März
> > > 2020,
> > > > > > > 22:47:
> > > > > > >
> > > > > > > > @Marco Thanks for pointing that out.
> > > > > > > > Tomorrow i.e. Friday, March 13, 2020 at 3:00 PM -
3:30 PM in
> > > > > > (UTC-08:00)
> > > > > > > > Pacific Time (US & Canada).
> > > > > > > >
> > > > > > > > > When do we expect this bot to be deployed?
> > > > > > > > @Lin If all goes well in the next week I can deploy
it to
> > public
> > > > > Apache
> > > > > > > > (provided I get permissions from Apache Infra)
> > > > > > > >
> > > > > > > > @Marco Thanks for your feedback.
> > > > > > > > > CI system has to support the community without
requiring
> > > people to
> > > > > > > > constantly shepherd every single run
> > > > > > > > We have data for the number of times CI was triggered
> > > unnecessarily
> > > > > > which
> > > > > > > > includes
> > > > > > > > - Entire build triggered instead of specific build
> > > > > > > > - CI triggered when PR is still work in progress or
not yet
> > ready
> > > > > (say
> > > > > > -
> > > > > > > > intermediate commits)
> > > > > > > > At the end its a trade-off
> > > > > > > > Money, Resources, Time to build for each and every
commit vs
> > > Pain of
> > > > > > > > triggering builds
> > > > > > > >
> > > > > > > >
> > > > > > > > >  Scan trigger plugin would poll SCM. Can we use
plugin at
> > > scale?
> > > > > > > >
> > > > > > > > 1. I haven't tested it on scale. But I think with
the current
> > > scale
> > > > > of
> > > > > > > > MXNet repo (191 open PRs i.e. checking for changes
to 191
> > > branches -
> > > > > It
> > > > > > > > should be manageable)
> > > > > > > > 2. What's the purpose of the plugin? tldr; Branch
discovery
> or
> > > branch
> > > > > > > > indexing.
> > > > > > > > Scan trigger plugin comes into the picture only once
per PR
> per
> > > job
> > > > > > > (i.e. 8
> > > > > > > > times per PR for 8 jobs). It is basically done when
a new PR
> is
> > > made
> > > > > > and
> > > > > > > > the job (say unix-cpu hasn't discovered the new PR
branch
> yet).
> > > > > That's
> > > > > > > it.
> > > > > > > > So it shouldn't be a problem for public MXNet repo.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Chai
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, 12 Mar 2020 at 14:22, Marco de Abreu <
> > > > > marco.g.abreu@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Btw you forgot to set a date and time for the
metting
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, Mar 12, 2020 at 10:18 PM Marco de Abreu
<
> > > > > > > marco.g.abreu@gmail.com
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Thanks Chai, I generally like the idea of
the bot. But
> I'm
> > > not a
> > > > > > > > > supporter
> > > > > > > > > > of the idea to disable any automatic triggering
> (disabling
> > > the
> > > > > > > webhook
> > > > > > > > is
> > > > > > > > > > also not an option, considering that this
will disable
> > master
> > > > > > > > triggers).
> > > > > > > > > > The CI system has to support the community
without
> > requiring
> > > > > people
> > > > > > > to
> > > > > > > > > > constantly shepherd every single run. Disabling
automatic
> > > > > > triggering
> > > > > > > > > seems
> > > > > > > > > > like a step back to me.
> > > > > > > > > >
> > > > > > > > > > Instead, I'd recommend that CI gets triggered
upon every
> > > commit
> > > > > as
> > > > > > > > usual,
> > > > > > > > > > but people have the possibility to call
a "command" (i.e.
> > > make a
> > > > > > > > message
> > > > > > > > > > which results in the bot setting a label)
to disable CI
> > until
> > > > > they
> > > > > > > > revoke
> > > > > > > > > > it. But the standard should still be that
a new commit
> > > triggers a
> > > > > > new
> > > > > > > > CI
> > > > > > > > > > run.
> > > > > > > > > >
> > > > > > > > > >
> > https://plugins.jenkins.io/multibranch-scan-webhook-trigger/
> > > > > > seems
> > > > > > > > like
> > > > > > > > > > this would poll SCM. This will incur high
quota
> > > restrictions. Are
> > > > > > you
> > > > > > > > > sure
> > > > > > > > > > that you can use that plugin at scale?
> > > > > > > > > >
> > > > > > > > > > -Marco
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Thu, Mar 12, 2020 at 10:04 PM Lin Yuan
<
> > > apeforest@gmail.com>
> > > > > > > wrote:
> > > > > > > > > > > Chai,
> > > > > > > > > > >
> > > > > > > > > > > Awesome work. When do we expect this
bot to be
> deployed?
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > >
> > > > > > > > > > > Lin
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Mar 12, 2020 at 2:00 PM Chaitanya
Bapat <
> > > > > > > chai.bapat@gmail.com
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hello MXNet community,
> > > > > > > > > > > >
> > > > > > > > > > > > I have built an MXNet Bot <
> > https://github.com/mxnet-bot>
> > > that
> > > > > > > > allows
> > > > > > > > > PR
> > > > > > > > > > > > Authors, Committers and Jenkins
Admins to trigger CI
> > > manually.
> > > > > > > > > > > > It handles 2 problems
> > > > > > > > > > > > 1. Manual CI trigger instead of
existing automated CI
> > > trigger
> > > > > > > > > > > > 2. Gives permissions to PR Authors
(in addition to
> > MXNet
> > > > > > > Committers
> > > > > > > > > and
> > > > > > > > > > > > Jenkins Admins)
> > > > > > > > > > > >
> > > > > > > > > > > > Design Doc :
> > > > > > > > > > > >
> > > > > https://cwiki.apache.org/confluence/display/MXNET/MXNet+CI+Bot
> > > > > > > > > > > > I urge you all to attend the demonstration
meeting
> and
> > > lend
> > > > > your
> > > > > > > > views
> > > > > > > > > > > on
> > > > > > > > > > > > the same.
> > > > > > > > > > > >
> > > > > > > > > > > > Thank you,
> > > > > > > > > > > > Chai
> > > > > > > > > > > >
> > > > > > > > > > > > *Meeting Details*:
> > > > > > > > > > > > ==============Conference Bridge
> > Information==============
> > > > > > > > > > > > You have been invited to an online
meeting, powered
> by
> > > Amazon
> > > > > > > Chime.
> > > > > > > > > > > > *Chime meeting ID*: *9272158344*
> > > > > > > > > > > > Join via Chime clients (manually):
Select 'Meetings >
> > > Join a
> > > > > > > > Meeting',
> > > > > > > > > > > and
> > > > > > > > > > > > enter 9272158344
> > > > > > > > > > > > Join via Chime clients (auto-call):
If you invite
> > > auto-call as
> > > > > > > > > attendee,
> > > > > > > > > > > > Chime will call you when the meeting
starts, select
> > > 'Answer'
> > > > > > > > > > > > *Join via browser screen share*:
> > > https://chime.aws/9272158344
> > > > > > > > > > > > *Join via phone* (US): +1-929-432-4463,,,9272158344#
> > > > > > > > > > > > *Join via phone (US toll-free)*:
> > > +1-855-552-4463,,,9272158344#
> > > > > > > > > > > > International dial-in:
> > https://chime.aws/dialinnumbers/
> > > > > > > > > > > > In-room video system: Ext: 62000,
Meeting PIN:
> > > 9272158344#
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > > *Chaitanya Prakash Bapat*
> > > > > > > > > > > > *+1 (973) 953-6299*
> > > > > > > > > > > >
> > > > > > > > > > > > [image: https://www.linkedin.com//in/chaibapat25]
> > > > > > > > > > > > <https://github.com/ChaiBapchya>[image:
> > > > > > > > > > > https://www.facebook.com/chaibapat
> > > > > > > > > > > > ]
> > > > > > > > > > > > <https://www.facebook.com/chaibapchya>[image:
> > > > > > > > > > > > https://twitter.com/ChaiBapchya]
<
> > > > > > https://twitter.com/ChaiBapchya
> > > > > > > > > > > > [image:
> > > > > > > > > > > > https://www.linkedin.com//in/chaibapat25]
> > > > > > > > > > > > <https://www.linkedin.com//in/chaibapchya/>
> > > > > > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > *Chaitanya Prakash Bapat*
> > > > > > > > *+1 (973) 953-6299*
> > > > > > > >
> > > > > > > > [image: https://www.linkedin.com//in/chaibapat25]
> > > > > > > > <https://github.com/ChaiBapchya>[image:
> > > > > > > https://www.facebook.com/chaibapat
> > > > > > > > ]
> > > > > > > > <https://www.facebook.com/chaibapchya>[image:
> > > > > > > > https://twitter.com/ChaiBapchya] <
> > > https://twitter.com/ChaiBapchya
> > > > > > > > [image:
> > > > > > > > https://www.linkedin.com//in/chaibapat25]
> > > > > > > > <https://www.linkedin.com//in/chaibapchya/>
> > > > > > > >
> > > > >
> > > > > --
> > > > > Sandeep Krishnamurthy
> > > > >
> > >
> >
>
>
> --
> *Chaitanya Prakash Bapat*
> *+1 (973) 953-6299*
>
> [image: https://www.linkedin.com//in/chaibapat25]
> <https://github.com/ChaiBapchya>[image: https://www.facebook.com/chaibapat
> ]
> <https://www.facebook.com/chaibapchya>[image:
> https://twitter.com/ChaiBapchya] <https://twitter.com/ChaiBapchya>[image:
> https://www.linkedin.com//in/chaibapat25]
> <https://www.linkedin.com//in/chaibapchya/>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message