mxnet-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Przemysław Trędak <ptre...@apache.org>
Subject Re: MXNet Bot Demo
Date Sat, 14 Mar 2020 03:28:48 GMT
I personally like the idea of opt-in more than opt-out:
 - ultimately PR author wants the PR to be merged so they (or committer reviewing the PR)
will trigger the CI
 - if it is easy to trigger the PR via the bot command then the amount of work per PR should
be less than with opt-out (since most of the commits should then be marked as [skip ci] or
something similar

+1 to the bot making a comment on each new PR with its commands (and also explaining, or at
least giving links to the general PR process so new PR authors are not lost). Maybe we could
make the bot recognize if the PR author is new or existing contributor and offer advice based
on that?

Thanks
Przemek

On 2020/03/13 22:06:58, Marco de Abreu <marco.g.abreu@gmail.com> wrote: 
> Hi,
> 
> since it's no longer necessary to push a new commit to trigger CI, it will
> already reduce the costs. But to me, requiring an action to enable CI after
> a PR has been created initially, is a no go. User can opt out of CI, but
> the default has to be CI being triggered automatically for every commit
> unless specifically disabled by a participant. I'm also fine with
> triggering certain additional jobs (think about running a nightly job upon
> request for a PR) to require a manual step, but the PR validation pipelines
> have to run automatically. Every check that is marked as "Required" in
> GitHub has to be automatically kicked off.
> 
> -Marco
> 
> On Fri, Mar 13, 2020 at 9:50 PM Chaitanya Bapat <chai.bapat@gmail.com>
> wrote:
> 
> > Firstly,
> > Sorry I missed out on attaching the mail thread that was sent on 12th
> > February for notifying the community of the upcoming changes to the MXNet
> > CI
> > For reference :
> >
> > https://lists.apache.org/thread.html/r09a6ab2803a996fc80e00fe39ed312fa4865e8805e08df847f1addad%40%3Cdev.mxnet.apache.org%3E
> >
> > Now to the questions,
> > > Is it possible for re-triggering a single job to be abused?
> > @Tao In the case when a user re-triggers a single job multiple times, that
> > will be visible in the PR conversation thread. A committer, even after he
> > has approved the PR before, generally takes a look at the final state of
> > the PR before merging. Would it be fair to assume the committer could take
> > the multiple re-trigger of a single job into account before merging? The
> > committer then has the option to invoke `@mxnet-bot run ci [all] ` to
> > trigger the entire build pipeline one last to counter the abuse. This is
> > aligned with what @Leonard said.
> >
> > @Sandeep Thanks a lot for collecting and sharing valuable data. I'd concur
> > with the opinion that given the existing things committers & PR Authors
> > take care of, invoking CI shouldn't be that big of an additional burden.
> >
> > @Marco With the opt-out, the onus remains on the PR Author. It doesn't help
> > reduce the resource usage. Hence, it was suggested to switch to
> > opt-in. @Leo's suggestion for proactive commenting on the part of bot makes
> > sense and is doable.
> >
> > Default : opt-out and User initiated opt-in (with addressing Leo's fix for
> > the usability issue you correctly pointed out )
> > @Marco How does this sound to you?
> >
> > Again, thank you all for chiming in and voicing your opinion. Appreciate
> > it.
> > We can take ahead these discussions in today's demo meeting. [Design Doc
> > <https://cwiki.apache.org/confluence/display/MXNET/MXNet+CI+Bot>] [Demo
> > Video <https://www.youtube.com/watch?v=gfOGwZId8aU>]
> >
> > Thanks,
> > Chai
> >
> > On Fri, 13 Mar 2020 at 12:34, Marco de Abreu <marco.g.abreu@gmail.com>
> > wrote:
> >
> > > I'd recommend that the bot makes an initial comment when a PR gets opened
> > > and informs the users of its commands. It then tells the user the commend
> > > to opt out of CI.
> > >
> > > -Marco
> > >
> > > Lausen, Leonard <lausen@amazon.com.invalid> schrieb am Fr., 13. März
> > 2020,
> > > 20:27:
> > >
> > > > On opt-out: People may be unaware of opt-out would not use it. There is
> > > no
> > > > incentive to use opt-out, as the PR author doesn't pay any money for CI
> > > > run.
> > > >
> > > > I agree with Marco though that opt-in alone may cause usability issues,
> > > as
> > > > contributors may not be aware of how to trigger the CI.
> > > > One solution is that the bot proactively comments on the PR and reminds
> > > the
> > > > author to trigger running CI once the author deems the PR ready.
> > > >
> > > > But even if we choose opt-out, the bot will still add a lot of value,
> > as
> > > PR
> > > > authors can retrigger single jobs that have failed due to flakiness.
> > > >
> > > > > Is it possible for re-triggering a single job to be abused? For
> > > example,
> > > > > the author spends two days re-triggering a flaky job to make it pass.
> > > >
> > > > Yes, this is possible. I suggest the committer who likes to merge a PR
> > > > needs to
> > > > make a good judgement here if a PR is abusing the feature, and if so,
> > > > retrigger
> > > > all CI runs.
> > > >
> > > > Best regards
> > > > Leonard
> > > >
> > > > On Fri, 2020-03-13 at 08:07 +0100, Marco de Abreu wrote:
> > > > > Thanks for the data Sandeep. In these cases it sounds like it would
> > > have
> > > > > rather been better when people explicitly turned off CI in that case.
> > > > > What's the argument against an opt-out instead of an opt-in?
> > > > >
> > > > > My intention is that I consider it quite cumbersome to make it a
> > > > *required*
> > > > > step to always trigger CI manually, even if just submitting a small
> > PR.
> > > > I'd
> > > > > rather see people explicitly turning off CI if they wouldn't like
to
> > > use
> > > > it
> > > > > - and there's also the "draft" stage for a PR which some contributors
> > > are
> > > > > using.
> > > > >
> > > > > With regards to WIP and do not review: I think these are use cases
> > > where
> > > > > you want CI feedback, as otherwise you wouldn't have opened the PR.
> > If
> > > > you
> > > > > don't want human feedback and neither machine feedback, why open
the
> > PR
> > > > at
> > > > > all?
> > > > >
> > > > > -Marco
> > > > >
> > > > >
> > > > > sandeep krishnamurthy <sandeep.krishna98@gmail.com> schrieb
am Fr.,
> > > 13.
> > > > > März 2020, 05:24:
> > > > >
> > > > > > I tried to gather some data for us to discuss this topic in
this
> > > > thread. I
> > > > > > tried to count number of un-necessary builds by looking at most
> > > recent
> > > > (as
> > > > > > of 12, March 9 PM PST) 50 PRs merged to master and 50 PRs.
> > > > > > Identifying un-necessary builds is bit subjective. I tried to
be
> > more
> > > > > > conservative where I didn't count a build as un-necessary if
I was
> > in
> > > > > > doubt. Hence, I was not able to automate, but I made an effort
to
> > go
> > > > > > through PRs manually and use below criteria to identify
> > un-necessary
> > > > > > commits triggering the builds.
> > > > > >
> > > > > >    1. Explicitly marked as WIP / do not review  PR
> > > > > >    2. Incremental WIP commit and finally commenting a commit
> > “trigger
> > > > CI”
> > > > > >    3. Multiple commits to address all comments from single review.
> > > > This is
> > > > > >    assuming we see a comment, address them, commit, next the
> > > following
> > > > > > comment
> > > > > >    4. Sequence of documentation only changes
> > > > > >
> > > > > > I found there were around 42 avoidable builds from most recent
50
> > > > merged
> > > > > > PRs and around 86 builds from recent 50 open PRs.
> > > > > >
> > > > > > I synced up with other contributors (Joe Evans, Chai) from Amazon
> > who
> > > > is
> > > > > > contributing to MXNet CI system. I was told that on an average
it
> > > costs
> > > > > > around $84 per build and on an average 6 commits per merged
PR (for
> > > > year
> > > > > > 2019). Going by that, it is approximately 1/6 builds are avoidable.
> > > > [100 /
> > > > > > 300 + 300 ]
> > > > > >
> > > > > >
> > > > > > Usability should be top most priority. But, since either a reviewer
> > > or
> > > > pr
> > > > > > author can trigger the bot, is it really a hurdle for pr author
or
> > > > reviewer
> > > > > > to call a bot to trigger CI? Given that PR author and reviewer
is
> > > > already
> > > > > > actively commenting various details such as - PR description,
> > review
> > > > > > comments and responses, adding labels etc.
> > > > > >
> > > > > >
> > > > > > Me too curious to know the behavior for Tao's above use case.
> > > > > >
> > > > > >
> > > > > > Best,
> > > > > >
> > > > > > Sandeep
> > > > > >
> > > > > > On Thu, Mar 12, 2020 at 7:18 PM Tao Lv <mutouorz@gmail.com>
wrote:
> > > > > >
> > > > > > > Is it possible for re-triggering a single job to be abused?
For
> > > > example,
> > > > > > > the author spends two days re-triggering a flaky job to
make it
> > > > pass. But
> > > > > > > other jobs which have passed the validation may be broken
by
> > other
> > > > > > commits
> > > > > > > during the two day without being noticed. And finally the
PR is
> > > > merged
> > > > > > with
> > > > > > > underlying problems.
> > > > > > >
> > > > > > > On Fri, Mar 13, 2020 at 6:19 AM Marco de Abreu <
> > > > marco.g.abreu@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > In the end it only comes down to money, considering
that the
> > > > system is
> > > > > > > auto
> > > > > > > > scaling, making the execution time constant.
> > > > > > > >
> > > > > > > > If we're trading money for usability, I certainly
would prefer
> > > > > > usability.
> > > > > > > > I'd rather recommend to spend time on parallelizing
test
> > > execution
> > > > or
> > > > > > > > getting rid of integration tests in the PR stage instead
> > reducing
> > > > the
> > > > > > > costs
> > > > > > > > by making people not use it. But taking a step back
to
> > requiring
> > > > people
> > > > > > > to
> > > > > > > > manually trigger CI again doesn't feel right.
> > > > > > > >
> > > > > > > > I'm happy to see that bot deployed, but I do not agree
with
> > > > removing
> > > > > > the
> > > > > > > > auto trigger functionality for new commits.
> > > > > > > >
> > > > > > > > -Marco
> > > > > > > >
> > > > > > > > Chaitanya Bapat <chai.bapat@gmail.com> schrieb
am Do., 12.
> > März
> > > > 2020,
> > > > > > > > 22:47:
> > > > > > > >
> > > > > > > > > @Marco Thanks for pointing that out.
> > > > > > > > > Tomorrow i.e. Friday, March 13, 2020 at 3:00
PM - 3:30 PM in
> > > > > > > (UTC-08:00)
> > > > > > > > > Pacific Time (US & Canada).
> > > > > > > > >
> > > > > > > > > > When do we expect this bot to be deployed?
> > > > > > > > > @Lin If all goes well in the next week I can
deploy it to
> > > public
> > > > > > Apache
> > > > > > > > > (provided I get permissions from Apache Infra)
> > > > > > > > >
> > > > > > > > > @Marco Thanks for your feedback.
> > > > > > > > > > CI system has to support the community without
requiring
> > > > people to
> > > > > > > > > constantly shepherd every single run
> > > > > > > > > We have data for the number of times CI was triggered
> > > > unnecessarily
> > > > > > > which
> > > > > > > > > includes
> > > > > > > > > - Entire build triggered instead of specific
build
> > > > > > > > > - CI triggered when PR is still work in progress
or not yet
> > > ready
> > > > > > (say
> > > > > > > -
> > > > > > > > > intermediate commits)
> > > > > > > > > At the end its a trade-off
> > > > > > > > > Money, Resources, Time to build for each and
every commit vs
> > > > Pain of
> > > > > > > > > triggering builds
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >  Scan trigger plugin would poll SCM. Can
we use plugin at
> > > > scale?
> > > > > > > > >
> > > > > > > > > 1. I haven't tested it on scale. But I think
with the current
> > > > scale
> > > > > > of
> > > > > > > > > MXNet repo (191 open PRs i.e. checking for changes
to 191
> > > > branches -
> > > > > > It
> > > > > > > > > should be manageable)
> > > > > > > > > 2. What's the purpose of the plugin? tldr; Branch
discovery
> > or
> > > > branch
> > > > > > > > > indexing.
> > > > > > > > > Scan trigger plugin comes into the picture only
once per PR
> > per
> > > > job
> > > > > > > > (i.e. 8
> > > > > > > > > times per PR for 8 jobs). It is basically done
when a new PR
> > is
> > > > made
> > > > > > > and
> > > > > > > > > the job (say unix-cpu hasn't discovered the new
PR branch
> > yet).
> > > > > > That's
> > > > > > > > it.
> > > > > > > > > So it shouldn't be a problem for public MXNet
repo.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Chai
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, 12 Mar 2020 at 14:22, Marco de Abreu
<
> > > > > > marco.g.abreu@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Btw you forgot to set a date and time for
the metting
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Thu, Mar 12, 2020 at 10:18 PM Marco de
Abreu <
> > > > > > > > marco.g.abreu@gmail.com
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Thanks Chai, I generally like the idea
of the bot. But
> > I'm
> > > > not a
> > > > > > > > > > supporter
> > > > > > > > > > > of the idea to disable any automatic
triggering
> > (disabling
> > > > the
> > > > > > > > webhook
> > > > > > > > > is
> > > > > > > > > > > also not an option, considering that
this will disable
> > > master
> > > > > > > > > triggers).
> > > > > > > > > > > The CI system has to support the community
without
> > > requiring
> > > > > > people
> > > > > > > > to
> > > > > > > > > > > constantly shepherd every single run.
Disabling automatic
> > > > > > > triggering
> > > > > > > > > > seems
> > > > > > > > > > > like a step back to me.
> > > > > > > > > > >
> > > > > > > > > > > Instead, I'd recommend that CI gets
triggered upon every
> > > > commit
> > > > > > as
> > > > > > > > > usual,
> > > > > > > > > > > but people have the possibility to
call a "command" (i.e.
> > > > make a
> > > > > > > > > message
> > > > > > > > > > > which results in the bot setting a
label) to disable CI
> > > until
> > > > > > they
> > > > > > > > > revoke
> > > > > > > > > > > it. But the standard should still be
that a new commit
> > > > triggers a
> > > > > > > new
> > > > > > > > > CI
> > > > > > > > > > > run.
> > > > > > > > > > >
> > > > > > > > > > >
> > > https://plugins.jenkins.io/multibranch-scan-webhook-trigger/
> > > > > > > seems
> > > > > > > > > like
> > > > > > > > > > > this would poll SCM. This will incur
high quota
> > > > restrictions. Are
> > > > > > > you
> > > > > > > > > > sure
> > > > > > > > > > > that you can use that plugin at scale?
> > > > > > > > > > >
> > > > > > > > > > > -Marco
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Mar 12, 2020 at 10:04 PM Lin
Yuan <
> > > > apeforest@gmail.com>
> > > > > > > > wrote:
> > > > > > > > > > > > Chai,
> > > > > > > > > > > >
> > > > > > > > > > > > Awesome work. When do we expect
this bot to be
> > deployed?
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > >
> > > > > > > > > > > > Lin
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, Mar 12, 2020 at 2:00 PM
Chaitanya Bapat <
> > > > > > > > chai.bapat@gmail.com
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hello MXNet community,
> > > > > > > > > > > > >
> > > > > > > > > > > > > I have built an MXNet Bot
<
> > > https://github.com/mxnet-bot>
> > > > that
> > > > > > > > > allows
> > > > > > > > > > PR
> > > > > > > > > > > > > Authors, Committers and Jenkins
Admins to trigger CI
> > > > manually.
> > > > > > > > > > > > > It handles 2 problems
> > > > > > > > > > > > > 1. Manual CI trigger instead
of existing automated CI
> > > > trigger
> > > > > > > > > > > > > 2. Gives permissions to PR
Authors (in addition to
> > > MXNet
> > > > > > > > Committers
> > > > > > > > > > and
> > > > > > > > > > > > > Jenkins Admins)
> > > > > > > > > > > > >
> > > > > > > > > > > > > Design Doc :
> > > > > > > > > > > > >
> > > > > > https://cwiki.apache.org/confluence/display/MXNET/MXNet+CI+Bot
> > > > > > > > > > > > > I urge you all to attend
the demonstration meeting
> > and
> > > > lend
> > > > > > your
> > > > > > > > > views
> > > > > > > > > > > > on
> > > > > > > > > > > > > the same.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thank you,
> > > > > > > > > > > > > Chai
> > > > > > > > > > > > >
> > > > > > > > > > > > > *Meeting Details*:
> > > > > > > > > > > > > ==============Conference
Bridge
> > > Information==============
> > > > > > > > > > > > > You have been invited to
an online meeting, powered
> > by
> > > > Amazon
> > > > > > > > Chime.
> > > > > > > > > > > > > *Chime meeting ID*: *9272158344*
> > > > > > > > > > > > > Join via Chime clients (manually):
Select 'Meetings >
> > > > Join a
> > > > > > > > > Meeting',
> > > > > > > > > > > > and
> > > > > > > > > > > > > enter 9272158344
> > > > > > > > > > > > > Join via Chime clients (auto-call):
If you invite
> > > > auto-call as
> > > > > > > > > > attendee,
> > > > > > > > > > > > > Chime will call you when
the meeting starts, select
> > > > 'Answer'
> > > > > > > > > > > > > *Join via browser screen
share*:
> > > > https://chime.aws/9272158344
> > > > > > > > > > > > > *Join via phone* (US): +1-929-432-4463,,,9272158344#
> > > > > > > > > > > > > *Join via phone (US toll-free)*:
> > > > +1-855-552-4463,,,9272158344#
> > > > > > > > > > > > > International dial-in:
> > > https://chime.aws/dialinnumbers/
> > > > > > > > > > > > > In-room video system: Ext:
62000, Meeting PIN:
> > > > 9272158344#
> > > > > > > > > > > > >
> > > > > > > > > > > > > --
> > > > > > > > > > > > > *Chaitanya Prakash Bapat*
> > > > > > > > > > > > > *+1 (973) 953-6299*
> > > > > > > > > > > > >
> > > > > > > > > > > > > [image: https://www.linkedin.com//in/chaibapat25]
> > > > > > > > > > > > > <https://github.com/ChaiBapchya>[image:
> > > > > > > > > > > > https://www.facebook.com/chaibapat
> > > > > > > > > > > > > ]
> > > > > > > > > > > > > <https://www.facebook.com/chaibapchya>[image:
> > > > > > > > > > > > > https://twitter.com/ChaiBapchya]
<
> > > > > > > https://twitter.com/ChaiBapchya
> > > > > > > > > > > > > [image:
> > > > > > > > > > > > > https://www.linkedin.com//in/chaibapat25]
> > > > > > > > > > > > > <https://www.linkedin.com//in/chaibapchya/>
> > > > > > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > *Chaitanya Prakash Bapat*
> > > > > > > > > *+1 (973) 953-6299*
> > > > > > > > >
> > > > > > > > > [image: https://www.linkedin.com//in/chaibapat25]
> > > > > > > > > <https://github.com/ChaiBapchya>[image:
> > > > > > > > https://www.facebook.com/chaibapat
> > > > > > > > > ]
> > > > > > > > > <https://www.facebook.com/chaibapchya>[image:
> > > > > > > > > https://twitter.com/ChaiBapchya] <
> > > > https://twitter.com/ChaiBapchya
> > > > > > > > > [image:
> > > > > > > > > https://www.linkedin.com//in/chaibapat25]
> > > > > > > > > <https://www.linkedin.com//in/chaibapchya/>
> > > > > > > > >
> > > > > >
> > > > > > --
> > > > > > Sandeep Krishnamurthy
> > > > > >
> > > >
> > >
> >
> >
> > --
> > *Chaitanya Prakash Bapat*
> > *+1 (973) 953-6299*
> >
> > [image: https://www.linkedin.com//in/chaibapat25]
> > <https://github.com/ChaiBapchya>[image: https://www.facebook.com/chaibapat
> > ]
> > <https://www.facebook.com/chaibapchya>[image:
> > https://twitter.com/ChaiBapchya] <https://twitter.com/ChaiBapchya>[image:
> > https://www.linkedin.com//in/chaibapat25]
> > <https://www.linkedin.com//in/chaibapchya/>
> >
> 

Mime
View raw message