mxnet-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Naveen Swamy <mnnav...@gmail.com>
Subject Re: MXNet: Run PR builds on Apache Jenkins only after the commit is reviewed
Date Wed, 13 Sep 2017 16:56:35 GMT
The current number of build steps for a successful build is around 30 each
requiring to run on a Jenkins executor(running on one instance), there are
plans to add additional tests(more build steps) for additional platforms
such as MAC OSX, IOT devices. I don't think it is practical nor necessary
to run the entire pipeline on every PR change(even if you cancel the old
build) before someone has looked into it.  Please see the attached image
that shows the complete set of build steps. Each step not a stage in the
pipeline needs an instance, running on average to 2.5 hours. We have
allocated 20 instances running for the build.

[image: Inline image 1]


While we optimize our builds to better handle changes, become smarter in
what builds to kick off, As a compromise, I suggest we modify the PR builds
to run the following steps in sequence:
1. Sanity Checks(lint, Coverity, etc.,)
2. CPU Build for Linux
3. Python CPU Unit Tests on Linux

On a PR review and request(label or comment) run the existing pipeline.

Also, we can revisit and refine this again after we have optimized the
current build pipeline.

Let me know if you still have concerns, we can chat offline as well to
understand better.

Thanks, Naveen





On Tue, Sep 12, 2017 at 11:17 PM, Kumar, Gautam <gauta@amazon.com> wrote:

> Agreed with Bhavin.
>
> One point which is being mentioned here is abort the previous build if a
> new change is being published on same PR.
> Which can be achievable by just changing the job configuration “Cancel
> build on update” under Trigger setup.
> https://github.com/jenkinsci/ghprb-plugin/issues/379 is the git hub
> discussion page for same.
>
> -Gautam
>
>
> On 9/12/17, 9:43 PM, "Lin, Haibin" <haibilin@amazon.com> wrote:
>
>     Also +1 on this.
>
>     Haibin
>
>     On 9/12/17, 9:28 PM, "Chris Olivier" <cjolivier01@gmail.com> wrote:
>
>         +1 to this
>
>         On Tue, Sep 12, 2017 at 8:48 PM Bhavin Thaker <
> bhavinthaker@gmail.com>
>         wrote:
>
>         > My vote is to make the CI build and test process lightweight and
> efficient
>         > and not involve a human to give a go-ahead for a PR build.
>         >
>         > There are multiple ways to engineer a stable and efficient CI
> system,
>         > (already discussed on this email thread), including canceling
> previously
>         > running build for a particular PR before invoking a new build,
> using
>         > Incredibuild, adding more machines to CI, etc.
>         >
>         > In short, as we scale to many engineers working on MXNet around
> the globe
>         > in different time-zones, precious human involvement should be
> minimal and
>         > be used judiciously only for critical things like code-review
> and only
>         > after sufficient amount of sanity build-tests have passed.
>         >
>         > Let the machines work harder for humans and not the other way
> around.
>         >
>         > Bhavin Thaker.
>         >
>         > On Tue, Sep 12, 2017 at 12:20 PM Chris Olivier <
> cjolivier01@gmail.com>
>         > wrote:
>         >
>         > > The majority of these iterations is to trigger a build on the
> broken CI
>         > in
>         > > hopes it will finally pass...
>         > >
>         > > On Tue, Sep 12, 2017 at 11:15 AM Madan Jampani <
> madan.jampani@gmail.com>
>         > > wrote:
>         > >
>         > > > +1
>         > > > I second only running sanity test (lint) until manual
> approval.
>         > > >
>         > > > On Tue, Sep 12, 2017 at 11:05 AM, Naveen Swamy <
> mnnaveen@gmail.com>
>         > > wrote:
>         > > >
>         > > > > Just to be clear, the proposal is not to remove the PR
> build. It's
>         > only
>         > > > to
>         > > > > delay the PR build until a reviewer has looked at it and
> marks it
>         > > > Approved
>         > > > > or adds a Label to build. Once it's approved and PR build
> succeeds a
>         > > > > reviewer/committer can see the build result and merge to
> the master.
>         > > > >
>         > > > > I don't mean to pick on the author of this PR(Chris), I am
> just using
>         > > > this
>         > > > > build to support my argument.
>         > > > > https://builds.apache.org/view/Incubator%20Projects/job/
>         > > > > incubator-mxnet/view/change-requests/job/PR-7577/
>         > > > >  This has gone through 74 iterations, I understand not all
> of them
>         > are
>         > > > due
>         > > > > to changes and some of them are due to build instability
> and some
>         > dummy
>         > > > > commits to getting the PR build to pass. However, the PR
> has been
>         > under
>         > > > > review and changes being made for the last several days.
I
> don't
>         > think
>         > > it
>         > > > > warrants a build trigger for every new change. Each build
> on average
>         > > > takes
>         > > > > about 2 hours running on all platforms.
>         > > > >
>         > > > > Probably we can run a lean down version of the current PR
> build where
>         > > we
>         > > > > have sanity-test(lint)->build on linux(cpu)->unit test
on
> linux(cpu)
>         > ?
>         > > > >
>         > > > >
>         > > > > Thanks, Naveen
>         > > > >
>         > > > >
>         > > > >
>         > > > >
>         > > > > On Tue, Sep 12, 2017 at 4:27 AM, Joern Kottmann <
> kottmann@gmail.com>
>         > > > > wrote:
>         > > > >
>         > > > > > Not sure how it works with jenkins, but other CI serves
> can look at
>         > > > > > the commit message and skip the CI run based on certain
> commands in
>         > > > > > it.
>         > > > > >
>         > > > > > Might make sense for small changes such as documentation
> updates,
>         > > half
>         > > > > > done PRs, etc.
>         > > > > >
>         > > > > > Jörn
>         > > > > >
>         > > > > > On Tue, Sep 12, 2017 at 11:17 AM, Larroy, Pedro <
>         > pllarroy@amazon.de>
>         > > > > > wrote:
>         > > > > > > Hi
>         > > > > > >
>         > > > > > > I would like to integrate our CI system for devices
to
> make sure
>         > > PRs
>         > > > > > build on ARM / android etc. Who has admin rights on
the
> repository
>         > so
>         > > > we
>         > > > > > can install the necessary hooks to trigger our builds?
>         > > > > > >
>         > > > > > >
>         > > > > > > Kind regards.
>         > > > > > > --
>         > > > > > >
>         > > > > > > Pedro
>         > > > > > >
>         > > > > > > On 12/09/17 02:50, "Meghna Baijal" <
> meghnabaijal2017@gmail.com>
>         > > > wrote:
>         > > > > > >
>         > > > > > >     Hi All,
>         > > > > > >     We would like to initiate a change in the way
the
> PR builds
>         > are
>         > > > > > being triggered. At the moment, every time a Pull
> Request is
>         > > created, a
>         > > > > > build gets triggered on Jenkins. Additional builds also
> get
>         > triggered
>         > > > due
>         > > > > > to changes to the same PR.
>         > > > > > >     Too many PR builds leads to resource starvation
> and very long
>         > > > > queues
>         > > > > > and long build times. Hence we would like to add some
> checks where
>         > a
>         > > > > human
>         > > > > > reviewer manually marks it to something like “ok to
> build” before a
>         > > PR
>         > > > > > build is triggered.
>         > > > > > >
>         > > > > > >     Do you think this approach would be helpful
and we
> should
>         > move
>         > > > > > forward with it?
>         > > > > > >
>         > > > > > >     Thanks,
>         > > > > > >     Meghna Baijal
>         > > > > > >
>         > > > > > >
>         > > > > > >
>         > > > > > >
>         > > > > > >
>         > > > > > >
>         > > > > > > Amazon Development Center Germany GmbH
>         > > > > > > Berlin - Dresden - Aachen
>         > > > > > > main office: Krausenstr. 38, 10117 Berlin
>         > > > > > > Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian
> Schlaeger
>         > > > > > > Ust-ID: DE289237879
>         > > > > > > Eingetragen am Amtsgericht Charlottenburg HRB 149173
B
>         > > > > >
>         > > > >
>         > > >
>         > >
>         >
>
>
>
>
>

Mime
  • Unnamed multipart/related (inline, None, 0 bytes)
View raw message