mxnet-dev mailing list archives

From Chris Olivier <cjolivie...@gmail.com>
Subject Re: [Proposal] Stabilizing Apache MXNet CI build system
Date Tue, 07 Nov 2017 02:59:02 GMT
After a decision is reached, I am willing to add tasks to the Apache MXNet JIRA.

On Mon, Nov 6, 2017 at 6:15 AM, Pedro Larroy <pedro.larroy.lists@gmail.com>
wrote:

> Thanks for setting up the document guys, looks like a solid basis to
> start to work on!
>
> Marco, Kellen and I have already added some comments.
>
> Pedro
>
>
> On Sun, Nov 5, 2017 at 3:43 AM, Meghna Baijal
> <meghnabaijal2017@gmail.com> wrote:
> > Kellen, Thank you for your comments in the doc.
> > Sure Steffen, I will continue to merge everyone’s comments into the doc
> > and work with Pedro to finalize it.
> > And then we can vote on the options.
> >
> > Thanks,
> > Meghna Baijal
> >
> >
> > On Sat, Nov 4, 2017 at 6:34 AM, Steffen Rochel <steffenrochel@gmail.com>
> > wrote:
> >
> >> Sandeep and Meghna have been working in the background collecting input
> >> and preparing a doc. I suggest we drive the discussion forward and would
> >> like to ask everybody to contribute to
> >> https://docs.google.com/document/d/17PEasQ2VWrXi2Cf7IGZSWGZMawxDkdlavUDASzUmLjk/edit?usp=sharing
> >>
> >> Let's converge on requirements and architecture, so we can move forward
> >> with implementation.
> >>
> >> I would like to suggest that Pedro and Meghna lead the discussion and
> >> help to resolve suggestions.
> >>
> >> I assume we need a vote once we have converged on a good draft to call it
> >> a plan and move forward with implementation. As we are all unhappy with
> >> the current CI situation, I would also suggest a phased approach, so we
> >> can get back to reliable and efficient basic CI quickly and add advanced
> >> capabilities over time.
> >>
> >> Steffen
> >>
> >> On Wed, Nov 1, 2017 at 1:14 PM kellen sunderland <
> >> kellen.sunderland@gmail.com> wrote:
> >>
> >> > Hey Henri, I think that's what a few of us are advocating: running a
> >> > set of quick tests as part of the PR process, and then a more detailed
> >> > regression test suite periodically (say every 4 hours). This fits
> >> > nicely into a tagging or two-branch development system. Commits will be
> >> > tagged (or merged into a stable branch) as soon as they pass the
> >> > detailed regression testing.
> >> >
> >> > On Wed, Nov 1, 2017 at 9:07 PM, Hen <bayard@apache.org> wrote:
> >> >
> >> > > Random question - can the CI be split such that the Apache CI is
> >> > > doing a basic set of checks on that hardware, and is hooked to a PR,
> >> > > while there is a larger "Is trunk good for release?" test that is
> >> > > running periodically rather than on every PR?
> >> > >
> >> > > ie: do we need each PR to be run on varied hardware, or can we have
> >> > > this two tier approach?
> >> > >
> >> > > Hen
> >> > >
> >> > > On Fri, Oct 20, 2017 at 1:01 PM, sandeep krishnamurthy <
> >> > > sandeep.krishna98@gmail.com> wrote:
> >> > >
> >> > > > Hello all,
> >> > > >
> >> > > > I am hereby opening up a discussion thread on how we can stabilize
> >> > > > the Apache MXNet CI build system.
> >> > > >
> >> > > > Problems:
> >> > > >
> >> > > > ========
> >> > > >
> >> > > > Recently, we have seen the following issues with the Apache MXNet
> >> > > > CI build system:
> >> > > >
> >> > > >    1. The Apache Jenkins master is overloaded, and we see issues such
> >> > > >    as being unable to trigger builds and difficulty loading and
> >> > > >    viewing the Blue Ocean and other Jenkins build status pages.
> >> > > >    2. We are generating too many requests/interactions for the Apache
> >> > > >    Infra team:
> >> > > >       1. Addition/deletion of slaves: caused by scaling activity,
> >> > > >       recycling, troubleshooting, or any action leading to a change
> >> > > >       of slave machines.
> >> > > >       2. Plugins / other Jenkins master configuration.
> >> > > >       3. Experimentation on CI pipelines.
> >> > > >    3. Issues are harder to debug and resolve: since access to the
> >> > > >    master and the slaves is not with the same community, Infra and
> >> > > >    the community must dive deep together on all action items.
> >> > > >
> >> > > > Possible Solutions:
> >> > > >
> >> > > > ==============
> >> > > >
> >> > > >    1. Can we set up a separate Jenkins CI build system for Apache
> >> > > >    MXNet outside Apache Infra?
> >> > > >    2. Can we have a separate Jenkins master in Apache Infra for
> >> > > >    MXNet?
> >> > > >    3. Review the design of the current setup, refine it, and fill
> >> > > >    the gaps.
> >> > > >
> >> > > > @ Mentors/Infra team/Community:
> >> > > >
> >> > > > ==========================
> >> > > >
> >> > > > Please provide your suggestions on how we can proceed further and
> >> > > > work on stabilizing the CI build systems for MXNet.
> >> > > >
> >> > > > Also, if the community decides on a separate Jenkins CI build
> >> > > > system, what important points should be taken care of apart from
> >> > > > those below?
> >> > > >
> >> > > >    1. Community being able to access the build page for build
> >> > > >    statuses.
> >> > > >    2. Committers being able to log in with Apache credentials.
> >> > > >    3. Hook setup from the apache/incubator-mxnet repo to the Jenkins
> >> > > >    master.
> >> > > >
> >> > > >
> >> > > > Irrespective of the solution we come up with, I think we should
> >> > > > initiate a technical design discussion on how to set up the CI
> >> > > > build system, probably as a one- or two-page document describing
> >> > > > the architecture, to be reviewed with Infra and community members.
> >> > > >
> >> > > > ***There were a few proposals and discussions on the Slack channel;
> >> > > > to reach wider community members, I am moving that discussion
> >> > > > formally to this list.
> >> > > >
> >> > > >
> >> > > > My Proposal: Option 1 - Set up separate Jenkins CI build system.
> >> > > >
> >> > > > Thanks,
> >> > > >
> >> > > > Sandeep
> >> > > >
> >> > > >
> >> > > >
> >> > > > --
> >> > > > Sandeep Krishnamurthy
> >> > > >
> >> > >
> >> >
> >>
>
