mxnet-dev mailing list archives

From Meghna Baijal <meghnabaijal2...@gmail.com>
Subject Re: [Proposal] Stabilizing Apache MXNet CI build system
Date Sun, 05 Nov 2017 02:43:59 GMT
Kellen, thank you for your comments in the doc.
Sure, Steffen, I will continue to merge everyone’s comments into the doc and
work with Pedro to finalize it.
Then we can vote on the options.

Thanks,
Meghna Baijal


On Sat, Nov 4, 2017 at 6:34 AM, Steffen Rochel <steffenrochel@gmail.com>
wrote:

> Sandeep and Meghna have been working in the background, collecting input
> and preparing a doc. I suggest we drive the discussion forward, and I would
> like to ask everybody to contribute to
> https://docs.google.com/document/d/17PEasQ2VWrXi2Cf7IGZSWGZMawxDkdlavUDASzUmLjk/edit?usp=sharing
>
> Let's converge on requirements and architecture, so we can move forward with
> implementation.
>
> I would like to suggest that Pedro and Meghna lead the discussion and
> help resolve suggestions.
>
> I assume we need a vote once we have converged on a good draft, to call it
> a plan and move forward with implementation. As we are all unhappy with the
> current CI situation, I would also suggest a phased approach, so we can get
> back to reliable and efficient basic CI quickly and add advanced
> capabilities over time.
>
> Steffen
>
> On Wed, Nov 1, 2017 at 1:14 PM kellen sunderland <
> kellen.sunderland@gmail.com> wrote:
>
> > Hey Henri, I think that's what a few of us are advocating: running a set
> > of quick tests as part of the PR process, and then a more detailed
> > regression test suite periodically (say, every 4 hours). This fits nicely
> > into a tagging or two-branch development system. Commits will be tagged
> > (or merged into a stable branch) as soon as they pass the detailed
> > regression testing.
> >
> > On Wed, Nov 1, 2017 at 9:07 PM, Hen <bayard@apache.org> wrote:
> >
> > > Random question - can the CI be split such that the Apache CI is doing a
> > > basic set of checks on that hardware, and is hooked to a PR, while there
> > > is a larger "Is trunk good for release?" test that is running
> > > periodically rather than on every PR?
> > >
> > > i.e., do we need each PR to be run on varied hardware, or can we have
> > > this two-tier approach?
> > >
> > > Hen
> > >
> > > On Fri, Oct 20, 2017 at 1:01 PM, sandeep krishnamurthy <
> > > sandeep.krishna98@gmail.com> wrote:
> > >
> > > > Hello all,
> > > >
> > > > I am hereby opening up a discussion thread on how we can stabilize the
> > > > Apache MXNet CI build system.
> > > >
> > > > Problems:
> > > >
> > > > ========
> > > >
> > > > Recently, we have seen the following issues with the Apache MXNet CI
> > > > build system:
> > > >
> > > >    1. Apache Jenkins master is overloaded, and we see issues such as
> > > >    being unable to trigger builds and difficulty loading and viewing
> > > >    the Blue Ocean and other Jenkins build status pages.
> > > >    2. We are generating too many requests/interactions for the Apache
> > > >    Infra team:
> > > >       1. Addition/deletion of slaves: caused by scaling activity,
> > > >       recycling, troubleshooting, or any action leading to a change of
> > > >       slave machines.
> > > >       2. Plugins / other Jenkins master configurations.
> > > >       3. Experimentation on CI pipelines.
> > > >    3. Issues are harder to debug and resolve: since access to master
> > > >    and slaves is not with the same community, Infra and the community
> > > >    must dive deep together on all action items.
> > > >
> > > > Possible Solutions:
> > > >
> > > > ==============
> > > >
> > > >    1. Can we set up a separate Jenkins CI build system for Apache MXNet
> > > >    outside Apache Infra?
> > > >    2. Can we have a separate Jenkins Master in Apache Infra for MXNet?
> > > >    3. Review the design of the current setup, refine it, and fill the
> > > >    gaps.
> > > >
> > > > @ Mentors/Infra team/Community:
> > > >
> > > > ==========================
> > > >
> > > > Please provide your suggestions on how we can proceed and work on
> > > > stabilizing the CI build system for MXNet.
> > > >
> > > > Also, if the community decides on a separate Jenkins CI build system,
> > > > what important points should be taken care of apart from the below:
> > > >
> > > >    1. Community being able to access the build page for build statuses.
> > > >    2. Committers being able to log in with Apache credentials.
> > > >    3. Hook setup from the apache/incubator-mxnet repo to Jenkins master.
> > > >
> > > >
> > > > Irrespective of the solution we come up with, I think we should
> > > > initiate a technical design discussion on how to set up the CI build
> > > > system: probably a 1- or 2-page document with the architecture,
> > > > reviewed with Infra and community members.
> > > >
> > > > ***There were a few proposals and discussions on the Slack channel; to
> > > > reach wider community members, I am moving that discussion formally to
> > > > this list.
> > > >
> > > >
> > > > My Proposal: Option 1 - Set up separate Jenkins CI build system.
> > > >
> > > > Thanks,
> > > >
> > > > Sandeep
> > > >
> > > >
> > > >
> > > > --
> > > > Sandeep Krishnamurthy
> > > >
> > >
> >
>
