mxnet-dev mailing list archives

From "Larroy, Pedro" <pllar...@amazon.de>
Subject Re: [Proposal] Stabilizing Apache MXNet CI build system
Date Thu, 09 Nov 2017 12:40:51 GMT
Thanks a lot for the document and leading the discussion.

Does anybody have experience with a build system other than Jenkins? In the document we mention
TeamCity as a possible option, and there’s also the second leading open source CI tool, “Buildbot”,
which is not mentioned.

I’m not sure we have strong enough evidence to make an informed decision about using something
other than Jenkins. Also, from the document I gather that the negatives of Jenkins are pretty
minor compared to those of the other frameworks.

I would be interested to hear whether somebody has used any other framework in depth and is willing
to vote against using Jenkins, so we can all cast an informed vote.

I also don’t feel comfortable voting for Jenkins merely because it is the only one I know.

Kind regards.
-- 

Pedro

On 08/11/17 23:41, "Meghna Baijal" <meghnabaijal2017@gmail.com> wrote:

    Thanks for the active discussion on the document for the new CI for MXNet.
    Now that many of you have reviewed it, do you think I should start a vote
    on which framework the community wants to move forward with?
    
    Thanks,
    Meghna
    
    On Mon, Nov 6, 2017 at 6:59 PM, Chris Olivier <cjolivier01@gmail.com> wrote:
    
    > After a decision is reached, I am willing to add tasks to Apache MXNet JIRA.
    >
    > On Mon, Nov 6, 2017 at 6:15 AM, Pedro Larroy <pedro.larroy.lists@gmail.com> wrote:
    >
    > > Thanks for setting up the document, guys. It looks like a solid basis to
    > > start working on!
    > >
    > > Marco, Kellen and I have already added some comments.
    > >
    > > Pedro
    > >
    > >
    > > On Sun, Nov 5, 2017 at 3:43 AM, Meghna Baijal
    > > <meghnabaijal2017@gmail.com> wrote:
    > > > Kellen, thank you for your comments in the doc.
    > > > Sure Steffen, I will continue to merge everyone’s comments into the doc
    > > > and work with Pedro to finalize it.
    > > > And then we can vote on the options.
    > > >
    > > > Thanks,
    > > > Meghna Baijal
    > > >
    > > >
    > > > On Sat, Nov 4, 2017 at 6:34 AM, Steffen Rochel <steffenrochel@gmail.com> wrote:
    > > >
    > > >> Sandeep and Meghna have been working in the background collecting input
    > > >> and preparing a doc. I suggest we drive the discussion forward and would
    > > >> like to ask everybody to contribute to
    > > >> https://docs.google.com/document/d/17PEasQ2VWrXi2Cf7IGZSWGZMawxDkdlavUDASzUmLjk/edit?usp=sharing
    > > >>
    > > >> Let's converge on requirements and architecture, so we can move forward
    > > >> with implementation.
    > > >>
    > > >> I would like to suggest that Pedro and Meghna lead the discussion and
    > > >> help to resolve suggestions.
    > > >>
    > > >> I assume we need a vote once we have converged on a good draft, to call it
    > > >> a plan and move forward with implementation. As we are all unhappy with
    > > >> the current CI situation, I would also suggest a phased approach, so we
    > > >> can get back to reliable and efficient basic CI quickly and add advanced
    > > >> capabilities over time.
    > > >>
    > > >> Steffen
    > > >>
    > > >> On Wed, Nov 1, 2017 at 1:14 PM kellen sunderland <kellen.sunderland@gmail.com> wrote:
    > > >>
    > > >> > Hey Henri, I think that's what a few of us are advocating: running a
    > > >> > set of quick tests as part of the PR process, and then a more detailed
    > > >> > regression test suite periodically (say every 4 hours). This fits
    > > >> > nicely into a tagging or two-branch development system. Commits will be
    > > >> > tagged (or merged into a stable branch) as soon as they pass the
    > > >> > detailed regression testing.
    > > >> >
    > > >> > On Wed, Nov 1, 2017 at 9:07 PM, Hen <bayard@apache.org> wrote:
    > > >> >
    > > >> > > Random question - can the CI be split such that the Apache CI is doing
    > > >> > > a basic set of checks on that hardware, and is hooked to a PR, while
    > > >> > > there is a larger "Is trunk good for release?" test that is running
    > > >> > > periodically rather than on every PR?
    > > >> > >
    > > >> > > i.e., do we need each PR to be run on varied hardware, or can we have
    > > >> > > this two-tier approach?
    > > >> > >
    > > >> > > Hen
    > > >> > >
    > > >> > > On Fri, Oct 20, 2017 at 1:01 PM, sandeep krishnamurthy <sandeep.krishna98@gmail.com> wrote:
    > > >> > >
    > > >> > > > Hello all,
    > > >> > > >
    > > >> > > > I am hereby opening up a discussion thread on how we can stabilize
    > > >> > > > the Apache MXNet CI build system.
    > > >> > > >
    > > >> > > > Problems:
    > > >> > > >
    > > >> > > > ========
    > > >> > > >
    > > >> > > > Recently, we have seen the following issues with the Apache MXNet CI
    > > >> > > > build systems:
    > > >> > > >
    > > >> > > >    1. Apache Jenkins master is overloaded and we see issues like being
    > > >> > > >    unable to trigger builds, and difficulty loading and viewing the
    > > >> > > >    Blue Ocean and other Jenkins build status pages.
    > > >> > > >    2. We are generating too many requests/interactions for the Apache
    > > >> > > >    Infra team:
    > > >> > > >       1. Addition/deletion of slaves: caused by scaling activity,
    > > >> > > >       recycling, troubleshooting, or any action leading to a change of
    > > >> > > >       slave machines.
    > > >> > > >       2. Plugins / other Jenkins master configurations.
    > > >> > > >       3. Experimentation on CI pipelines.
    > > >> > > >    3. Harder to debug and resolve issues: since access to master and
    > > >> > > >    slaves is not with the same community, it requires Infra and the
    > > >> > > >    community to dive deep together on all action items.
    > > >> > > >
    > > >> > > > Possible Solutions:
    > > >> > > >
    > > >> > > > ==============
    > > >> > > >
    > > >> > > >    1. Can we set up a separate Jenkins CI build system for Apache MXNet
    > > >> > > >    outside Apache Infra?
    > > >> > > >    2. Can we have a separate Jenkins Master in Apache Infra for MXNet?
    > > >> > > >    3. Review design of current setup, refine and fill the gaps.
    > > >> > > >
    > > >> > > > @ Mentors/Infra team/Community:
    > > >> > > >
    > > >> > > > ==========================
    > > >> > > >
    > > >> > > > Please provide your suggestions on how we can proceed further and work
    > > >> > > > on stabilizing the CI build systems for MXNet.
    > > >> > > >
    > > >> > > > Also, if the community decides on a separate Jenkins CI build system,
    > > >> > > > what important points should be taken care of apart from the below:
    > > >> > > >
    > > >> > > >    1. Community being able to access the build page for build statuses.
    > > >> > > >    2. Committers being able to log in with Apache credentials.
    > > >> > > >    3. Hook setup from apache/incubator-mxnet repo to Jenkins master.
    > > >> > > >
    > > >> > > >
    > > >> > > > Irrespective of the solution we come up with, I think we should
    > > >> > > > initiate a technical design discussion on how to set up the CI build
    > > >> > > > system. Probably a 1 or 2 page document with the architecture, reviewed
    > > >> > > > with Infra and community members.
    > > >> > > >
    > > >> > > > ***There were a few proposals and discussions on the Slack channel; to
    > > >> > > > reach wider community members, I am moving that discussion formally to
    > > >> > > > this list.
    > > >> > > >
    > > >> > > >
    > > >> > > > My Proposal: Option 1 - Set up separate Jenkins CI build system.
    > > >> > > >
    > > >> > > > Thanks,
    > > >> > > >
    > > >> > > > Sandeep
    > > >> > > >
    > > >> > > >
    > > >> > > >
    > > >> > > > --
    > > >> > > > Sandeep Krishnamurthy
    > > >> > > >
    > > >> > >
    > > >> >
    > > >>
    > >
    >
    

Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B