mxnet-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Joern Kottmann <kottm...@gmail.com>
Subject Re: Apache MXNet build failures are mostly valid - verify before merge
Date Thu, 28 Sep 2017 12:02:00 GMT
At Apache OpenNLP we just established among committers that you check
that the status indicator is green before you merge,
and if it wasn't the case then we would ask the committer to take
responsibility and repair things. Works very well our build is never
broken.

We also strongly prefer if each PR gets reviewed by another committer.

Overall this works quite well. We don't use any of the protections
against merging, it is important that you can trust each of the
committers not to break things, if there are problems it is better to
resolve them with talking to each other, rather than enforcing green
status checks.

Jörn

On Thu, Sep 28, 2017 at 8:21 AM, Chris Olivier <cjolivier01@gmail.com> wrote:
> +1 on that
>
> On Wed, Sep 27, 2017 at 11:15 PM Gautam <gautamnitc@gmail.com> wrote:
>
>> Hi Chris,
>>
>>   Here <https://help.github.com/articles/about-protected-branches/> is
>> user
>> document on semantics of protected branch.
>> In short when a branch is protected following applies to that branch.
>>
>>    - Can't be force pushed
>>    - Can't be deleted
>>    - Can't have changes merged into it until required status checks
>>    <https://help.github.com/articles/about-required-status-checks> pass
>>    - Can't have changes merged into it until required reviews are approved
>>    <
>> https://help.github.com/articles/approving-a-pull-request-with-required-reviews
>> >
>>    - Can't be edited or have files uploaded to it from the web
>>    - Can't have changes merged into it until changes to files that
>> have a designated
>>    code owner <https://help.github.com/articles/about-codeowners/> have
>>    been approved by that owner
>>
>>  I am sure many of us might not want to have all these but we can debate on
>> it. My main motive was to "*Can't have changes merged into it until
>> required status checks pass*"
>>
>>
>> -Gautam
>>
>>
>>
>> On Wed, Sep 27, 2017 at 11:09 PM, Chris Olivier <cjolivier01@gmail.com>
>> wrote:
>>
>> > What does that mean? "Protected"? Protected from what?
>> >
>> > On Wed, Sep 27, 2017 at 11:08 PM Gautam <gautamnitc@gmail.com> wrote:
>> >
>> > > Hi Chris,
>> > >
>> > >    I mean make "master branch protected" of  MXNet.
>> > >
>> > > -Gautam
>> > >
>> > > On Wed, Sep 27, 2017 at 11:04 PM, Chris Olivier <cjolivier01@gmail.com
>> >
>> > > wrote:
>> > >
>> > > > What does this mean? "Mx-net branch protected"?
>> > > >
>> > > > On Wed, Sep 27, 2017 at 9:59 PM Tsuyoshi OZAWA <
>> > ozawa.tsuyoshi@gmail.com
>> > > >
>> > > > wrote:
>> > > >
>> > > > > +1,
>> > > > >
>> > > > > While I'm checking the recent build failures, and I think the
>> > decision
>> > > > > of making the mx-net branch protected is necessary for stable
>> > > > > building.
>> > > > > Thanks Kumar for resuming important discussion.
>> > > > >
>> > > > > Best regards
>> > > > > - Tsuyoshi
>> > > > >
>> > > > > On Thu, Sep 28, 2017 at 12:56 PM, Kumar, Gautam <gauta@amazon.com>
>> > > > wrote:
>> > > > > > Reviving the discussion.
>> > > > > >
>> > > > > > At this point of time we have couple of stable builds
>> > > > > >
>> > > > > https://builds.apache.org/view/Incubator%20Projects/job/
>> > > > incubator-mxnet/job/master/448/
>> > > > > >
>> > > > > https://builds.apache.org/view/Incubator%20Projects/job/
>> > > > incubator-mxnet/job/master/449/
>> > > > > >
>> > > > > > Should we have a quick discussion or polling on making the
mx-net
>> > > > branch
>> > > > > protected? If you still think we shouldn’t make it protected
please
>> > > > provide
>> > > > > a reason to support your claim.
>> > > > > >
>> > > > > > Few of us have concern over Jenkin’s stability. If I look
two
>> weeks
>> > > > > back, after upgrading Linux slave to g2.8x and new windows AMI,
we
>> > have
>> > > > not
>> > > > > seen any case where instance died due to high memory usage or
any
>> > > process
>> > > > > got killed due to high cpu usage or any other issue with windows
>> > > slaves.
>> > > > > >
>> > > > > > Going forward we are also planning that if we add any new
slave
>> we
>> > > will
>> > > > > not enable the main load immediately, but rather will do ‘test
>> build’
>> > > to
>> > > > > make sure that new slaves are not causing any infrastructure
issue
>> > and
>> > > > > capable to perform as good as existing slaves.
>> > > > > >
>> > > > > > -Gautam
>> > > > > >
>> > > > > > On 8/31/17, 5:27 PM, "Lupesko, Hagay" <lupesko@gmail.com>
wrote:
>> > > > > >
>> > > > > >     @madan looking into some failures – you’re right…
there’s
>> > > multiple
>> > > > > issues going on, some of them intermittent, and we want to be
able
>> to
>> > > > merge
>> > > > > fixes in.
>> > > > > >     Agreed that we can wait with setting up protected mode
until
>> > > build
>> > > > > stabilizes.
>> > > > > >
>> > > > > >     On 8/31/17, 11:41, "Madan Jampani" <madan.jampani@gmail.com>
>> > > > wrote:
>> > > > > >
>> > > > > >         @hagay: we agree on the end state. I'm not too particular
>> > > about
>> > > > > how we get
>> > > > > >         there. If you think enabling it now and fixes regression
>> > > later
>> > > > > is doable,
>> > > > > >         I'm fine with. I see a bit of a chicken and egg
problem.
>> We
>> > > > need
>> > > > > to get
>> > > > > >         some fixes in even when the status checks are failing.
>> > > > > >
>> > > > > >         On Thu, Aug 31, 2017 at 11:25 AM, Lupesko, Hagay
<
>> > > > > lupesko@gmail.com> wrote:
>> > > > > >
>> > > > > >         > @madan – re: getting to a stable CI first:
>> > > > > >         > I’m concerned that by not enabling protected
branch
>> mode
>> > > > ASAP,
>> > > > > we’re just
>> > > > > >         > taking in more regressions, which makes a stable
build
>> a
>> > > > > moving target for
>> > > > > >         > us…
>> > > > > >         >
>> > > > > >         > On 8/31/17, 10:49, "Zha, Sheng" <zhasheng@amazon.com>
>> > > wrote:
>> > > > > >         >
>> > > > > >         >     Just one thing: please don’t disable
more tests or
>> > just
>> > > > > raise the
>> > > > > >         > tolerance thresholds.
>> > > > > >         >
>> > > > > >         >     Best regards,
>> > > > > >         >     -sz
>> > > > > >         >
>> > > > > >         >     On 8/31/17, 10:45 AM, "Madan Jampani" <
>> > > > > madan.jampani@gmail.com> wrote:
>> > > > > >         >
>> > > > > >         >         +1
>> > > > > >         >         Before we can turn protected mode I
feel we
>> > should
>> > > > > first get to a
>> > > > > >         > stable CI
>> > > > > >         >         pipeline.
>> > > > > >         >         Sandeep is chasing down known breaking
issues.
>> > > > > >         >
>> > > > > >         >
>> > > > > >         >         On Thu, Aug 31, 2017 at 10:27 AM, Hagay
>> Lupesko <
>> > > > > lupesko@gmail.com>
>> > > > > >         > wrote:
>> > > > > >         >
>> > > > > >         >         > Build stability is a major issue,
builds have
>> > > been
>> > > > > failing left
>> > > > > >         > and right
>> > > > > >         >         > over the last week. Some of it
is due to
>> > Jenkins
>> > > > > slave issues,
>> > > > > >         > but some are
>> > > > > >         >         > real regressions.
>> > > > > >         >         > We need to be more strict in the
code we're
>> > > > > committing.
>> > > > > >         >         >
>> > > > > >         >         > I propose we configure our master
to be a
>> > > protected
>> > > > > branch (
>> > > > > >         >         >
>> > > > > https://help.github.com/articles/about-protected-branches/).
>> > > > > >         >         >
>> > > > > >         >         > Thoughts?
>> > > > > >         >         >
>> > > > > >         >         > On 2017-08-28 22:41, sandeep krishnamurthy
<
>> > > > > s...@gmail.com>
>> > > > > >         > wrote:
>> > > > > >         >         > > Hello Committers and Contributors,>
>> > > > > >         >         > >
>> > > > > >         >         > > Due to unstable build pipelines,
from past
>> 1
>> > > > week,
>> > > > > PRs are
>> > > > > >         > being merged>
>> > > > > >         >         > > after CR ignoring PR build
status. Build
>> > > pipeline
>> > > > > is much more
>> > > > > >         > stable
>> > > > > >         >         > than>
>> > > > > >         >         > > last week and most of the
build failures
>> you
>> > > see
>> > > > > from now on,
>> > > > > >         > are likely
>> > > > > >         >         > to>
>> > > > > >         >         > > be a valid failure and hence,
it is
>> > recommended
>> > > > to
>> > > > > wait for PR
>> > > > > >         > builds,
>> > > > > >         >         > see>
>> > > > > >         >         > > the root cause of any build
failures before
>> > > > > proceeding with
>> > > > > >         > merges.>
>> > > > > >         >         > >
>> > > > > >         >         > > At this point of time, there
are 2
>> > intermittent
>> > > > > issue yet to
>> > > > > >         > be fixed ->
>> > > > > >         >         > > * Network error leading to
GitHub requests
>> > > > > throwing 404>
>> > > > > >         >         > > * A conflict in artifacts
generated between
>> > > > > branches/PR -
>> > > > > >         > Cause unknown
>> > > > > >         >         > yet.>
>> > > > > >         >         > > These issues will be fixed
soon.>
>> > > > > >         >         > >
>> > > > > >         >         > >
>> > > > > >         >         > > -- >
>> > > > > >         >         > > Sandeep Krishnamurthy>
>> > > > > >         >         > >
>> > > > > >         >         >
>> > > > > >         >
>> > > > > >         >
>> > > > > >         >
>> > > > > >         >
>> > > > > >         >
>> > > > > >         >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > --
>> > > > > - Tsuyoshi
>> > > > >
>> > > >
>> > >
>> > >
>> > >
>> > > --
>> > > Best Regards,
>> > > Gautam Kumar
>> > >
>> >
>>
>>
>>
>> --
>> Best Regards,
>> Gautam Kumar
>>

Mime
View raw message