mxnet-dev mailing list archives

From Marco de Abreu <marco.g.ab...@googlemail.com.INVALID>
Subject Re: CI impaired
Date Fri, 30 Nov 2018 14:53:29 GMT
Hello,

I'm now moving forward with #1. I will try to get to #3 as soon as possible
to reduce parallel jobs in our CI. You might notice some unfinished jobs. I
will let you know as soon as this process has been completed. Until then,
please bear with me since we have hundreds of jobs to run in order to
validate all PRs.

Best regards,
Marco

On Fri, Nov 30, 2018 at 1:36 AM Marco de Abreu <marco.g.abreu@googlemail.com>
wrote:

> Hello,
>
> since the release branch has now been cut, I would like to move forward
> with the CI improvements for the master branch. This would include the
> following actions:
> 1. Re-enable the new Jenkins job
> 2. Request Apache Infra to move the protected branch check from the main
> pipeline to our new ones
> 3. Merge https://github.com/apache/incubator-mxnet/pull/13474 - this
> finalizes the deprecation process
>
> If nobody objects, I would like to start with #1 soon. Mentors, could you
> please help create the Apache Infra ticket? I would then take it from
> there and talk to Infra.
>
> Best regards,
> Marco
>
> On Mon, Nov 26, 2018 at 2:47 AM kellen sunderland <
> kellen.sunderland@gmail.com> wrote:
>
>> Sorry, [1] meant to reference
>> https://issues.jenkins-ci.org/browse/JENKINS-37984 .
>>
>> On Sun, Nov 25, 2018 at 5:41 PM kellen sunderland <
>> kellen.sunderland@gmail.com> wrote:
>>
>> > Marco and I ran into another urgent issue over the weekend that was
>> > causing builds to fail.  This issue was unrelated to any feature
>> > development work, or other CI fixes applied recently, but it did require
>> > quite a bit of work from Marco (and a little from me) to fix.
>> >
>> > We spent enough time on the problem that it caused us to take a step
>> > back and consider how we could both fix issues in CI and support the
>> > 1.4 release with the least impact possible on MXNet devs.  Marco had
>> > planned to make a significant change to the CI to fix a long-standing
>> > Jenkins error [1], but we feel that most developers would prioritize
>> > having a stable build environment for the next few weeks over having
>> > this fix in place.
>> >
>> > To properly introduce a new CI system the intent was to do a gradual
>> > blue/green roll out of the fix.  To manage this rollout would have taken
>> > operational effort and double compute load as we run systems in
>> > parallel.  This risks outages due to scaling limits, and we’d rather
>> > make this change during a period of low developer activity, i.e.
>> > shortly after the 1.4 release.
>> >
>> > This means that from now until the 1.4 release, in order to reduce
>> > complexity MXNet developers should only see a single Jenkins
>> > verification check, and a single Travis check.
>> >
>> >
>>
>
