mxnet-dev mailing list archives

From Pedro Larroy <pedro.larroy.li...@gmail.com>
Subject Re: CI and PRs
Date Wed, 14 Aug 2019 17:32:40 GMT
Yes, another point is that pushing again to a PR should cancel the previous
builds. That is not happening now, which wastes resources.
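
To make the rule concrete, here is a minimal sketch of the selection logic only: given the known builds, pick the running ones that have been superseded by a newer build of the same PR. The `Build` record and the `superseded_builds` helper are hypothetical names for illustration; this is not the Jenkins API, and actually aborting a build would be a separate call into whatever CI system is in use.

```python
from typing import NamedTuple


class Build(NamedTuple):
    """Hypothetical record of one CI build (not a Jenkins type)."""
    pr: int          # pull request number
    number: int      # build number, increasing with every push
    running: bool    # True while the build has not finished


def superseded_builds(builds):
    """Return running builds that are older than the newest build of their PR."""
    newest = {}
    for b in builds:
        newest[b.pr] = max(newest.get(b.pr, b.number), b.number)
    return [b for b in builds if b.running and b.number < newest[b.pr]]


if __name__ == "__main__":
    builds = [
        Build(pr=1, number=10, running=True),   # superseded by build 11
        Build(pr=1, number=11, running=True),   # newest build for PR 1
        Build(pr=2, number=5, running=True),    # only build for PR 2
    ]
    for b in superseded_builds(builds):
        print(f"would cancel build {b.number} of PR {b.pr}")
```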

Any ideas on how to make CI more robust against connection errors? The Ivy
cache for JVM packages, for example, could be pre-populated on the workers.
It's a balance between efficiency and keeping things simple.

Maybe Maven has some settings to retry failed downloads, for example. For
failures downloading GPG keys, we just stored the keys in the repository to
avoid networking problems.
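
As an illustration of the retry idea for downloads done from our own scripts, here is a minimal sketch of retrying with exponential backoff and jitter. The `retry` and `download` helpers and their parameters are assumptions made up for this example, not existing CI code and not a Maven or Ivy setting. Note that it also retries on HTTP errors such as 404, which a real version would probably want to exclude.

```python
import random
import time
import urllib.error
import urllib.request

# Exceptions treated as transient network failures (an assumption of this sketch).
TRANSIENT = (urllib.error.URLError, TimeoutError, ConnectionError)


def retry(fn, attempts=4, base_delay=1.0, max_delay=30.0,
          retry_on=TRANSIENT, sleep=time.sleep):
    """Call fn(); on a transient error wait and try again, up to `attempts` calls."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            # Exponential backoff (1s, 2s, 4s, ...) capped at max_delay,
            # plus random jitter so parallel workers do not retry in lockstep.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 2))


def download(url, dest, timeout=30):
    """Download url to dest, retrying transient network failures."""
    def fetch():
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            data = resp.read()
        with open(dest, "wb") as out:
            out.write(data)
    retry(fetch)
```

The `sleep` parameter exists only so the waiting can be replaced in tests; in normal use the default `time.sleep` applies.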


On Wed, Aug 14, 2019 at 9:39 AM Chaitanya Bapat <chai.bapat@gmail.com>
wrote:

> Pedro,
>
> great job of summarizing the set of tasks to restore CI's glory!
> As far as your list goes,
>
> > * Address unit tests that take more than 10-20s, streamline them or move
> > them to nightly if it can't be done.
>
> I would like to call out this request specifically. I'm tracking the number
> of timeouts that happen (this is by no means an exhaustive list) in issue
> #15880 <https://github.com/apache/incubator-mxnet/issues/15880>.
> It's unreasonable for CI to run tests for 3 hours, so we do need to
> address this issue with greater intent.
>
> Moreover, to add to the tale of CI woes, we should make it robust enough
> for network connection errors.
> At times, CI fails due to inability to fetch some packages.
> 1. The error log doesn't mention the corrective action for the PR author
> (retriggering the CI).
> 2. It would be great if CI handled this smartly (or offered some way to
> speed up getting the CI to pass).
>
> Hopefully, with the help of community, we would be able to catch exceptions
> and make CI great again!
>
>
> On Wed, 14 Aug 2019 at 05:09, Carin Meier <carinmeier@gmail.com> wrote:
>
> > Before any binding tests are moved to nightly, I think we need to figure
> > out how the community can get proper notifications of failure and success
> > on those nightly runs. Otherwise, I think that breakages would go
> > unnoticed.
> >
> > -Carin
> >
> > On Tue, Aug 13, 2019 at 7:47 PM Pedro Larroy <
> pedro.larroy.lists@gmail.com
> > >
> > wrote:
> >
> > > Hi
> > >
> > > Seems we are hitting some problems in CI. I propose the following
> action
> > > items to remedy the situation and accelerate turn around times in CI,
> > > reduce cost, complexity and probability of failure blocking PRs and
> > > frustrating developers:
> > >
> > > * Upgrade Visual Studio on Windows from VS 2015 to VS 2017. The
> > > build_windows.py infrastructure should easily work with the new
> version.
> > > Currently some PRs are blocked by this:
> > > https://github.com/apache/incubator-mxnet/issues/13958
> > > * Move Gluon Model zoo tests to nightly. Tracked at
> > > https://github.com/apache/incubator-mxnet/issues/15295
> > > * Move non-python bindings tests to nightly. If a commit is touching
> > other
> > > bindings, the reviewer should ask for a full run which can be done
> > locally,
> > > use the label bot to trigger a full CI build, or defer to nightly.
> > > * Provide a couple of basic sanity performance tests on small models
> that
> > > are run on CI and can be echoed by the label bot as a comment for PRs.
> > > * Address unit tests that take more than 10-20s, streamline them or
> move
> > > them to nightly if it can't be done.
> > > * Open sourcing the remaining CI infrastructure scripts so the community
> > > can contribute.
> > >
> > > I think our goal should be turnaround under 30min.
> > >
> > > I would also like to touch base with the community that some PRs are
> not
> > > being followed up by committers asking for changes. For example this PR
> > is
> > > important and has been pending for a long time.
> > >
> > > https://github.com/apache/incubator-mxnet/pull/15051
> > >
> > > This is another, less important but more trivial to review:
> > >
> > > https://github.com/apache/incubator-mxnet/pull/14940
> > >
> > > I think committers requesting changes and not following up in
> reasonable
> > > time is not healthy for the project. I suggest configuring GitHub
> > > notifications for a good signal-to-noise ratio and following up.
> > >
> > > Regards.
> > >
> > > Pedro.
> > >
> >
>
>
> --
> *Chaitanya Prakash Bapat*
> *+1 (973) 953-6299*
>
