mxnet-dev mailing list archives

From Marco de Abreu <marco.g.ab...@gmail.com>
Subject Re: CI and PRs
Date Wed, 14 Aug 2019 18:47:50 GMT
Hi,

we record a bunch of metrics about run statistics (down to the duration of
every individual step). If you tell me which ones you're particularly
interested in (probably total duration of each node in the test stage), I'm
happy to provide them.

Dimensions are (in hierarchical order):
- job
- branch
- stage
- node
- step
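
As a rough illustration only (this is not the actual CI tooling), records keyed by the dimensions above could be rolled up into a total duration per node for one stage. The record layout, the function name and the sample values below are all hypothetical:

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class StepRecord(NamedTuple):
    # Hypothetical record layout; the fields mirror the dimensions listed above.
    job: str
    branch: str
    stage: str
    node: str
    step: str
    duration_s: float


def total_duration_per_node(records: Iterable[StepRecord], stage: str) -> Dict[str, float]:
    """Sum the step durations of every node that belongs to the given stage."""
    totals: Dict[str, float] = defaultdict(float)
    for record in records:
        if record.stage == stage:
            totals[record.node] += record.duration_s
    return dict(totals)


# Made-up sample data, for illustration only.
sample = [
    StepRecord("job-a", "branch-x", "build", "node-1", "compile", 300.0),
    StepRecord("job-a", "branch-x", "test", "node-1", "setup", 30.0),
    StepRecord("job-a", "branch-x", "test", "node-1", "run", 570.0),
    StepRecord("job-a", "branch-x", "test", "node-2", "run", 900.0),
]

test_totals = total_duration_per_node(sample, "test")
```

Here `test_totals` maps each node of the test stage to the sum of its step durations, which is the "total duration of each node in the test stage" view mentioned above.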

Unfortunately I can't export them, since we store them in CloudWatch
Metrics, which afaik doesn't offer raw exports.

Best regards,
Marco

Carin Meier <carinmeier@gmail.com> wrote on Wed, 14 Aug 2019, 19:43:

> I would prefer to keep the language binding in the PR process. Perhaps we
> could do some analytics to see how much each of the language bindings is
> contributing to overall run time.
> If we have some metrics on that, maybe we can come up with a guideline of
> how much time each should take. Another possibility is to leverage the
> parallel builds more.
>
> On Wed, Aug 14, 2019 at 1:30 PM Pedro Larroy <pedro.larroy.lists@gmail.com> wrote:
>
> > Hi Carin.
> >
> > That's a good point. All things considered, would your preference be to keep
> > the Clojure tests as part of the PR process or in Nightly?
> > Some options are having notifications here or in Slack. But if we think
> > breakages would go unnoticed, maybe it is not a good idea to fully remove
> > bindings from the PR process, and we should just streamline the process instead.
> >
> > Pedro.
> >
> > On Wed, Aug 14, 2019 at 5:09 AM Carin Meier <carinmeier@gmail.com> wrote:
> >
> > > Before any binding tests are moved to nightly, I think we need to figure
> > > out how the community can get proper notifications of failure and success
> > > on those nightly runs. Otherwise, I think that breakages would go
> > > unnoticed.
> > >
> > > -Carin
> > >
> > > On Tue, Aug 13, 2019 at 7:47 PM Pedro Larroy <pedro.larroy.lists@gmail.com> wrote:
> > >
> > > > Hi
> > > >
> > > > It seems we are hitting some problems in CI. I propose the following action
> > > > items to remedy the situation and accelerate turnaround times in CI,
> > > > reduce cost, complexity and probability of failure blocking PRs and
> > > > frustrating developers:
> > > >
> > > > * Upgrade Visual Studio on Windows from VS 2015 to VS 2017. The
> > > > build_windows.py infrastructure should easily work with the new version.
> > > > Currently some PRs are blocked by this:
> > > > https://github.com/apache/incubator-mxnet/issues/13958
> > > > * Move Gluon Model zoo tests to nightly. Tracked at
> > > > https://github.com/apache/incubator-mxnet/issues/15295
> > > > * Move non-Python bindings tests to nightly. If a commit is touching other
> > > > bindings, the reviewer should ask for a full run which can be done locally,
> > > > use the label bot to trigger a full CI build, or defer to nightly.
> > > > * Provide a couple of basic sanity performance tests on small models that
> > > > are run on CI and can be echoed by the label bot as a comment for PRs.
> > > > * Address unit tests that take more than 10-20s: streamline them, or move
> > > > them to nightly if that can't be done.
> > > > * Open source the remaining CI infrastructure scripts so the community
> > > > can contribute.
> > > >
> > > > I think our goal should be turnaround under 30min.
> > > >
> > > > I would also like to raise with the community that some PRs are not
> > > > being followed up by committers asking for changes. For example, this PR is
> > > > important and has been hanging for a long time.
> > > >
> > > > https://github.com/apache/incubator-mxnet/pull/15051
> > > >
> > > > This is another, less important but more trivial to review:
> > > >
> > > > https://github.com/apache/incubator-mxnet/pull/14940
> > > >
> > > > I think committers requesting changes and not following up in reasonable
> > > > time is not healthy for the project. I suggest configuring GitHub
> > > > notifications for a good SNR and following up.
> > > >
> > > > Regards.
> > > >
> > > > Pedro.
> > > >
> > >
> >
>
