mxnet-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From kellen sunderland <kellen.sunderl...@gmail.com>
Subject Re: Unit tests removed
Date Thu, 01 Feb 2018 19:25:15 GMT
Hey Mu + Da, sorry to hear that.  I've been working on the CPP tests PR and
iterating on it for quite a while.  I can assure you getting it merged had
nothing to do with this revert from my perceptive.  It's actually a task
assigned to me for Q1 internally at Amazon.

I'm actually completely unfamiliar with what's going regarding the revert,
but I'm quite looking forward to this PR getting merged.  It seemed to
cleanup the code a lot, and I known Asmus and I will be looking closely at
the perf numbers.  (MKL in the past has given us huge speed boosts on a
service we're optimizing).

-Kellen

On Thu, Feb 1, 2018 at 8:11 PM, Mu Li <muli.cmu@gmail.com> wrote:

> I think simply reverting the MKL PR that Da has been working for a half
> year and immediately enabling the cpp tests in the CI without reaching an
> agreement with Da seriously hurts his feelings.
>
> I put the details at the end of
> https://github.com/apache/incubator-mxnet/pull/9661
>
>
> On Thu, Feb 1, 2018 at 10:42 AM, Chris Olivier <cjolivier01@gmail.com>
> wrote:
>
> > I have created starter changes for converting to the CoreOpExecutor. This
> > code compiles the first test.
> >
> > See notes in the PR description:
> >
> > https://github.com/cjolivier01/mxnet/pull/2
> >
> > On Wed, Jan 31, 2018 at 9:47 PM, Zhao, Patric <patric.zhao@intel.com>
> > wrote:
> >
> > > Thanks, Chris, I agree the quality is the most important thing.
> > >
> > > We will try to enable these cases ASAP.
> > >
> > > ---Patric
> > >
> > > > -----Original Message-----
> > > > From: Chris Olivier [mailto:cjolivier01@gmail.com]
> > > > Sent: Thursday, February 1, 2018 1:04 PM
> > > > To: dev@mxnet.incubator.apache.org
> > > > Subject: Re: Unit tests removed
> > > >
> > > > C++ Unit tests test correctness of output as well as consistency of
> > *all
> > > > inputs and outputs* between different implementations (mkl, cpu, gpu,
> > > > cudnn, etc.)
> > > >
> > > > C++ Unit tests also test that varying channel axes produce the
> expected
> > > > numeric distributions.
> > > >
> > > > In addition, *batch norm tests in test_operator.py are disabled due
> to
> > CI
> > > > issues*, so this leaves batch norm largely untested (even though
> those
> > > tests
> > > > only tested numeric gradient and nothing else -- they were existing
> > tests
> > > > before I re-factored batch norm and they were weak to begin with).
> > > >
> > > > C++ unit tests also measure performance differences between the
> various
> > > > versions (cpu, mkl, gpu, cudnn, etc).
> > > > I've made several comments over the course of this mkl-dnn project
> that
> > > the
> > > > unit tests need to be fixed to adjust to the refactor, so this
> > shouldn't
> > > br a
> > > > surprise to anyone.
> > > >
> > > > Hacking out unit tests when it's inconvenient to fix them because of
> > your
> > > > code changes sets a *dangerous precedent*. What would the rule be?
> Is
> > it
> > > > "don't remove tests unless your PR is really big and it would take
> you
> > a
> > > > couple of days to fix them"?  We aren't talking about 1-2 tests,
> either
> > > --
> > > > we're talking about a LOT of tests that test various aspects of the
> > > operators.
> > > >
> > > > As for Intel, they are free to use the PR branch, and even if they
> > > can't, I don't
> > > > think that (or anything else) justifies compromising quality.
> > > >
> > > >
> > > > On Wed, Jan 31, 2018 at 8:45 PM, Haibin Lin <
> haibin.lin.aws@gmail.com>
> > > > wrote:
> > > >
> > > > > Good catch.
> > > > >
> > > > > In general, I agree that tests are not supposed to removed,
> although
> > > > > CI is not running any of these cpp tests just yet. Usually unit
> tests
> > > > > in python for individual operators should be sufficient to test the
> > > > > correctness of operators (although I don't know how/if python tests
> > > > > can run on edge devices like iOS/Android) and it's not feasible to
> > > > > duplicate all operator python tests with their cpp counterparts.
> For
> > > > > functionalities that are not exposed to python or memory checks,
I
> do
> > > > > agree that cpp tests are necessary.
> > > > >
> > > > > I am curious what are exactly tested in those cpp tests? I know
> that
> > > > > BatchNorm ops are tested in python already, but I don't have a full
> > > > > picture of all cpp tests and whether we want to re-implement them.
> > > > >
> > > > > On the other hand, be aware that the Intel team have a few MKL-DNN
> > > > > related projects depending on this PR. Reverting the PR will be a
> > > blocker for
> > > > them.
> > > > >
> > > > > Best,
> > > > > Haibin
> > > > >
> > > > > On Wed, Jan 31, 2018 at 3:30 PM, Marco de Abreu <
> > > > > marco.g.abreu@googlemail.com> wrote:
> > > > >
> > > > > > Here's a link:
> > > > > > https://github.com/apache/incubator-mxnet/pull/8302#
> > > > > discussion_r165204667
> > > > > >
> > > > > > -Marco
> > > > > >
> > > > > > 2018-01-31 15:23 GMT-08:00 Marco de Abreu
> > > > > > <marco.g.abreu@googlemail.com>
> > > > > :
> > > > > >
> > > > > > > Hi Chris,
> > > > > > >
> > > > > > > considering the size of that PR, could you provide a direct
> link
> > > > > > > to the changes you're addressing?
> > > > > > >
> > > > > > > -Marco
> > > > > > >
> > > > > > > 2018-01-31 13:39 GMT-08:00 Chris Olivier <
> cjolivier01@gmail.com
> > >:
> > > > > > >
> > > > > > >> This PR was just merged that removed some 30 or so
C++ unit
> > tests
> > > > > > >> for batch norm operator.
> > > > > > >>
> > > > > > >> https://github.com/apache/incubator-mxnet/pull/8302
> > > > > > >>
> > > > > > >> Is this ok?
> > > > > > >>
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message