mxnet-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Chris Olivier <cjolivie...@gmail.com>
Subject Re: Unit tests removed
Date Thu, 01 Feb 2018 19:45:36 GMT
What do you mean by "conflicted PR"?  Do you mean disagreements in what can
go in or out?  If so, I believe that Apache addresses this.

Per Apache:

"A code-modification proposal may be stopped dead in its tracks by a -1
vote by a qualified voter. This constitutes a veto, and it cannot be
overruled nor overridden by anyone. Vetos stand until and unless withdrawn
by their casters."



On Thu, Feb 1, 2018 at 11:30 AM, Mu Li <muli.cmu@gmail.com> wrote:

> Hi Kellen,
>
> The CPP tests PR totally make sense to me. What I really want to figure out
> is the process that reaching an agreement before merging conflicted PRs.
>
> Best
> Mu
>
> On Thu, Feb 1, 2018 at 11:25 AM, kellen sunderland <
> kellen.sunderland@gmail.com> wrote:
>
> > Hey Mu + Da, sorry to hear that.  I've been working on the CPP tests PR
> and
> > iterating on it for quite a while.  I can assure you getting it merged
> had
> > nothing to do with this revert from my perceptive.  It's actually a task
> > assigned to me for Q1 internally at Amazon.
> >
> > I'm actually completely unfamiliar with what's going regarding the
> revert,
> > but I'm quite looking forward to this PR getting merged.  It seemed to
> > cleanup the code a lot, and I known Asmus and I will be looking closely
> at
> > the perf numbers.  (MKL in the past has given us huge speed boosts on a
> > service we're optimizing).
> >
> > -Kellen
> >
> > On Thu, Feb 1, 2018 at 8:11 PM, Mu Li <muli.cmu@gmail.com> wrote:
> >
> > > I think simply reverting the MKL PR that Da has been working for a half
> > > year and immediately enabling the cpp tests in the CI without reaching
> an
> > > agreement with Da seriously hurts his feelings.
> > >
> > > I put the details at the end of
> > > https://github.com/apache/incubator-mxnet/pull/9661
> > >
> > >
> > > On Thu, Feb 1, 2018 at 10:42 AM, Chris Olivier <cjolivier01@gmail.com>
> > > wrote:
> > >
> > > > I have created starter changes for converting to the CoreOpExecutor.
> > This
> > > > code compiles the first test.
> > > >
> > > > See notes in the PR description:
> > > >
> > > > https://github.com/cjolivier01/mxnet/pull/2
> > > >
> > > > On Wed, Jan 31, 2018 at 9:47 PM, Zhao, Patric <patric.zhao@intel.com
> >
> > > > wrote:
> > > >
> > > > > Thanks, Chris, I agree the quality is the most important thing.
> > > > >
> > > > > We will try to enable these cases ASAP.
> > > > >
> > > > > ---Patric
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Chris Olivier [mailto:cjolivier01@gmail.com]
> > > > > > Sent: Thursday, February 1, 2018 1:04 PM
> > > > > > To: dev@mxnet.incubator.apache.org
> > > > > > Subject: Re: Unit tests removed
> > > > > >
> > > > > > C++ Unit tests test correctness of output as well as consistency
> of
> > > > *all
> > > > > > inputs and outputs* between different implementations (mkl,
cpu,
> > gpu,
> > > > > > cudnn, etc.)
> > > > > >
> > > > > > C++ Unit tests also test that varying channel axes produce the
> > > expected
> > > > > > numeric distributions.
> > > > > >
> > > > > > In addition, *batch norm tests in test_operator.py are disabled
> due
> > > to
> > > > CI
> > > > > > issues*, so this leaves batch norm largely untested (even though
> > > those
> > > > > tests
> > > > > > only tested numeric gradient and nothing else -- they were
> existing
> > > > tests
> > > > > > before I re-factored batch norm and they were weak to begin
> with).
> > > > > >
> > > > > > C++ unit tests also measure performance differences between
the
> > > various
> > > > > > versions (cpu, mkl, gpu, cudnn, etc).
> > > > > > I've made several comments over the course of this mkl-dnn
> project
> > > that
> > > > > the
> > > > > > unit tests need to be fixed to adjust to the refactor, so this
> > > > shouldn't
> > > > > br a
> > > > > > surprise to anyone.
> > > > > >
> > > > > > Hacking out unit tests when it's inconvenient to fix them because
> > of
> > > > your
> > > > > > code changes sets a *dangerous precedent*. What would the rule
> be?
> > > Is
> > > > it
> > > > > > "don't remove tests unless your PR is really big and it would
> take
> > > you
> > > > a
> > > > > > couple of days to fix them"?  We aren't talking about 1-2 tests,
> > > either
> > > > > --
> > > > > > we're talking about a LOT of tests that test various aspects
of
> the
> > > > > operators.
> > > > > >
> > > > > > As for Intel, they are free to use the PR branch, and even if
> they
> > > > > can't, I don't
> > > > > > think that (or anything else) justifies compromising quality.
> > > > > >
> > > > > >
> > > > > > On Wed, Jan 31, 2018 at 8:45 PM, Haibin Lin <
> > > haibin.lin.aws@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Good catch.
> > > > > > >
> > > > > > > In general, I agree that tests are not supposed to removed,
> > > although
> > > > > > > CI is not running any of these cpp tests just yet. Usually
unit
> > > tests
> > > > > > > in python for individual operators should be sufficient
to test
> > the
> > > > > > > correctness of operators (although I don't know how/if
python
> > tests
> > > > > > > can run on edge devices like iOS/Android) and it's not
feasible
> > to
> > > > > > > duplicate all operator python tests with their cpp
> counterparts.
> > > For
> > > > > > > functionalities that are not exposed to python or memory
> checks,
> > I
> > > do
> > > > > > > agree that cpp tests are necessary.
> > > > > > >
> > > > > > > I am curious what are exactly tested in those cpp tests?
I know
> > > that
> > > > > > > BatchNorm ops are tested in python already, but I don't
have a
> > full
> > > > > > > picture of all cpp tests and whether we want to re-implement
> > them.
> > > > > > >
> > > > > > > On the other hand, be aware that the Intel team have a
few
> > MKL-DNN
> > > > > > > related projects depending on this PR. Reverting the PR
will
> be a
> > > > > blocker for
> > > > > > them.
> > > > > > >
> > > > > > > Best,
> > > > > > > Haibin
> > > > > > >
> > > > > > > On Wed, Jan 31, 2018 at 3:30 PM, Marco de Abreu <
> > > > > > > marco.g.abreu@googlemail.com> wrote:
> > > > > > >
> > > > > > > > Here's a link:
> > > > > > > > https://github.com/apache/incubator-mxnet/pull/8302#
> > > > > > > discussion_r165204667
> > > > > > > >
> > > > > > > > -Marco
> > > > > > > >
> > > > > > > > 2018-01-31 15:23 GMT-08:00 Marco de Abreu
> > > > > > > > <marco.g.abreu@googlemail.com>
> > > > > > > :
> > > > > > > >
> > > > > > > > > Hi Chris,
> > > > > > > > >
> > > > > > > > > considering the size of that PR, could you provide
a direct
> > > link
> > > > > > > > > to the changes you're addressing?
> > > > > > > > >
> > > > > > > > > -Marco
> > > > > > > > >
> > > > > > > > > 2018-01-31 13:39 GMT-08:00 Chris Olivier <
> > > cjolivier01@gmail.com
> > > > >:
> > > > > > > > >
> > > > > > > > >> This PR was just merged that removed some
30 or so C++
> unit
> > > > tests
> > > > > > > > >> for batch norm operator.
> > > > > > > > >>
> > > > > > > > >> https://github.com/apache/incubator-mxnet/pull/8302
> > > > > > > > >>
> > > > > > > > >> Is this ok?
> > > > > > > > >>
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > > >
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message