mxnet-dev mailing list archives

From "Lausen, Leonard" <lau...@amazon.com.INVALID>
Subject Re: Proposal to make MKLDNN as default CPU backend
Date Mon, 10 Feb 2020 18:00:41 GMT
Hi,

as the respective PR [1] has been open for a while and there has been no
follow-up to Patric's mail, I suggest merging it once CI passes after Tao's
conflict resolution earlier today.

This gives community members time to test for regressions prior to the 1.7
release. If any are found, we can reconsider the decision.

Best regards
Leonard

[1]: https://github.com/apache/incubator-mxnet/pull/16899
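For community members testing the merged build, one quick check is whether the installed wheel actually ships MKLDNN. A hedged sketch (assuming `mxnet.runtime.Features`, available in MXNet >= 1.5; the `mkldnn_enabled` helper name is mine, and the import guard lets the snippet run even where MXNet is absent):

```python
# Hypothetical helper: report whether the installed MXNet wheel was built
# with MKL-DNN, or None when MXNet itself is not installed.
def mkldnn_enabled():
    try:
        from mxnet.runtime import Features  # available in MXNet >= 1.5
    except ImportError:
        return None  # MXNet not installed
    return Features().is_enabled("MKLDNN")

print(mkldnn_enabled())
```

On a wheel built with MKLDNN=1 this should print True; on the proposed mxnet-native build, False.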

On Wed, 2019-11-20 at 05:27 +0000, Zhao, Patric wrote:
> Thanks for all of the great suggestions.
> 
> Regarding the binary releases, including builds w/o MKLDNN, I have summarized a
> table (see attachment).
> 
> - Major changes in python packages; see the attached table.
> - Switch on MKLDNN for the no-mkl-suffix binaries in release 1.7 (red check mark)
> - Add a new mxnet-native build w/o MKLDNN and cuDNN (yellow background);
>   track its usage/downloads for 1-2 releases and then decide whether we need
>   it long-term
> - Drop all mkl-suffix binaries in the next major release, v2.x.
> 
> Thanks,
> 
> --Patric
> 
> > -----Original Message-----
> > From: Lin Yuan <apeforest@gmail.com>
> > Sent: Wednesday, November 20, 2019 5:40 AM
> > To: dev@mxnet.incubator.apache.org
> > Cc: Tao Lv <mutouorz@gmail.com>
> > Subject: Re: Proposal to make MKLDNN as default CPU backend
> > 
> > Also per Sam's suggestion, we could still release a build without MKLDNN
> > (name it mxnet-nomkldnn?) and track the usage/download for one or two
> > releases. If there is no usage, we could drop that build in the future.
> > 
> > Best,
> > 
> > Lin
> > 
> > On Tue, Nov 19, 2019 at 1:23 PM Lin Yuan <apeforest@gmail.com> wrote:
> > 
> > > Just to summarize, based on the concerns Marco raised and discussed
> > > above:
> > > - AMD CPU (it should work with MKLDNN:
> > > https://cwiki.apache.org/confluence/display/MXNET/MXNet+with+Intel+MKL-DNN+-+Performance+Benchmarking
> > > )
> > > - ARM CPU (we don't have it today w/o MKLDNN either)
> > > - Windows (Windows support is there regardless of MKLDNN or not)
> > > - GPU and MKLDNN enabled (already supported)
> > > - Fully reproducible results (medical and financial sector requested
> > > that and we have some flags for cuda) (The nondeterminism exists even
> > > today w/o MKLDNN. We should address it regardless of MKLDNN)
> > > 
> > > Marco, please let us know whether your concerns are properly addressed.
> > > 
> > > Given that MKLDNN gives a significant performance speedup on CPU, I am
> > > inclined to make it the default in the pip build.
> > > 
> > > Best,
> > > 
> > > Lin
> > > 
> > > On Tue, Nov 19, 2019 at 8:08 AM Chris Olivier <cjolivier01@gmail.com>
> > > wrote:
> > > 
> > > > Thanks, Patric. I was just trying to point out that there was
> > > > currently no guarantee of deterministic results without MKL, so
> > > > there’s not necessarily an expectation of determinism with MKL (i.e.
> > > > the requirement isn’t relaxed).
> > > > On Mon, Nov 18, 2019 at 9:38 PM Zhao, Patric <patric.zhao@intel.com>
> > > > wrote:
> > > > 
> > > > > It may be a concern, but a little noise can't affect the final
> > > > > results if the algorithm is numerically stable.
> > > > > The MKLDNN backend with mxnet-mkl has been used for 2 years and we
> > > > > didn't see convergence issues caused by multi-threading.
> > > > > In other words, the GPU programming model works well for training,
> > > > > where nondeterminism from multiple threads also exists.
> > > > > 
> > > > > Some training accuracy results were posted in the first PR, when
> > > > > MKLDNN was integrated:
> > > > > https://github.com/apache/incubator-mxnet/pull/8302#issuecomment-359674818
> > > > > 
> > > > > In conclusion, it may happen with very low probability. I believe
> > > > > we can find a solution in case it happens someday.
> > > > > 
> > > > > Thanks,
> > > > > 
> > > > > --Patric
> > > > > 
> > > > > 
> > > > > > -----Original Message-----
> > > > > > From: Chris Olivier <cjolivier01@gmail.com>
> > > > > > Sent: Tuesday, November 19, 2019 11:51 AM
> > > > > > To: dev@mxnet.incubator.apache.org
> > > > > > Cc: Tao Lv <mutouorz@gmail.com>
> > > > > > Subject: Re: Proposal to make MKLDNN as default CPU backend
> > > > > > 
> > > > > > (for non mkl dropout, for instance)
> > > > > > 
> > > > > > On Mon, Nov 18, 2019 at 7:50 PM Chris Olivier
> > > > > > <cjolivier01@gmail.com>
> > > > > > wrote:
> > > > > > 
> > > > > > > To address the deterministic item, I know for a fact that
> > > > > > > training will not be deterministic in some cases where the
> > > > > > > “parallel random” class is utilized in parallel threads, such
> > > > > > > as OMP, if the number of cores is different, even with the same
> > > > > > > seed, because threads are seeded independently and a different
> > > > > > > number of threads will end up generating different random
> > > > > > > number sequences. The Dropout operator is an example.
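Chris's seeding point can be illustrated outside MXNet with a small plain-Python sketch (the seed-derivation scheme below is an illustrative assumption, not MXNet's actual implementation): when each worker derives its seed from the master seed plus its thread id, the combined stream changes with the thread count, even though the master seed and total draw count are fixed.

```python
import random

def parallel_draws(seed, num_threads, draws_per_thread):
    # Each worker seeds its own generator from the master seed and its
    # thread id, mimicking an OMP-style per-thread "parallel random" class.
    out = []
    for tid in range(num_threads):
        rng = random.Random(seed + tid)
        out.extend(rng.random() for _ in range(draws_per_thread))
    return out

# Same master seed and same total of 8 draws, but different "core" counts:
a = parallel_draws(seed=42, num_threads=2, draws_per_thread=4)
b = parallel_draws(seed=42, num_threads=4, draws_per_thread=2)
print(a == b)  # the two sequences diverge from the third draw onward
```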
> > > > > > > On Mon, Nov 18, 2019 at 6:39 PM Alfredo Luque
> > > > > > > <alfredo.luque@airbnb.com.invalid> wrote:
> > > > > > > 
> > > > > > > > For AMD CPUs, you’d want to perform validation because now
> > > > > > > > MKL-DNN would be enabled by default. Historically, other
> > > > > > > > Intel libraries (along with the ICC compiler) have had
> > > > > > > > performance issues on AMD CPUs. It’s just worth double
> > > > > > > > checking to make sure that’s not the case here. Perhaps some
> > > > > > > > MKL-DNN authors can chime in, though. It’s not sufficient to
> > > > > > > > double check that an AVX2 package passes tests.
> > > > > > > > 
> > > > > > > > Agreed in the case we’re not releasing ARM binaries.
> > > > > > > > 
> > > > > > > > The reproducibility argument is around the results being
> > > > > > > > numerically reproducible. That is, e.g., if I train a model
> > > > > > > > with some fixed set of data, some random seed, etc., and
> > > > > > > > then run inference on it, do I get the exact same floating
> > > > > > > > point values for the weights and results? Does MXNet already
> > > > > > > > offer this without MKL-DNN?
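The check Alfredo describes can be made concrete as a bit-for-bit comparison of weights across two identically seeded runs; a minimal stand-in harness (plain Python; `train` here is a hypothetical placeholder for a real MXNet training loop):

```python
import random
import struct

def train(seed, steps=100):
    # Placeholder "training loop": fully deterministic given the seed.
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        w += rng.gauss(0.0, 1.0) * 0.01
    return w

w1, w2 = train(seed=7), train(seed=7)
# Compare the raw IEEE-754 bytes, not an approximate tolerance:
same = struct.pack("<d", w1) == struct.pack("<d", w2)
print(same)  # True
```

A real check would serialize the trained parameter arrays and compare the byte strings the same way.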
> > > > > > > > 
> > > > > > > > On November 18, 2019 at 6:32:07 PM, Tao Lv
> > > > > > > > (mutouorz@gmail.com)
> > > > > > wrote:
> > > > > > > > Regarding the cases listed by Marco:
> > > > > > > > - AMD CPU
> > > > > > > > From my architecture knowledge, what works on C4 instances
> > > > > > > > (with AVX2 support) should also work well on m5a, right? I
> > > > > > > > think the mxnet-mkl and mxnet-cuxxmkl packages have been
> > > > > > > > fully validated on AVX2 machines.
> > > > > > > > Also, we didn't perform any validation on AMD CPUs before;
> > > > > > > > why do we need to do that this time?
> > > > > > > > 
> > > > > > > > - ARM CPU
> > > > > > > > I don't know that we're releasing any convenience binaries
> > > > > > > > for ARM CPUs. This proposal mainly targets the pypi packages.
> > > > > > > > 
> > > > > > > > - Windows
> > > > > > > > Already validated by CI. We're also releasing mxnet-mkl
> > > > > > > > packages for Windows.
> > > > > > > > 
> > > > > > > > - GPU and MKLDNN enabled
> > > > > > > > Already validated by CI, and mxnet-cuxxmkl packages have
> > > > > > > > been released for several versions.
> > > > > > > > 
> > > > > > > > - Fully reproducible results (medical and financial sector
> > > > > > > > requested that and we have some flags for cuda)
> > > > > > > > Not sure I understand this case. We have already had the
> > > > > > > > MKL-DNN backend for a while. Its functionality and
> > > > > > > > correctness have been verified by MXNet users.
> > > > > > > > 
> > > > > > > > -tao
> > > > > > > > 
> > > > > > > > On Tue, Nov 19, 2019 at 4:41 AM Marco de Abreu
> > > > > > > > <marco.g.abreu@gmail.com>
> > > > > > > > wrote:
> > > > > > > > 
> > > > > > > > > Sorry, my intent with the "non-standard" phrase was not
> > > > > > > > > about MXNet in general but rather from MKLDNN's point of
> > > > > > > > > view: considering that it's being developed by Intel, I
> > > > > > > > > assumed that MKLDNN might consider non-Intel use cases
> > > > > > > > > non-standard.
> > > > > > > > > 
> > > > > > > > > -Marco
> > > > > > > > > 
> > > > > > > > > Skalicky, Sam <sskalic@amazon.com.invalid> schrieb am
> > > > > > > > > Mo., 18. Nov. 2019, 21:34:
> > > > > > > > > 
> > > > > > > > > > Thanks Alfredo. If you can create a GitHub issue with
> > > > > > > > > > notes/steps, we can add this to the todo list for
> > > > > > > > > > integrating with the MXNet CI to test on m5a instances
> > > > > > > > > > too. Then we can start tracking this on a regular basis.
> > > > > > > > > > It would be great to actually test on ARM instances now
> > > > > > > > > > that AWS has A1 instances too….. I'll add it to the wish
> > > > > > > > > > list ;-D
> > > > > > > > > > 
> > > > > > > > > > Sam
> > > > > > > > > > 
> > > > > > > > > > > On Nov 18, 2019, at 12:32 PM, Alfredo Luque
> > > > > > > > > > > <alfredo.luque@airbnb.com.INVALID> wrote:
> > > > > > > > > > > Happy to run some benchmarks on an AWS m5a instance
> > > > > > > > > > > (Epyc) and a first-generation AMD Threadripper if
> > > > > > > > > > > someone has something easy to run and representative.
> > > > > > > > > > > 
> > > > > > > > > > > On November 18, 2019 at 12:29:31 PM, Skalicky, Sam
> > > > > > > > > > > (sskalic@amazon.com.invalid) wrote:
> > > > > > > > > > > 
> > > > > > > > > > > That's a good idea, Alfredo. Are you able to help
> > > > > > > > > > > test on AMD CPUs? Or is there someone else in the
> > > > > > > > > > > mxnet dev@ community who can help?
> > > > > > > > > > > Sam
> > > > > > > > > > > 
> > > > > > > > > > > > On Nov 18, 2019, at 12:27 PM, Alfredo Luque
> > > > > > > > > > > > <alfredo.luque@airbnb.com.INVALID> wrote:
> > > > > > > > > > > > Verifying that there isn’t a slowdown on AMD CPUs
> > > > > > > > > > > > (e.g., Ryzen / Epyc) would definitely make sense as
> > > > > > > > > > > > a requirement. It seems odd to classify that as a
> > > > > > > > > > > > “non-standard” use case.
> > > > > > > > > > > > 
> > > > > > > > > > > > On November 18, 2019 at 12:20:33 PM, Skalicky, Sam
> > > > > > > > > > > > (sskalic@amazon.com.invalid) wrote:
> > > > > > > > > > > > 
> > > > > > > > > > > > Thanks Patric & team for your work over the years
> > > > > > > > > > > > to make MXNet fast with MKLDNN!
> > > > > > > > > > > > 
> > > > > > > > > > > > I think it would be great to make MKLDNN enabled by
> > > > > > > > > > > > default. We will need to continue producing variants
> > > > > > > > > > > > without MKLDNN for those who don’t want it (Marco
> > > > > > > > > > > > enumerated some use cases). How do you propose to
> > > > > > > > > > > > identify the pip wheels with/without MKLDNN?
> > > > > > > > > > > > Previously we had mxnet-mkl and mxnet-cu101mkl with
> > > > > > > > > > > > MKLDNN. If the plain “mxnet” pip wheel now contains
> > > > > > > > > > > > MKLDNN, what do you propose we call the build
> > > > > > > > > > > > without MKLDNN? mxnet-nomkl?
> > > > > > > > > > > > 
> > > > > > > > > > > > Thanks!
> > > > > > > > > > > > Sam
> > > > > > > > > > > > 
> > > > > > > > > > > > > On Nov 18, 2019, at 11:08 AM, Marco de Abreu
> > > > > > > > > > > > > <marco.g.abreu@gmail.com> wrote:
> > > > > > > > > > > > > Hi Patric,
> > > > > > > > > > > > > 
> > > > > > > > > > > > > First of all, thanks a lot to you and your team
> > > > > > > > > > > > > for all the effort on MXNet and mkldnn!
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Generally I'm inclined towards your proposal, but
> > > > > > > > > > > > > I'm thinking about the non-standard use cases:
> > > > > > > > > > > > > - AMD CPU
> > > > > > > > > > > > > - ARM CPU
> > > > > > > > > > > > > - Windows
> > > > > > > > > > > > > - GPU and MKLDNN enabled
> > > > > > > > > > > > > - Fully reproducible results (medical and
> > > > > > > > > > > > > financial sector requested that and we have some
> > > > > > > > > > > > > flags for cuda)
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Is mkldnn fully compatible with these use cases?
> > > > > > > > > > > > > If not, what would happen? If yes, do we have
> > > > > > > > > > > > > performance numbers?
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Best regards,
> > > > > > > > > > > > > Marco
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Zhao, Patric <patric.zhao@intel.com> schrieb am
> > > > > > > > > > > > > Mo., 18. Nov. 2019, 14:00:
> > > > > > > > > > > > > > Hi MXNet community,
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Since the first MKLDNN backend was integrated
> > > > > > > > > > > > > > in release 1.2, the community has been
> > > > > > > > > > > > > > continuously improving the quality and
> > > > > > > > > > > > > > performance of the MKLDNN CPU backend. Nowadays,
> > > > > > > > > > > > > > the MKLDNN backend is widely used for inference,
> > > > > > > > > > > > > > especially INT8 inference, and we have received
> > > > > > > > > > > > > > lots of very positive feedback from MXNet users.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Achieved milestones:
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > - MKLDNN integrated into Apache MXNet in release
> > > > > > > > > > > > > > 1.2, Feb 2018 [1]
> > > > > > > > > > > > > > - MKLDNN as the default CPU backend when
> > > > > > > > > > > > > > building from source, Jan 2019 [2]
> > > > > > > > > > > > > > - MKLDNN subgraph optimization as default for
> > > > > > > > > > > > > > inference, Jul 2019 [3]
> > > > > > > > > > > > > > - MKLDNN major version upgrade in release 1.6,
> > > > > > > > > > > > > > Oct 2019 [4]
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > To strengthen Apache MXNet's technical
> > > > > > > > > > > > > > leadership in the industry, I propose to make
> > > > > > > > > > > > > > MKLDNN the default CPU backend in all binary
> > > > > > > > > > > > > > distributions from the next release.
> > > > > > > > > > > > > > The new milestone includes:
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > - Statically link the MKLDNN library into the
> > > > > > > > > > > > > > binary, avoiding version mismatches at runtime
> > > > > > > > > > > > > > [5]
> > > > > > > > > > > > > > - Make the nightly build from master use MKLDNN
> > > > > > > > > > > > > > by default before the 1.7 release
> > > > > > > > > > > > > > - Ship binary distributions with MKLDNN by
> > > > > > > > > > > > > > default from the 1.7 release
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > What will be changed:
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > - mxnet and mxnet-cuXX binaries will be built
> > > > > > > > > > > > > > with MKLDNN=1
> > > > > > > > > > > > > > - mxnet-mkl and mxnet-cuXXmkl will not change in
> > > > > > > > > > > > > > minor releases (1.x); we plan to remove them in
> > > > > > > > > > > > > > the next major release (2.0)
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Suggestions and comments are highly appreciated.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > --Patric
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > [1] https://github.com/apache/incubator-mxnet/pull/9677
> > > > > > > > > > > > > > [2] https://lists.apache.org/thread.html/bfeae6ee46374112eb4dff1470c262959101e4bffb19930926963535@%3Cdev.mxnet.apache.org%3E
> > > > > > > > > > > > > > [3] https://github.com/apache/incubator-mxnet/pull/15518
> > > > > > > > > > > > > > [4] https://lists.apache.org/thread.html/f46ab920f18795496eafe713e6e9e561c684e06189085cec17b401dc@%3Cdev.mxnet.apache.org%3E
> > > > > > > > > > > > > > [5] https://github.com/apache/incubator-mxnet/pull/16731
> > > > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > —
> > > > > > > > > > > > Alfredo Luque
> > > > > > > > > > > > Software Engineer
> > > > > > > > > > > > Machine Learning Infrastructure, Airbnb
> > > > > > > > > > > > San Francisco, CA
> > > > > > > > > > > 
> > > > > > > > > > > —
> > > > > > > > > > > Alfredo Luque
> > > > > > > > > > > Software Engineer
> > > > > > > > > > > Machine Learning Infrastructure, Airbnb
> > > > > > > > > > > San Francisco, CA
> > > > > > > > 
> > > > > > > > —
> > > > > > > > Alfredo Luque
> > > > > > > > Software Engineer
> > > > > > > > Machine Learning Infrastructure, Airbnb
> > > > > > > > San Francisco, CA
> > > > > > > > 