mxnet-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Marco de Abreu <marco.g.ab...@googlemail.com.INVALID>
Subject Re: MKLDNN performance in CI
Date Fri, 23 Nov 2018 02:30:58 GMT
Okay, that makes sense. Thanks for the very detailed explanation!

Don't worry about speed in CI, especially the debug flag might be the
reason here. But I'm happy to leave it in since the task of CI is accuracy
and detailed reporting, but not speed :)

I just wanted to give a quick heads up in case this could be an indicator.
I'm looking forward to the 1:1 comparisons, but besides that there's no
further action required. Thanks for your efforts!

Best regards,
Marco

Am Fr., 23. Nov. 2018, 03:25 hat Lv, Tao A <tao.a.lv@intel.com> geschrieben:

> I think yes, except the cpp test.
>
> -----Original Message-----
> From: Zhao, Patric [mailto:patric.zhao@intel.com]
> Sent: Friday, November 23, 2018 10:06 AM
> To: dev@mxnet.incubator.apache.org
> Subject: RE: MKLDNN performance in CI
>
> Good point, Tao!
> Is this env enabled in all MKL-DNN CI?
>
> > -----Original Message-----
> > From: Lv, Tao A [mailto:tao.a.lv@intel.com]
> > Sent: Friday, November 23, 2018 9:53 AM
> > To: dev@mxnet.incubator.apache.org
> > Subject: RE: MKLDNN performance in CI
> >
> > Thanks for bringing this up, Marco. It's really weird since most of
> > those tests listed in "worth noting" are not related to mkldnn backend.
> >
> > I can understand that some tests for mkldnn operator may be slower
> > because MXNET_MKLDNN_DEBUG is enabled in the CI:
> > https://github.com/apache/incubator-
> > mxnet/blob/master/ci/docker/runtime_functions.sh#L713
> >
> > -----Original Message-----
> > From: Marco de Abreu [mailto:marco.g.abreu@googlemail.com.INVALID]
> > Sent: Friday, November 23, 2018 9:22 AM
> > To: dev@mxnet.incubator.apache.org
> > Subject: MKLDNN performance in CI
> >
> > Hello,
> >
> > I have noticed that our Python tests have been increasing in duration
> recently.
> > In order to analyse this further, I created the PR [1] which allows to
> > record test durations. Please note that I did not dive deep on these
> > numbers and that they have to be taken with a grain of salt since
> > slaves have varying resource utilizations.
> >
> > Please have a look at the two following logs:
> > Python3 CPU MKLDNN:
> > http://jenkins.mxnet-ci.amazon-
> > ml.com/blue/rest/organizations/jenkins/pipelines/mxnet-
> > validation/pipelines/unix-cpu/branches/PR-
> > 13377/runs/2/nodes/155/steps/409/log/?start=0
> > Python3 CPU Openblas:
> > http://jenkins.mxnet-ci.amazon-
> > ml.com/blue/rest/organizations/jenkins/pipelines/mxnet-
> > validation/pipelines/unix-cpu/branches/PR-
> > 13377/runs/2/nodes/152/steps/398/log/?start=0
> >
> > If you scroll to the end (note that there are multiple test stages and
> > summaries being printed in these logs), you will find the following
> > statements:
> >
> > Python3 CPU MKLDNN: "Ran 702 tests in 3042.102s"
> > Python3 CPU Openblas: "Ran 702 tests in 2158.458s"
> >
> > This shows that the MKLDNN is generally being about 40% slower than
> > the Openblas backend. If we go into the details, we can see that some
> > tests are significantly slower:
> >
> > Python3 CPU MKLDNN:
> >
> > >[success] 20.78% test_random.test_shuffle: 630.7165s [success] 17.79%
> > >test_sparse_operator.test_elemwise_binary_ops: 540.0487s [success]
> > >10.91% test_gluon_model_zoo.test_models: 331.1503s [success] 2.62%
> > >test_operator.test_broadcast_binary_op: 79.4556s [success] 2.45%
> > >test_operator.test_pick: 74.4041s [success] 2.39%
> > >test_metric_perf.test_metric_performance: 72.5445s [success] 2.38%
> > >test_random.test_negative_binomial_generator: 72.1751s [success]
> > >1.84%
> > >test_operator.test_psroipooling: 55.9432s [success] 1.78%
> > >test_random.test_poisson_generator: 54.0104s [success] 1.72%
> > >test_gluon.test_slice_pooling2d_slice_pooling2d: 52.3447s [success]
> > >1.60% test_contrib_control_flow.test_cond: 48.6977s [success] 1.41%
> > >test_random.test_random: 42.8712s [success] 1.03%
> > >test_operator.test_layer_norm: 31.1242s
> >
> >
> > Python3 CPU Openblas:
> > > [success] 26.20% test_gluon_model_zoo.test_models: 563.3366s
> > > [success] 4.34% test_random.test_shuffle: 93.3157s [success] 4.31%
> > > test_random.test_negative_binomial_generator: 92.6899s [success]
> > > 3.78%
> > > test_sparse_operator.test_elemwise_binary_ops: 81.2048s  [success]
> > > 3.30% test_operator.test_psroipooling: 70.9090s  [success] 3.20%
> > > test_random.test_poisson_generator: 68.7500s  [success] 3.10%
> > > test_metric_perf.test_metric_performance: 66.6085s  [success] 2.79%
> > > test_operator.test_layer_norm: 59.9566s  [success] 2.66%
> > > test_gluon.test_slice_pooling2d_slice_pooling2d: 57.1887s  [success]
> > > 2.62% test_operator.test_pick: 56.2312s  [success] 2.60%
> > > test_random.test_random: 55.8920s  [success] 2.19%
> > > test_operator.test_broadcast_binary_op: 47.1879s [success] 0.96%
> > > test_contrib_control_flow.test_cond: 20.6908s
> >
> > Tests worth noting:
> > - test_random.test_shuffle: 700% increase - but I don't know how this
> > may be related to MKLDNN. Are we doing random number generation in
> > either of those backends?
> > - test_sparse_operator.test_elemwise_binary_ops: 700% increase
> > - test_gluon_model_zoo.test_models: 40% decrease - that's awesome and
> > to be expect :)
> > - test_operator.test_broadcast_binary_op: 80% increase
> > - test_contrib_control_flow.test_cond: 250% increase
> > - test_operator.test_layer_norm: 50% decrease - nice!
> >
> > As I have stated previously, these numbers might not mean anything
> > since the CI is not a benchmarking environment (sorry if these are
> > false negatives), but I thought it might be worth mentioning so Intel
> > could follow up and dive deeper.
> >
> > Does anybody here create 1:1 operator comparisons (e.g. running
> > layer_norm in the different backends to compare the performance) who
> > could provide us with those numbers?
> >
> > Best regards,
> > Marco
> >
> > [1]: https://github.com/apache/incubator-mxnet/pull/13377
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message