mxnet-dev mailing list archives

From Lin Yuan <apefor...@gmail.com>
Subject Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1 patch release
Date Tue, 06 Nov 2018 23:28:52 GMT
Hi Anton,

Thanks for helping with the release.
The following PRs are needed by customers who want to use deterministic
CUDNN convolution algorithms:

https://github.com/apache/incubator-mxnet/pull/12992
https://github.com/apache/incubator-mxnet/pull/13049

Thanks!

Lin


On Tue, Nov 6, 2018 at 1:51 PM Aaron Markham <aaron.s.markham@gmail.com>
wrote:

> Hi Anton,
> I have the following suggestions for fixes to include in 1.3.1. These each
> have updates to files that will impact docs generation for the 1.3.x
> version of the website's Python API docs:
>
> https://github.com/apache/incubator-mxnet/pull/12879
> https://github.com/apache/incubator-mxnet/pull/12871
> https://github.com/apache/incubator-mxnet/pull/12856
>
> Thanks,
> Aaron
>
> On Tue, Nov 6, 2018 at 1:29 PM Lai Wei <royweilai@gmail.com> wrote:
>
> > Hi Anton,
> >
> > Thanks for driving this. I would like to include the following fix in
> > 1.3.1:
> > Allow infer shape partial on foreach operator:
> > https://github.com/apache/incubator-mxnet/pull/12471
> >
> > Keras-MXNet needs this functionality to infer shape partially
> > on the foreach operator (used in RNN operators).
> >
> > Thanks a lot!
> >
> >
> > Best Regards
> > Lai Wei
> >
> >
> >
> > On Tue, Nov 6, 2018 at 10:44 AM Haibin Lin <haibin.lin.aws@gmail.com>
> > wrote:
> >
> > > Hi Naveen and Anton,
> > >
> > > Thanks for pointing that out. You are right that these are not critical
> > > fixes. Putting them in 1.4.0 is more appropriate. PRs are closed.
> > >
> > > Best,
> > > Haibin
> > >
> > > On Tue, Nov 6, 2018 at 7:35 AM Naveen Swamy <mnnaveen@gmail.com> wrote:
> > >
> > > > Please note that this is a patch release (1.3.1) to address critical
> > > > bugs! For everything else, please wait for 1.4.0, which is planned very
> > > > shortly after 1.3.1.
> > > >
> > > > > On Nov 6, 2018, at 7:17 AM, Anton Chernov <mechernov@gmail.com> wrote:
> > > > >
> > > > > > The following PRs have been created so far:
> > > > >
> > > > > Infer dtype in SymbolBlock import from input symbol (v1.3.x)
> > > > > https://github.com/apache/incubator-mxnet/pull/13117
> > > > >
> > > > > [MXNET-953] Fix oob memory read (v1.3.x)
> > > > > https://github.com/apache/incubator-mxnet/pull/13118
> > > > >
> > > > > [MXNET-969] Fix buffer overflow in RNNOp (v1.3.x)
> > > > > https://github.com/apache/incubator-mxnet/pull/13119
> > > > >
> > > > > [MXNET-922] Fix memleak in profiler (v1.3.x)
> > > > > https://github.com/apache/incubator-mxnet/pull/13120
> > > > >
> > > > > > Set correct update on kvstore flag in dist_device_sync mode (v1.3.x)
> > > > > https://github.com/apache/incubator-mxnet/pull/13121
> > > > >
> > > > > update mshadow (v1.3.x)
> > > > > https://github.com/apache/incubator-mxnet/pull/13122
> > > > >
> > > > > CudnnFind() usage improvements (v1.3.x)
> > > > > https://github.com/apache/incubator-mxnet/pull/13123
> > > > >
> > > > > > Fix lazy record io when used with dataloader and multi_worker > 0 (v1.3.x)
> > > > > https://github.com/apache/incubator-mxnet/pull/13124
> > > > >
> > > > >
> > > > > > As stated previously, I would be rather opposed to having the following
> > > > > > PRs in the patch release:
> > > > >
> > > > > Gluon LSTM Projection and Clipping Support (#13055) v1.3.x
> > > > > https://github.com/apache/incubator-mxnet/pull/13129
> > > > >
> > > > > sample_like operators (#13034) v1.3.x
> > > > > https://github.com/apache/incubator-mxnet/pull/13130
> > > > >
> > > > >
> > > > > Best
> > > > > Anton
> > > > >
> > > > > > On Tue, Nov 6, 2018 at 16:06, Anton Chernov <mechernov@gmail.com> wrote:
> > > > >
> > > > >> Hi Haibin,
> > > > >>
> > > > > >> I have a few comments regarding the proposed performance
> > > > > >> improvement changes.
> > > > >>
> > > > >> CUDNN support for LSTM with projection & clipping
> > > > >> https://github.com/apache/incubator-mxnet/pull/13056
> > > > >>
> > > > > >> There is no doubt that this change brings value, but I don't see
> > > > > >> it as a critical bug fix. I would rather leave it for the next
> > > > > >> major release.
> > > > >>
> > > > >> sample_like operators
> > > > >> https://github.com/apache/incubator-mxnet/pull/13034
> > > > >>
> > > > > >> Even if it's related to performance, this is an addition of
> > > > > >> functionality, and I would also push this to the next major
> > > > > >> release only.
> > > > >>
> > > > >>
> > > > >> Best
> > > > >> Anton
> > > > >>
> > > > >>
> > > > > >> On Tue, Nov 6, 2018 at 15:55, Anton Chernov <mechernov@gmail.com> wrote:
> > > > >>
> > > > >>> Hi Patric,
> > > > >>>
> > > > > >>> This change was listed in the 'PR candidates suggested for
> > > > > >>> consideration for v1.3.1 patch release' section [1].
> > > > >>>
> > > > > >>> You are right. I also think that this is not a critical hotfix
> > > > > >>> change that should be included in the 1.3.1 patch release.
> > > > >>>
> > > > >>> Thus I'm not making any further efforts to bring it in.
> > > > >>>
> > > > >>> Best
> > > > >>> Anton
> > > > >>>
> > > > >>> [1]
> > > > >>>
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release#PR_candidates
> > > > >>>
> > > > >>>
> > > > > >>> On Tue, Nov 6, 2018 at 1:14, Zhao, Patric <patric.zhao@intel.com> wrote:
> > > > >>>
> > > > >>>> Hi Anton,
> > > > >>>>
> > > > >>>> Thanks for looking into the MKL-DNN PR.
> > > > >>>>
> > > > > >>>> As I understand from the cwiki
> > > > > >>>> (https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+for+next+MXNet+Release),
> > > > > >>>> these features will go into 1.4 rather than the 1.3.1 patch release.
> > > > >>>>
> > > > >>>> Feel free to correct me :)
> > > > >>>>
> > > > >>>> Thanks,
> > > > >>>>
> > > > >>>> --Patric
> > > > >>>>
> > > > >>>>> -----Original Message-----
> > > > >>>>> From: Anton Chernov [mailto:mechernov@gmail.com]
> > > > >>>>> Sent: Tuesday, November 6, 2018 3:11 AM
> > > > >>>>> To: dev@mxnet.apache.org
> > > > > >>>>> Subject: Re: [Announce] Upcoming Apache MXNet (incubating) 1.3.1
> > > > > >>>>> patch release
> > > > >>>>>
> > > > > >>>>> It seems that there is a problem porting the following changes
> > > > > >>>>> to the v1.3.x release branch:
> > > > >>>>>
> > > > >>>>> Implement mkldnn convolution fusion and quantization
> > > > >>>>> https://github.com/apache/incubator-mxnet/pull/12530
> > > > >>>>>
> > > > >>>>> MKL-DNN Quantization Examples and README
> > > > >>>>> https://github.com/apache/incubator-mxnet/pull/12808
> > > > >>>>>
> > > > >>>>> The bases are different.
> > > > >>>>>
> > > > > >>>>> I would need help from the authors of these changes to make a
> > > > > >>>>> backport PR.
> > > > >>>>>
> > > > > >>>>> @ZhennanQin, @xinyu-intel would you be able to assist me and
> > > > > >>>>> create the corresponding PRs?
> > > > >>>>>
> > > > > >>>>> Without proper history and domain knowledge, I would not be able
> > > > > >>>>> to create them on my own in a reasonable amount of time, I'm
> > > > > >>>>> afraid.
> > > > >>>>>
> > > > >>>>> Best regards,
> > > > >>>>> Anton
> > > > >>>>>
> > > > > >>>>> On Mon, Nov 5, 2018 at 19:45, Anton Chernov <mechernov@gmail.com> wrote:
> > > > >>>>>
> > > > >>>>>>
> > > > >>>>>> As part of:
> > > > >>>>>>
> > > > >>>>>> Implement mkldnn convolution fusion and quantization
> > > > >>>>>> https://github.com/apache/incubator-mxnet/pull/12530
> > > > >>>>>>
> > > > > >>>>>> I propose to add the examples and documentation PR as well:
> > > > >>>>>>
> > > > >>>>>> MKL-DNN Quantization Examples and README
> > > > >>>>>> https://github.com/apache/incubator-mxnet/pull/12808
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> Best regards,
> > > > >>>>>> Anton
> > > > >>>>>>
> > > > > >>>>>> On Mon, Nov 5, 2018 at 19:02, Anton Chernov <mechernov@gmail.com> wrote:
> > > > >>>>>>
> > > > >>>>>>> Dear MXNet community,
> > > > >>>>>>>
> > > > > >>>>>>> I will be the release manager for the upcoming 1.3.1 patch
> > > > > >>>>>>> release. Naveen will be co-managing the release and providing
> > > > > >>>>>>> help from the committers side.
> > > > >>>>>>>
> > > > >>>>>>> The following dates have been set:
> > > > >>>>>>>
> > > > >>>>>>> Code Freeze: 31st October 2018
> > > > >>>>>>> Release published: 13th November 2018
> > > > >>>>>>>
> > > > >>>>>>> Release notes have been drafted here [1].
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> * Known issues
> > > > >>>>>>>
> > > > >>>>>>> Update MKL-DNN dependency
> > > > >>>>>>> https://github.com/apache/incubator-mxnet/pull/12953
> > > > >>>>>>>
> > > > > >>>>>>> This PR hasn't even been merged to master yet. It requires
> > > > > >>>>>>> additional discussion and merge.
> > > > >>>>>>>
> > > > >>>>>>> distributed kvstore bug in MXNet
> > > > >>>>>>> https://github.com/apache/incubator-mxnet/issues/12713
> > > > >>>>>>>
> > > > > >>>>>>>> When distributed kvstore is used, by default gluon.Trainer
> > > > > >>>>>>>> doesn't work with mx.optimizer.LRScheduler if a worker has
> > > > > >>>>>>>> more than 1 GPU. To be more specific, the trainer updates
> > > > > >>>>>>>> once per GPU, the LRScheduler object is shared across GPUs
> > > > > >>>>>>>> and gets a wrong update count.
> > > > >>>>>>>
> > > > >>>>>>> This needs to be fixed. [6]
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> * Changes
> > > > >>>>>>>
> > > > > >>>>>>> The following changes will be ported to the release branch,
> > > > > >>>>>>> per [2]:
> > > > > >>>>>>>
> > > > > >>>>>>> Infer dtype in SymbolBlock import from input symbol [3]
> > > > >>>>>>> https://github.com/apache/incubator-mxnet/pull/12412
> > > > >>>>>>>
> > > > >>>>>>> [MXNET-953] Fix oob memory read
> > > > >>>>>>> https://github.com/apache/incubator-mxnet/pull/12631
> > > > >>>>>>>
> > > > >>>>>>> [MXNET-969] Fix buffer overflow in RNNOp
> > > > >>>>>>> https://github.com/apache/incubator-mxnet/pull/12603
> > > > >>>>>>>
> > > > >>>>>>> [MXNET-922] Fix memleak in profiler
> > > > >>>>>>> https://github.com/apache/incubator-mxnet/pull/12499
> > > > >>>>>>>
> > > > > >>>>>>> Implement mkldnn convolution fusion and quantization (MXNet
> > > > > >>>>>>> Graph Optimization and Quantization based on subgraph and
> > > > > >>>>>>> MKL-DNN proposal [4])
> > > > >>>>>>> https://github.com/apache/incubator-mxnet/pull/12530
> > > > >>>>>>>
> > > > > >>>>>>> The following items (test cases) should already be part of 1.3.0:
> > > > >>>>>>>
> > > > > >>>>>>> [MXNET-486] Create CPP test for concat MKLDNN operator
> > > > >>>>>>> https://github.com/apache/incubator-mxnet/pull/11371
> > > > >>>>>>>
> > > > >>>>>>> [MXNET-489] MKLDNN Pool test
> > > > >>>>>>> https://github.com/apache/incubator-mxnet/pull/11608
> > > > >>>>>>>
> > > > >>>>>>> [MXNET-484] MKLDNN C++ test for LRN operator
> > > > >>>>>>> https://github.com/apache/incubator-mxnet/pull/11831
> > > > >>>>>>>
> > > > >>>>>>> [MXNET-546] Add unit test for MKLDNNSum
> > > > >>>>>>> https://github.com/apache/incubator-mxnet/pull/11272
> > > > >>>>>>>
> > > > >>>>>>> [MXNET-498] Test MKLDNN backward operators
> > > > >>>>>>> https://github.com/apache/incubator-mxnet/pull/11232
> > > > >>>>>>>
> > > > > >>>>>>> [MXNET-500] Test cases improvement for MKLDNN on Gluon
> > > > >>>>>>> https://github.com/apache/incubator-mxnet/pull/10921
> > > > >>>>>>>
> > > > > >>>>>>> Set correct update on kvstore flag in dist_device_sync mode
> > > > > >>>>>>> (as part of fixing [5])
> > > > >>>>>>> https://github.com/apache/incubator-mxnet/pull/12786
> > > > >>>>>>>
> > > > >>>>>>> upgrade mshadow version
> > > > >>>>>>> https://github.com/apache/incubator-mxnet/pull/12692
> > > > >>>>>>> But another PR will be used instead:
> > > > >>>>>>> update mshadow
> > > > >>>>>>> https://github.com/apache/incubator-mxnet/pull/12674
> > > > >>>>>>>
> > > > >>>>>>> CudnnFind() usage improvements
> > > > >>>>>>> https://github.com/apache/incubator-mxnet/pull/12804
> > > > > >>>>>>> A critical CUDNN fix that reduces GPU memory consumption and
> > > > > >>>>>>> addresses this memory leak issue. This is an important fix to
> > > > > >>>>>>> include in 1.3.1.
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> From discussion about gluon toolkits:
> > > > >>>>>>>
> > > > >>>>>>> disable opencv threading for forked process
> > > > >>>>>>> https://github.com/apache/incubator-mxnet/pull/12025
> > > > >>>>>>>
> > > > > >>>>>>> Fix lazy record io when used with dataloader and multi_worker > 0
> > > > >>>>>>> https://github.com/apache/incubator-mxnet/pull/12554
> > > > >>>>>>>
> > > > > >>>>>>> fix potential floating number overflow, enable float16
> > > > >>>>>>> https://github.com/apache/incubator-mxnet/pull/12118
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> * Resolved issues
> > > > >>>>>>>
> > > > > >>>>>>> MxNet 1.2.1–module get_outputs()
> > > > > >>>>>>> https://discuss.mxnet.io/t/mxnet-1-2-1-module-get-outputs/1882
> > > > >>>>>>>
> > > > > >>>>>>> As far as I can see from the comments, the issue has been
> > > > > >>>>>>> resolved; no actions need to be taken for this release. [7] is
> > > > > >>>>>>> mentioned in this regard, but I don't see any action points
> > > > > >>>>>>> here either.
> > > > >>>>>>>
> > > > >>>>>>>
> > > > > >>>>>>> With Naveen's help, I will start porting the mentioned PRs to
> > > > > >>>>>>> the 1.3.x branch.
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> Best regards,
> > > > >>>>>>> Anton
> > > > >>>>>>>
> > > > >>>>>>> [1] https://cwiki.apache.org/confluence/x/eZGzBQ
> > > > >>>>>>> [2]
> > > > >>>>>>>
> > > > >>>>
> > > https://cwiki.apache.org/confluence/display/MXNET/Project+Proposals+f
> > > > >>>>>>> or+next+MXNet+Release [3]
> > > > >>>>>>> https://github.com/apache/incubator-mxnet/issues/11849
> > > > >>>>>>> [4]
> > > > >>>>>>>
> > > > >>>>>
> > > >
> https://cwiki.apache.org/confluence/display/MXNET/MXNet+Graph+Optimiz
> > > > >>>>>>> ation+and+Quantization+based+on+subgraph+and+MKL-DNN
> > > > >>>>>>> [5] https://github.com/apache/incubator-mxnet/issues/12713
> > > > >>>>>>> [6]
> > > > >>>>>>> https://github.com/apache/incubator-
> > > > >>>>> mxnet/issues/12713#issuecomment-4
> > > > >>>>>>> 35773777 [7]
> > > https://github.com/apache/incubator-mxnet/pull/11005
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>
> > > > >>>
> > > >
> > >
> >
>
