mxnet-dev mailing list archives

From Xingjian SHI <xsh...@connect.ust.hk>
Subject Re: [VOTE] Release Apache MXNet (incubating) version 1.7.0.rc0
Date Fri, 10 Jul 2020 20:27:13 GMT
Thanks Ziyi,

I ran into the same issue when trying to use AutoGluon with 1.7.0rc0 and would like to share
my findings:

Basically, I don't think Gluon Block is designed to be picklable. But pickling does work in
some cases in the old version:

I've included two cases in the gist (https://gist.github.com/sxjscience/944066c82e566f1b89b01fa226678890).

- Case 1: we construct a Gluon block, hybridize it, and feed it one NDArray to help initialize
the block. After that, it is no longer picklable.
- Case 2: we just construct a Gluon block. It is picklable in 1.6.0 but not picklable
in 1.7.0.
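
As a rough illustration of why a block can stop being picklable once it holds internal state, here is a minimal standard-library sketch. It does not use MXNet: `ToyBlock`, `hybridize_and_run`, and `is_picklable` are hypothetical names, and a `threading.Lock` stands in for whatever native handle a real block might cache. The actual cause inside Gluon may well differ.

```python
import pickle
import threading


class ToyBlock:
    """Hypothetical stand-in for a Gluon block; this is not MXNet code."""

    def __init__(self, units):
        self.units = units
        self._cached_op = None  # filled in lazily, like a cached graph

    def hybridize_and_run(self):
        # Assumption: a lock plays the role of a native handle that
        # pickle does not know how to serialize.
        self._cached_op = threading.Lock()


def is_picklable(obj):
    """Return True if pickle.dumps succeeds on obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        return False


fresh = ToyBlock(units=4)   # analogous to Case 2: only constructed
used = ToyBlock(units=4)    # analogous to Case 1: constructed, then run once
used.hybridize_and_run()

print(is_picklable(fresh))
print(is_picklable(used))
```

In this toy setup the freshly constructed object pickles and the one holding the handle does not, which mirrors the shape of Case 1. It says nothing about why Case 2 changed between 1.6.0 and 1.7.0.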

Thus, the real question is: should we support pickling a Gluon Block? If not, should we support
combining multiprocessing.pool with the Gluon Block? For reference, PyTorch supports pickling
nn.Module, as shown in https://gist.github.com/sxjscience/90b812a66d445e759c55eedc3ef93668
and in the docs (https://pytorch.org/tutorials/beginner/saving_loading_models.html).
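
To make the question concrete: multiprocessing sends objects to worker processes by pickling them, so anything passed to a pool has to survive a pickle round trip. One generic Python way to support that is to drop unpicklable members in `__getstate__` and rebuild them lazily. The sketch below uses only the standard library. `PicklableToyBlock` is a hypothetical stand-in, not a Gluon class, and I am not claiming this is how MXNet or PyTorch does it.

```python
import pickle
import threading


class PicklableToyBlock:
    """Hypothetical stand-in for a Gluon block; this is not MXNet code."""

    def __init__(self, units):
        self.units = units
        self._cached_op = None

    def forward(self, x):
        if self._cached_op is None:
            # Assumption: a lock stands in for an unpicklable native
            # handle that can be rebuilt on first use.
            self._cached_op = threading.Lock()
        with self._cached_op:
            return x * self.units

    def __getstate__(self):
        # Copy the attributes and drop the handle before pickling.
        state = self.__dict__.copy()
        state["_cached_op"] = None
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)


block = PicklableToyBlock(units=3)
block.forward(2)  # populates the cached handle

# The same round trip happens when an object is sent to a pool worker.
clone = pickle.loads(pickle.dumps(block))
print(clone.forward(2))
```

The trade-off is that the clone has to rebuild its cache on first use, so this only works when the dropped state can be reconstructed from what remains.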

Best,
Xingjian


On 7/10/20, 11:31 AM, "Patrick Mu" <zm2263@columbia.edu> wrote:

    Hi Ciyong, 

    I just discovered an issue with 1.7 that causes Yolo training with the latest Gluon CV
    Yolo to fail.

    The PR that causes the failure is https://github.com/apache/incubator-mxnet/pull/18358,
    which modifies the basic blocks of Gluon to fix a memory leak issue.

    I talked with Leonard, the author of the PR, and he said he found the root cause, but
    patching that PR would modify those Gluon basic blocks further, which might be risky for
    existing models and various customer models.

    So my two cents is to revert this PR in 1.7 and try patching it in 1.x and 2.0, meaning
    that 1.7 won't have the memory usage optimization from that PR.

    I'd like to hear what you think about this issue.

    Thanks,
    Ziyi


    On 2020/07/10 06:18:02, "Chen, Ciyong" <ciyong.chen@intel.com> wrote: 
    > Hi Community,
    > 
    > I would like to call for action to test/validate/vote for the release candidate (1.7.0.rc0).
    > As there was no voting result during the scheduled time window, I would like to extend
    > the time window to July 13, 23:59:59 PST.
    > Please set aside some time and provide feedback if you've tried the pre-release code base.
    > Thanks!
    > 
    > Best regards,
    > Ciyong
    > 
    > -----Original Message-----
    > From: Chen, Ciyong <ciyong.chen@intel.com> 
    > Sent: Monday, July 6, 2020 10:48 PM
    > To: dev@mxnet.apache.org
    > Cc: Bob Paulin <bob@apache.org>; Henri Yandell <bayard@apache.org>;
    >     Jason Dai <jasondai@apache.org>; Markus Weimer <weimer@apache.org>;
    >     Michael Wall <mjwall@apache.org>
    > Subject: RE: [VOTE] Release Apache MXNet (incubating) version 1.7.0.rc0
    > 
    > For the language bindings and the Windows platform, may I have your support to help
    > verify these features? Thanks!
    > 
    > @lanking520 to help verify the Scala/Java
    > @gigasquid to help verify the Clojure
    > @hetong007 to help verify the R
    > @yajiedesign to help verify the Windows platform
    > 
    > Best regards,
    > Ciyong Chen
    > 
    > -----Original Message-----
    > From: Chen, Ciyong <ciyong.chen@intel.com>
    > Sent: Monday, July 6, 2020 10:39 PM
    > To: dev@mxnet.apache.org
    > Cc: Bob Paulin <bob@apache.org>; Henri Yandell <bayard@apache.org>;
    >     Jason Dai <jasondai@apache.org>; Markus Weimer <weimer@apache.org>;
    >     Michael Wall <mjwall@apache.org>
    > Subject: [VOTE] Release Apache MXNet (incubating) version 1.7.0.rc0
    > 
    > Dear MXNet community,
    > 
    > This is the vote to release Apache MXNet (incubating) version 1.7.0. Voting will
    > start July 6, 23:59:59 PST and close on July 9, 23:59:59 PST.
    > 
    > Link to release notes:
    > https://cwiki.apache.org/confluence/display/MXNET/1.7.0+Release+notes
    > 
    > Link to release candidate:
    > https://github.com/apache/incubator-mxnet/releases/tag/1.7.0.rc0
    > 
    > Link to source and signatures on apache dist server:
    > https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.7.0.rc0/
    > 
    > Please remember to TEST first before voting accordingly:
    > +1 = approve
    > +0 = no opinion
    > -1 = disapprove (provide reason)
    > 
    > Additional notes:
    > 
    >   *   There was an issue and discussion [1] about a few numpy operators failing with
    >       numpy 1.19.0, released on Jun 20, 2020. The failure exists in all branches (the
    >       operators work with numpy <= 1.18.5). As the numpy operators are still an
    >       experimental feature in the 1.7.0 release and mainly target the MXNet 2.0 release,
    >       I decided not to block the vote and instead let the community decide whether this
    >       is a blocker for the release.
    > 
    > [1] https://github.com/apache/incubator-mxnet/issues/18600
    > 
    > Best regards,
    > Ciyong Chen
    > 
    > 
