mxnet-dev mailing list archives

From "Skalicky, Sam" <sska...@amazon.com.INVALID>
Subject Re: [VOTE] Release Apache MXNet (incubating) version 1.8.0.rc0
Date Thu, 01 Oct 2020 04:53:33 GMT
Thanks Leonard for picking up this work. Are you planning to open another PR that commits these
PRs into v1.x too so this doesn’t happen again (if we ever release a 1.9 version)? 

Other than these 2 PRs are there any others that are required for the v1.8.0 release?

https://github.com/apache/incubator-mxnet/pull/19251
https://github.com/apache/incubator-mxnet/pull/19262

Sam

On 9/30/20, 9:03 PM, "Leonard Lausen" <lausen@apache.org> wrote:

    CAUTION: This email originated from outside of the organization. Do not click
    links or open attachments unless you can confirm the sender and know the
    content is safe.



    Thank you Sam for driving the release!

    I took a quick look at the missing commits from v1.7.x in v1.8.x via the git
    cherry mode and applied them to v1.8.x. Please see
    https://github.com/apache/incubator-mxnet/pull/19262

    The missing code in v1.8.x is substantial (+675 −518) and I thus change my vote
    for the rc0 release to -1.

    I hope we can include checking for missing commits via git cherry mode in the
    release manager process going forward. It just takes a few minutes. If we want
    to streamline the process, we can do so by not squashing commits when porting
    from one branch to another, which reduces false positives in git cherry mode
    (commits detected as missing that were actually ported).

    Best regards
    Leonard
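
Illustration of the check described above (a minimal sketch, not part of the
original message). It assumes a local checkout in which both release branches
are available, and it relies only on the documented output format of
`git cherry -v <upstream> <head>`: each commit on <head> is listed with a
leading "+" if no patch-equivalent commit exists on <upstream>, and with "-"
if one does. The function names are hypothetical.

```python
import subprocess


def parse_git_cherry(output):
    """Return (sha, subject) pairs for the '+' lines of `git cherry -v` output.

    '+' marks a commit on <head> with no patch-equivalent commit on
    <upstream>; '-' marks a commit that already has an equivalent there.
    """
    missing = []
    for line in output.splitlines():
        parts = line.split(maxsplit=2)
        if len(parts) >= 2 and parts[0] == "+":
            subject = parts[2] if len(parts) == 3 else ""
            missing.append((parts[1], subject))
    return missing


def missing_commits(upstream, head, repo="."):
    """Run `git cherry -v <upstream> <head>` in `repo` and parse the result."""
    result = subprocess.run(
        ["git", "cherry", "-v", upstream, head],
        cwd=repo,
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_git_cherry(result.stdout)
```

Under these assumptions, missing_commits("v1.8.x", "v1.7.x") would list the
v1.7.x commits that have no equivalent on v1.8.x. Because the comparison is
based on patch content, a commit that was squashed or otherwise modified while
being ported is still reported with "+", which matches the false positives
mentioned in the message.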

    On Wed, 2020-09-30 at 21:52 +0000, Skalicky, Sam wrote:
    > Hi MXNet Community,
    >
    > Quick summary on the status of the vote:
    >
    > 2  +1
    > 1 -0.9
    >
    > I spoke with Leonard offline, and the problem only impacts the specific
    > instance when running MKLDNN/oneDNN immediately after intgemm. We don’t expect
    > users to fall into this specific edge case, and so far the problem hasn’t been
    > reproduced on 1.8.x (even though it contains the same oneDNN and intgemm
    > components that are in the master branch). He proposed to not postpone the
    > release for this issue, but if other issues arise we should fix this one at
    > the same time.
    >
    > There are also still PRs in v1.7.x that were never committed to the v1.x
    > branch, so they are missing from the v1.8.x branch, which was created from
    > v1.x. Unfortunately no one has volunteered to port these to the v1.x and
    > v1.8.x branches.
    >
    > I propose extending the vote until Friday October 2, 23:59:59 PDT to conclude
    > the discussion and get the remaining votes necessary.
    >
    > Thanks!
    > Sam
    >
    > On 9/29/20, 12:41 PM, "Skalicky, Sam" <sskalic@amazon.com.INVALID> wrote:
    >
    >     There was no response from the community on the discussion thread [1]. So
    >     the current state is the same.
    >
    >     [1]
    >     https://lists.apache.org/thread.html/r31d491150029734c6041c1ae21929cd667eed27f590262c3f501c6b7%40%3Cdev.mxnet.apache.org%3E
    >
    >     On 9/29/20, 11:36 AM, "Xingjian SHI" <xshiab@connect.ust.hk> wrote:
    >
    >         Just one question regarding the 1.8.0.rc0. Are all PRs that are in
    >         1.7.0 included in 1.8.0? For example,
    >         https://github.com/apache/incubator-mxnet/pull/18653
    >
    >         Thanks,
    >         Xingjian
    >
    >         On 9/29/20, 10:20 AM, "Leonard Lausen" <lausen@apache.org> wrote:
    >
    >             Thank you Aaron for trying the build and pointing out the issues.
    >
    >             On Mon, 2020-09-28 at 18:30 -0700, Aaron Markham wrote:
    >             > 2) Tried just doing a make. This fails because none of the
    >             > submodules are there. [...]
    >
    >             I downloaded the rc from the link shared by Sam [1] and it does
    >             include the submodules. Could you provide more details on your
    >             issue?
    >
    >             > Downloaded the tar.gz for the release and looked at the build
    >             > from source directions on the website, but these have you use
    >             > cmake and don't really tell you what to do...
    >
    >             The docs refer users to version-controlled files because the
    >             build-from-source guide on the website is shared among all
    >             versions, while the actual build steps differ between versions.
    >             I think the best way to improve it is to provide version-specific
    >             build-from-source instructions via the "version selector" feature
    >             on the get started page. Contributions towards this goal or other
    >             improvements would be great [2].
    >
    >             Thanks
    >             Leonard
    >
    >             [1]: https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
    >             [2]: https://github.com/apache/incubator-mxnet/issues/18666
    >
    >
    > On 9/29/20, 10:09 AM, "Leonard Lausen" <lausen@apache.org> wrote:
    >
    >     Vote -0.9.
    >
    >     Piotr has clarified that onednn 1.6.3 (included in MXNet 1.8 rc0) wrongly
    >     handles zmm registers. Together with the MXNet intgemm feature (also
    >     included in 1.8 rc0) this can yield NaN results if onednn gemm is executed
    >     some time after intgemm. [1]
    >
    >     Thanks
    >     Leonard
    >
    >     [1]: https://github.com/apache/incubator-mxnet/pull/19185#issuecomment-700603056
    >             >
    >             >
    >             > On Mon, Sep 28, 2020 at 11:36 AM Skalicky, Sam
    >             > <sskalic@amazon.com.invalid> wrote:
    >             >
    >             > > Thanks for pointing this out Leonard. Has anyone been able to
    >             > > reproduce the problem on 1.8.0.rc0?
    >             > >
    >             > > Either way, I would propose that we continue validating the
    >             > > release as-is and see if we can find any other issues.
    >             > >
    >             > > Sam
    >             > >
    >             > > On 9/28/20, 10:22 AM, "Leonard Lausen" <lausen@apache.org> wrote:
    >             > >
    >             > >     Thank you Sam for driving the 1.8 release!
    >             > >
    >             > >     As the included oneDNN package is known to produce NaN
    >             > >     results on the master branch [1] and is pending an upstream
    >             > >     fix by Intel, I'd suggest extending the vote until we have
    >             > >     clarity on whether the bug also affects the 1.8 release,
    >             > >     given that oneDNN is enabled in the default configuration [2].
    >             > >
    >             > >     [1]: https://github.com/apache/incubator-mxnet/pull/19185#issuecomment-698274033
    >             > >     [2]: https://lists.apache.org/thread.html/1a22dbd79098adab6d02d16e8d607bae2acc908c0bb1b085d28a51ba@%3Cdev.mxnet.apache.org%3E
    >             > >
    >             > >     On Sun, 2020-09-27 at 18:28 -0700, sandeep krishnamurthy wrote:
    >             > >     > Sam,
    >             > >     >
    >             > >     > Thank you for driving the v1.8.0 release of MXNet. This is
    >             > >     > exciting given it is coming with CUDA11 and cuDNN8!!
    >             > >     >
    >             > >     > Fixing the release candidate link:
    >             > >     > https://github.com/apache/incubator-mxnet/releases/tag/1.8.0.rc0
    >             > >     >
    >             > >     > Best,
    >             > >     > Sandeep
    >             > >     >
    >             > >     >
    >             > >     > On Fri, Sep 25, 2020 at 11:24 PM Skalicky, Sam
    >             > >     > <sskalic@amazon.com.invalid> wrote:
    >             > >     >
    >             > >     > > Dear MXNet community,
    >             > >     > >
    >             > >     > > This is the vote to release Apache MXNet (incubating)
    >             > >     > > version 1.8.0. Voting will start September 26, 23:59:59 PDT
    >             > >     > > and close on September 29, 23:59:59 PDT.
    >             > >     > >
    >             > >     > > Link to release notes:
    >             > >     > > https://cwiki.apache.org/confluence/display/MXNET/1.8.0+Release+Notes
    >             > >     > >
    >             > >     > > Link to release candidate:
    >             > >     > > https://github.com/apache/incubator-mxnet/releases/tag/1.7.0.rc0
    >             > >     > >
    >             > >     > > Link to source and signatures on apache dist server:
    >             > >     > > https://dist.apache.org/repos/dist/dev/incubator/mxnet/1.8.0.rc0/
    >             > >     > >
    >             > >     > > Please remember to TEST first before voting accordingly:
    >             > >     > > +1 = approve
    >             > >     > > +0 = no opinion
    >             > >     > > -1 = disapprove (provide reason)
    >             > >     > >
    >             > >     > > Best regards,
    >             > >     > > Sam Skalicky
    >             > >     > >
    >             > >     > >
    >             > >     >
    >             > >     > --
    >             > >     > Sandeep Krishnamurthy

