mxnet-dev mailing list archives

From: Tong He <hetong...@gmail.com>
Subject: Re: Release blocker: non-determinstic forward in gluon
Date: Fri, 27 Jul 2018 18:10:38 GMT

Hi Sheng,

I also opened an issue on the OpenBLAS repo:
https://github.com/xianyi/OpenBLAS/issues/1700 .

Having been told that "0.3.2 should be released this weekend", I tested their
develop branch as well, and it seems the new version has fixed the bug.

Since OpenBLAS 0.3.2 may also bring performance improvements, I propose that
we wait for OpenBLAS 0.3.2 for our pip post release.
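
For anyone who wants to check a given build, the kind of test involved can be
sketched as below: run the same forward pass repeatedly on a fixed input and
count how often the output differs from the first run. This is only a minimal
illustration, not the script used on the issue. The names count_mismatches,
stable_forward, make_flaky_forward and WEIGHTS are hypothetical, and the toy
forward functions stand in for a real network.

```python
import itertools
from typing import Callable, Sequence, Tuple

# Hypothetical fixed "pretrained" parameters for the toy forward pass.
WEIGHTS = (0.5, -1.25, 2.0)


def stable_forward(x: Sequence[float]) -> Tuple[float, ...]:
    """Toy forward pass: same input and weights always give the same output."""
    return tuple(w * v for w, v in zip(WEIGHTS, x))


def make_flaky_forward() -> Callable[[Sequence[float]], Tuple[float, ...]]:
    """Build a toy forward pass that perturbs its output on every third call."""
    calls = itertools.count()

    def flaky_forward(x: Sequence[float]) -> Tuple[float, ...]:
        out = stable_forward(x)
        if next(calls) % 3 == 2:
            return tuple(v + 1e-6 for v in out)
        return out

    return flaky_forward


def count_mismatches(forward: Callable, x, repeats: int = 10) -> int:
    """Call forward(x) `repeats` times; count runs that differ from the first.

    Outputs are compared with exact equality, since the goal is
    run-to-run reproducibility rather than numerical closeness.
    """
    reference = forward(x)
    return sum(1 for _ in range(repeats - 1) if forward(x) != reference)
```

With a real Gluon network, `forward` could be a small wrapper that returns
something comparable with `==`, for example `net(x).asnumpy().tobytes()`,
assuming `net` and `x` are already loaded. Any non-zero count means the
forward pass is not reproducible on that build.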


Best regards,

Tong He

2018-07-27 10:54 GMT-07:00 Sheng Zha <szha.pvg@gmail.com>:

> Forgot to mention, the post release version is a pip package version.
>
> -sz
>
> > On Jul 27, 2018, at 10:42 AM, Sheng Zha <szha.pvg@gmail.com> wrote:
> >
> > In this case we can regard it as a release problem, which is usually
> > what post release versions are for. It’s still the same release with
> > a different dependency, so there is no code change needed.
> >
> > -sz
> >
> >
> >> On Jul 27, 2018, at 8:31 AM, Steffen Rochel <steffenrochel@gmail.com> wrote:
> >>
> >> Hi Tong - thanks for root causing the problem.
> >> Sheng - what is 1.2.1.post0? Shouldn't a patch with a fix be released as
> >> 1.2.2?
> >> Steffen
> >>
> >>> On Thu, Jul 26, 2018 at 5:33 PM Sheng Zha <szha.pvg@gmail.com> wrote:
> >>>
> >>> Dear users and developers of Apache MXNet (Incubating),
> >>>
> >>> Thanks to Tong's dedication, the root cause for this issue was identified
> >>> to be instability in OpenBLAS's latest stable version 0.3.1. For details,
> >>> see Tong's comment
> >>> <https://github.com/apache/incubator-mxnet/issues/11853#issuecomment-408272772>.
> >>>
> >>> Since both the nightly build and the 1.2.1 wheels are affected, we
> >>> recommend that we stay on OpenBLAS's last known stable version, 0.2.20,
> >>> which we've been using. I will assume lazy consensus and prepare the fix
> >>> (1.2.1.post0).
> >>>
> >>> -sz
> >>>
> >>>> On Tue, Jul 24, 2018 at 3:35 PM, Tong He <the@apache.org> wrote:
> >>>>
> >>>> Recently an issue was reported about inconsistent results from a gluon
> >>>> forward pass:
> >>>>
> >>>> https://github.com/apache/incubator-mxnet/issues/11853
> >>>>
> >>>> Given a constant input image and loaded pretrained parameters, we expect
> >>>> a deterministic output from any number of repeated forward passes.
> >>>> However, from the issue I see that the forward result is
> >>>> non-deterministic. This is harmful because it makes the results of
> >>>> experiments/benchmarks/inference meaningless.
> >>>>
> >>>> Therefore I propose to block the 1.3 release until it is resolved.
> >>>>
> >>>
>