mxnet-dev mailing list archives

From Sheng Zha <szha....@gmail.com>
Subject Re: Release blocker: non-deterministic forward in gluon
Date Fri, 27 Jul 2018 18:30:55 GMT
Tong,

That's great news. I'm glad that OpenBLAS people are responding so quickly.
In that case it's probably a better idea to use that version instead. The
latest OpenBLAS version brings many optimizations for all kinds of hardware.

-sz

On Fri, Jul 27, 2018 at 11:10 AM, Tong He <hetong007@gmail.com> wrote:

> Hi Sheng,
>
> I also opened an issue on OpenBLAS repo:
> https://github.com/xianyi/OpenBLAS/issues/1700 .
>
> As I was informed that "0.3.2 should be released this weekend", I tested their
> develop branch as well, and it seems the new version has fixed the bug.
>
> Since OpenBLAS 0.3.2 could also bring performance improvements, I
> propose to wait for OpenBLAS 0.3.2 for our pip post release.
>
>
> Best regards,
>
> Tong He
>
> 2018-07-27 10:54 GMT-07:00 Sheng Zha <szha.pvg@gmail.com>:
>
> > Forgot to mention, the post release version is a pip package version.
> >
> > -sz
> >
> > > On Jul 27, 2018, at 10:42 AM, Sheng Zha <szha.pvg@gmail.com> wrote:
> > >
> > > In this case we can regard it as a release problem, which is usually
> > > what post release versions are for. It’s still the same release with
> > > a different dependency, so there is no code change needed.
> > >
> > > -sz
> > >
> > >
> > >> On Jul 27, 2018, at 8:31 AM, Steffen Rochel <steffenrochel@gmail.com> wrote:
> > >>
> > >> Hi Tong - thanks for root causing the problem.
> > >> Sheng - what is 1.2.1.post0? Shouldn't a patch with a fix be released as
> > >> 1.2.2?
> > >> Steffen
> > >>
> > >>> On Thu, Jul 26, 2018 at 5:33 PM Sheng Zha <szha.pvg@gmail.com> wrote:
> > >>>
> > >>> Dear users and developers of Apache MXNet (Incubating),
> > >>>
> > >>> Thanks to Tong's dedication, the root cause for this issue was identified
> > >>> to be instability in OpenBLAS's latest stable version 0.3.1. For details,
> > >>> see Tong's comment
> > >>> <https://github.com/apache/incubator-mxnet/issues/11853#issuecomment-408272772>.
> > >>>
> > >>> Since both the nightly build and the 1.2.1 wheels are affected, we
> > >>> recommend that we stay on OpenBLAS's last known stable version, 0.2.20,
> > >>> which we've been using. I will assume lazy consensus and prepare the fix
> > >>> (1.2.1.post0).
> > >>>
> > >>> -sz
> > >>>
> > >>>> On Tue, Jul 24, 2018 at 3:35 PM, Tong He <the@apache.org> wrote:
> > >>>>
> > >>>> Recently there's an issue regarding the inconsistent result from gluon
> > >>>> forward:
> > >>>>
> > >>>> https://github.com/apache/incubator-mxnet/issues/11853
> > >>>>
> > >>>> Given a constant input image and loaded pretrained parameters, we expect
> > >>>> a deterministic output from arbitrary repeats of forwards. However, from
> > >>>> the issue I see that the forwarded result is non-deterministic. It is
> > >>>> harmful as it makes the results from experiments/benchmarks/inference
> > >>>> meaningless.
> > >>>>
> > >>>> Therefore I propose to block the 1.3 release until it gets resolved.
> > >>>>
> > >>>>
> > >>>
> >
>
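The expectation in the original report (constant input, fixed parameters, identical output on every forward pass) can be sketched as a small repeat-and-compare check. This is a minimal illustration only, not code from the issue: `check_deterministic`, `stable_forward`, and `flaky_forward` are hypothetical names, and the stand-in functions work on plain Python lists rather than a Gluon network. For a real model, the outputs would presumably first be converted to something directly comparable, such as NumPy arrays.

```python
import itertools


def check_deterministic(forward, fixed_input, repeats=10):
    """Call forward(fixed_input) `repeats` times; True if all outputs are equal."""
    reference = forward(fixed_input)
    for _ in range(repeats - 1):
        if forward(fixed_input) != reference:
            return False
    return True


def stable_forward(values):
    # Hypothetical stand-in for a well-behaved forward pass.
    return [2.0 * v for v in values]


_calls = itertools.count()


def flaky_forward(values):
    # Hypothetical stand-in that perturbs every second call, imitating
    # a backend whose results vary between runs.
    noise = 1e-6 if next(_calls) % 2 else 0.0
    return [2.0 * v + noise for v in values]


stable_ok = check_deterministic(stable_forward, [1.0, 2.0, 3.0])
flaky_ok = check_deterministic(flaky_forward, [1.0, 2.0, 3.0])
```

The comparison here is exact equality, which matches the claim in the report that repeated forwards on the same input should give the same result.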

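On the question of 1.2.1.post0 versus 1.2.2: under Python's PEP 440 versioning scheme, a post-release sorts after its base release and before the next patch release. The sketch below illustrates that ordering with a deliberately simplified key function. `version_key` is a hypothetical helper that handles only `X.Y.Z` and `X.Y.Z.postN` strings; it is not how pip itself parses versions.

```python
import re

# Matches only plain dotted releases with an optional ".postN" suffix.
_VERSION_RE = re.compile(r"^(\d+(?:\.\d+)*)(?:\.post(\d+))?$")


def version_key(version):
    """Return a sortable key for 'X.Y.Z' or 'X.Y.Z.postN' strings."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unsupported version string: {version!r}")
    release = tuple(int(part) for part in match.group(1).split("."))
    # A plain release sorts before any of its post-releases, so use -1
    # when there is no ".postN" suffix.
    post = int(match.group(2)) if match.group(2) is not None else -1
    return (release, post)


versions = ["1.2.2", "1.2.1.post0", "1.2.1"]
ordered = sorted(versions, key=version_key)
```

In this ordering 1.2.1.post0 sits between 1.2.1 and a hypothetical 1.2.2, which fits the description in the thread of a repackaged release with no code change.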