mxnet-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Roshani Nagmote <roshaninagmo...@gmail.com>
Subject Re: Release blocker? - buggy topk Op
Date Thu, 16 Aug 2018 17:55:24 GMT
Thanks Leonard for raising this issue.
@ciyong Thanks for submitting the fix. I will be tracking mentioned PR
#12202 <https://github.com/apache/incubator-mxnet/pull/12202> for release.

-Roshani

On Thu, Aug 16, 2018 at 6:45 AM Zhao, Patric <patric.zhao@intel.com> wrote:

> Hi Leonard,
>
> Thanks to raising the issue of topk op.
>
> The root cause is from the current API design which used float data type
> to represent the integer index, and as we know, the float type could NOT
> express the large integer precisely.
> (I have no offense. I know I missed some backgrounds and I think the
> current design is very good).
>
> The new CI#12085 changes the computation order and make this issue looks
> more significant. Essentially, the bug will happen when the index is large
> whatever with or without the new CI.
> One line example code can trigger the issue,
> 'print(mx.nd.topk(mx.nd.array(np.arange(256*300096).reshape(8, -1)), k=4))'.
>
> Thus, the real fix is to change the API interface and use INT for the
> index. But it might introduce compatibility issue to current
> framework/topology due to API change.
> I am not sure we need to change in the last minutes of release 1.3
> (actually, we can contribute to it).
>
> Currently, we submitted a fix (#12202) to make the computation order as
> same as before and still much faster :)
>
> Apologies for the confusion and feel free to let us know for any feedback.
>
> Thanks,
>
> --Patric
>
>
> > -----Original Message-----
> > From: Leonard Lausen [mailto:l-software@lausen.nl]
> > Sent: Thursday, August 16, 2018 9:51 AM
> > To: dev@mxnet.incubator.apache.org
> > Subject: Release blocker? - buggy topk Op
> >
> > Recent changes in mxnet master introduced a bug into the topk operator.
> >  Below code example will output [ 274232. 179574. 274233. 274231.] with
> >  mxnet-cu90==1.3.0b20180810 but [ 274232. 179574. 274232. 274232.] with
> > mxnet-cu90==1.3.0b20180814. Likely #12085 is at fault.
> >
> > See https://github.com/apache/incubator-mxnet/issues/12197 for more
> info.
> >
> > I think this should be considered a release blocker for the 1.3 release.
> >
> > Note this breaks some parts of the KDD 18 MXNet / Gluon tutorial which is
> > scheduled for next Tuesday http://www.kdd.org/kdd2018/hands-on-
> > tutorials/view/mxnet-with-a-focus-on-nlp
> > . (We can work around by asking people to install the 0810 version
> > though.)
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message