mxnet-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Zhao, Patric" <patric.z...@intel.com>
Subject RE: Release blocker? - buggy topk Op
Date Thu, 16 Aug 2018 13:44:43 GMT
Hi Leonard,

Thanks to raising the issue of topk op.

The root cause is from the current API design which used float data type to represent the
integer index, and as we know, the float type could NOT express the large integer precisely.
(I have no offense. I know I missed some backgrounds and I think the current design is very
good).

The new CI#12085 changes the computation order and make this issue looks more significant.
Essentially, the bug will happen when the index is large whatever with or without the new
CI. 
One line example code can trigger the issue, 'print(mx.nd.topk(mx.nd.array(np.arange(256*300096).reshape(8,
-1)), k=4))'.

Thus, the real fix is to change the API interface and use INT for the index. But it might
introduce compatibility issue to current framework/topology due to API change.
I am not sure we need to change in the last minutes of release 1.3 (actually, we can contribute
to it).

Currently, we submitted a fix (#12202) to make the computation order as same as before and
still much faster :)

Apologies for the confusion and feel free to let us know for any feedback.

Thanks,

--Patric


> -----Original Message-----
> From: Leonard Lausen [mailto:l-software@lausen.nl]
> Sent: Thursday, August 16, 2018 9:51 AM
> To: dev@mxnet.incubator.apache.org
> Subject: Release blocker? - buggy topk Op
> 
> Recent changes in mxnet master introduced a bug into the topk operator.
>  Below code example will output [ 274232. 179574. 274233. 274231.] with
>  mxnet-cu90==1.3.0b20180810 but [ 274232. 179574. 274232. 274232.] with
> mxnet-cu90==1.3.0b20180814. Likely #12085 is at fault.
> 
> See https://github.com/apache/incubator-mxnet/issues/12197 for more info.
> 
> I think this should be considered a release blocker for the 1.3 release.
> 
> Note this breaks some parts of the KDD 18 MXNet / Gluon tutorial which is
> scheduled for next Tuesday http://www.kdd.org/kdd2018/hands-on-
> tutorials/view/mxnet-with-a-focus-on-nlp
> . (We can work around by asking people to install the 0810 version
> though.)


Mime
View raw message