mxnet-dev mailing list archives

From "Lv, Tao A" <>
Subject RE: A proposal for unified integration with external acceleration libraries
Date Mon, 04 Jun 2018 05:27:55 GMT

Hi Da and other developers,

It's a great idea to limit external acceleration libraries to a certain scope and to subgraphs. I am
not very familiar with the designs of TVM and TensorRT, but from the side of the MKL-DNN backend,
here are my concerns about this proposal:

1. Is subgraph intended for all third-party acceleration libraries, or just for those that use
different data layouts? I guess cuDNN also uses a non-default data layout (say NHWC) for int8,
so does the cuDNN path also need to follow this proposal? I notice that cuDNN is not mentioned
in the proposal.
2. Would subgraph break the execution of the imperative Gluon interfaces? If we don't apply
subgraph to imperative Gluon, does that mean imperative Gluon models cannot benefit from any
acceleration? (A small sketch of the imperative-versus-hybridized distinction follows this list.)
3. Currently, most issues in the MKL-DNN backend come from the interchange between the MXNet
default NDArray and MKL-DNN memory. Even after subgraph is applied to the MKL-DNN backend, there
will still be some fallback paths for inputs that are not supported by MKL-DNN or that are views
of other tensors. So we still need to deal with the layout transformation between MKL-DNN-specific
layouts and the MXNet default layout; we cannot avoid this with the current design of subgraph
(see the fallback sketch below).
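
To make concern 2 concrete, here is a minimal Gluon sketch using the standard MXNet 1.x API. It
only illustrates the imperative-versus-hybridized split; whether a subgraph pass would hook into
hybridize() is exactly the open question above.

    import mxnet as mx
    from mxnet.gluon import nn

    # A small Gluon model; by default it runs imperatively,
    # dispatching one operator at a time with no whole-graph view.
    net = nn.HybridSequential()
    net.add(nn.Dense(64, activation='relu'), nn.Dense(10))
    net.initialize()

    x = mx.nd.random.uniform(shape=(1, 32))
    y_imperative = net(x)   # op-by-op execution

    # hybridize() caches a symbolic graph, the natural point where a
    # subgraph rewrite could apply; purely imperative execution never
    # builds such a graph.
    net.hybridize()
    y_hybrid = net(x)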
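
And a toy sketch of the fallback logic from concern 3, assuming a hypothetical dispatcher;
LayoutArray, SUPPORTED_OPS and dispatch are invented for illustration, not actual MXNet or
MKL-DNN internals.

    # Hypothetical fallback dispatcher; illustrative names only.
    SUPPORTED_OPS = {"Convolution", "FullyConnected", "Pooling"}

    class LayoutArray(object):
        """A tensor tagged with its current memory layout."""
        def __init__(self, data, layout="default", is_view=False):
            self.data, self.layout, self.is_view = data, layout, is_view

    def to_default_layout(arr):
        # Reordering back to the default layout is the conversion cost
        # that, per concern 3, subgraph alone cannot eliminate.
        return LayoutArray(arr.data, layout="default")

    def dispatch(op_name, arr):
        if op_name in SUPPORTED_OPS and not arr.is_view:
            return arr  # stay on the MKL-DNN-specific-layout path
        # Fallback: an unsupported op, or a view of another tensor,
        # forces a reorder to the default layout first.
        return to_default_layout(arr)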

To push the MKL-DNN backend from 'experimental' to 'GA' in the 1.3 release, we are working
intensively to add more unit tests and improve its stability. Hopefully, these fixes and tests
will be upstreamed and merged soon. Meanwhile, we are also trying to figure out how to improve
the subgraph solution so that it properly addresses the current issues and offers better
extensibility in the future.

Any comments and suggestions will be highly appreciated. Thanks.


-----Original Message-----
From: Zheng, Da [] 
Sent: Saturday, June 2, 2018 4:38 AM
Subject: A proposal for unified integration with external acceleration libraries

Hello all,

We would like to propose a new mechanism that unifies the integration with most of the external
acceleration libraries, including TVM, MKLDNN, TensorRT and more. The main idea is to integrate
with the external libraries at the level of subgraphs instead of operators.
There are a few reasons in favor of the new integration:

  *   Integration at the level of operators mixes the external library operators, such as
MKLDNN's, with MXNet operators and makes the implementation of the executor overcomplicated.
We now have to deal with a lot of unexpected issues (the executor needs to carefully handle
data format conversion between different operators; the operators of external libraries are
subject to the same memory planning as other MXNet operators; etc.).
  *   External libraries need to reconstruct the computation graph for better performance
(e.g., operator fusion). Integration at the level of subgraphs allows external libraries to
perform arbitrary graph transformation and computation (a toy partitioning sketch follows
this list).
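
As a rough illustration of the kind of graph transformation this enables, here is a toy
partitioner that groups a run of fusable operators into a single subgraph node. The pattern
matching and names are invented for illustration and are not the proposed API.

    # Toy partitioner: collapse consecutive fusable ops into one
    # subgraph node. Invented names; not the proposed MXNet API.
    FUSABLE = {"Convolution", "BatchNorm", "Activation"}

    def partition(graph):
        out, run = [], []
        for op in graph:
            if op in FUSABLE:
                run.append(op)
                continue
            if run:
                out.append("Subgraph(%s)" % "+".join(run))
                run = []
            out.append(op)
        if run:
            out.append("Subgraph(%s)" % "+".join(run))
        return out

    print(partition(["Convolution", "BatchNorm", "Activation",
                     "Flatten", "FullyConnected"]))
    # -> ['Subgraph(Convolution+BatchNorm+Activation)',
    #     'Flatten', 'FullyConnected']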

The proposal below provides both the design and the API for constructing subgraphs and executing
them.

Please let me know if you have any comments on this design and API.
