tvm-dev mailing list archives

From Jun Yang <>
Subject Re: [dmlc/tvm] [RFC] Support Tensorflow Op Bridge (#3059)
Date Sun, 21 Apr 2019 05:34:19 GMT
@kovasb Nice to see your interest in our TVM&TF NMT article :)

We have also had some internal discussions about adding a non-TF DL compiler backend
to TF as a complement to XLA, and TVM is absolutely one of the great choices.

There are some principles I think we might need to follow to ensure a smooth integration:
1. TVM-related support should live in a standalone GitHub repository to keep TF and TVM
loosely coupled.
2. The concrete way to achieve this loose coupling is to leverage TF's graph optimization
registration mechanism, which is invoked at TF runtime.
3. A new graph pass can be added on top of the TF graph optimization framework (just like
TF XLA's MarkForCompilation, EncapsulateSubGraph, BuildXLALaunchOp). It would recognize the
portions of the TF graph that we think might benefit from the TVM backend, cluster those
TF operations into a TF2TVMCompilation (or some other name) sub-graph, and finally replace the
clustered ops with a single TF2TVMBridgeOp macro op.
4. During the initial run of TF2TVMBridgeOp, compile the underlying TF ops into backend executables
through the TVM infrastructure. To keep the compilation phase smooth, an extra IR layer
may be necessary in addition to TVM's own IR architecture; this should be open for design.
5. After the initial run of TF2TVMBridgeOp, subsequent runs can invoke the compiled executable
directly. Another round of compilation may be necessary when the input data
shape of the TF2TVMBridgeOp changes (although TVM provides native support for dynamic shapes,
we may wish to push the performance boundary using static shape information).
6. The initial scenario where I personally think TVM can complement TF and XLA is its native support
for compute-intensive operations such as GEMM/Conv, which might be a good starting
point. For non-compute-intensive operations (such as add/mul/reduce/transpose), I think
XLA already provides good support, and we could follow XLA's infrastructure
to optimize these operations directly.
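
To make points 3 to 6 a bit more concrete, here is a rough, framework-free sketch in Python.
It is not TF or TVM code: the graph is simplified to a linear list of op type names (a real
TF graph is a DAG), and COMPUTE_INTENSIVE, cluster_ops, shape_of, BridgeOp and fake_compile
are hypothetical names made up for illustration. It only shows the two ideas: clustering
compute-intensive ops into candidate TVM sub-graphs, and compiling once per input shape
with a cache so that later runs reuse the executable.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Shape = Tuple[int, ...]

# Assumption: op types we would hand to the TVM backend first (point 6).
COMPUTE_INTENSIVE = {"MatMul", "Conv2D"}


def cluster_ops(op_types: List[str]) -> List[Tuple[str, List[int]]]:
    """Split a linearized op list into consecutive "tvm" / "tf" segments (point 3)."""
    segments: List[Tuple[str, List[int]]] = []
    for index, op_type in enumerate(op_types):
        target = "tvm" if op_type in COMPUTE_INTENSIVE else "tf"
        if segments and segments[-1][0] == target:
            segments[-1][1].append(index)      # extend the current cluster
        else:
            segments.append((target, [index])) # start a new cluster
    return segments


def shape_of(value) -> Shape:
    """Shape of a nested Python list, e.g. [[1, 2], [3, 4]] -> (2, 2)."""
    dims = []
    while isinstance(value, list):
        dims.append(len(value))
        value = value[0] if value else None
    return tuple(dims)


@dataclass
class BridgeOp:
    """Stand-in for the macro op: compile on first run, reuse afterwards (points 4, 5)."""
    compile_fn: Callable[[Tuple[Shape, ...]], Callable]
    cache: Dict[Tuple[Shape, ...], Callable] = field(default_factory=dict)
    compile_count: int = 0

    def run(self, *inputs):
        key = tuple(shape_of(x) for x in inputs)
        if key not in self.cache:              # first run, or the input shape changed
            self.cache[key] = self.compile_fn(key)
            self.compile_count += 1
        return self.cache[key](*inputs)


def fake_compile(shapes: Tuple[Shape, ...]) -> Callable:
    """Placeholder for the real compilation step; returns a dot-product 'executable'."""
    def executable(a, b):
        return sum(x * y for x, y in zip(a, b))
    return executable


if __name__ == "__main__":
    print(cluster_ops(["Add", "MatMul", "Conv2D", "Mul"]))
    op = BridgeOp(fake_compile)
    op.run([1, 2], [3, 4])        # compiles for shapes ((2,), (2,))
    op.run([5, 6], [7, 8])        # same shapes, reuses the cached executable
    op.run([1, 2, 3], [1, 1, 1])  # new shape, triggers another compilation
    print(op.compile_count)
```

Keying the cache on the full tuple of input shapes is the simplest choice; a real bridge op
would probably also need to bound the cache size and decide when to fall back to a
dynamic-shape build instead of recompiling.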

We have identified some scenarios that we estimate to be suitable for this feature and have already started
the design and refinement work. If you are interested, we would highly appreciate it if you could share
your concrete use case or jump into the design and discussion directly.

