mxnet-dev mailing list archives

From Pedro Larroy <>
Subject Re: [apache/incubator-mxnet] [RFC][mxnet 2.0][item 10.1] MXNet Imperative Op Invocation Overhead (#17097)
Date Fri, 27 Dec 2019 19:54:54 GMT
Thanks for the explanation. I'm not so concerned about the complexity of
dispatching. If I understood you correctly, the main benefit you describe
for the TVM project is not having to change the C API, but you still
need to do type checking at both ends, or at least at the receiving end
of the API, correct? I think we have discussed similar things in the past,
and we might have different views on strongly typed vs. dynamically typed. A
priori I prefer to see an API that can be evolved and changed; I find it
more explicit and clearer than what I think you do with PackedFunc, which I
have looked at briefly but not used extensively. If one is going to call
into the C API using pybind, does it make sense to layer a C++ API on top
of the C API for this?

Also these microbenchmarks are nice, but we also need to consider the
overhead in typical workloads and see if it's still significant.

CFFI is another alternative.

I couldn't access your pointers like:

On Thu, Dec 26, 2019 at 2:00 PM Tianqi Chen <>

> @larroy indeed every solution has tradeoffs, and these tradeoffs are
> discussed in the above posts when we compare solutions, and they are backed
> by benchmarks :) It would be great if you can also suggest potential
> tradeoffs here.
> When you expose an API from a typed language (C++) to a dynamic
> language (Python), you have to type-erase it, given that Python
> functions don't carry the types, and you have to pass the type information
> along. The only difference is where you do the type checking (that the
> Python type corresponds to the right C++ type) and the translation
> (converting to the C++ type).
> For example, in the case of pybind, the erasure is done implicitly when
> you call the Python function, then checking and translation happen when
> you call into the C++ function.
> In the case of creating a C API for each feature and wrapping things on the
> Python side, the type checking is done on the Python side, and the
> translation as well.
> In the case of the TVM FFI, the type translation is done on the
> Python/Cython side, while the type checking is done in C++.
> To dive deeper into the tradeoffs of the PackedFunc calling convention: the
> convention erases the type by storing a type code with the
> arguments. This brings the additional cost of passing arguments via the
> heap, as opposed to registers. So it might not be suited for inline
> functions that need to run on the order of 1e-9 s; however, for API
> functions that need to run at around the 1e-7 s or even 1e-8 s level, this
> convention is pretty good.
> In terms of the calling cost, it really depends on whether the caller and
> callee are strongly typed.
> - If the caller is strongly typed, then assigning the type code is O(1).
> - If the caller is dynamically typed (like Python), then we need a
> dispatcher to select the right type code.
> - If the callee is strongly typed, then the cost of checking is O(1): just
> check that the code is the correct one.
> - If the callee is dynamically typed, then a dispatch needs to happen, which
> adds another level of hashtable lookup, also O(1).
> As we can see, the only place where dispatching is necessary is the
> dynamic type handling case. Even in these cases, if there is a strong need
> for specialization, we can directly force the type by running the check on
> the caller side and passing in the right type code (the engineering burden
> is the same as wrapping the C API). However, the benchmarks suggest that
> the dynamic dispatching cost is reasonable and meets the API speed
> requirements.
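
To make the four caller/callee cases above concrete, here is a minimal
pure-Python sketch of a type-code calling convention. It is only an
illustration of the idea, not TVM's actual PackedFunc implementation: the
names TypeCode, pack_dynamic, pack_typed_int, typed_callee and
dynamic_callee are all hypothetical, and the real FFI packs arguments into
C arrays rather than Python tuples.

```python
from enum import IntEnum


class TypeCode(IntEnum):
    """Hypothetical type codes carried next to each erased argument."""
    INT = 0
    FLOAT = 1
    STR = 2


# Dynamic caller: one hash lookup maps a Python type to its type code.
_TYPE_TO_CODE = {int: TypeCode.INT, float: TypeCode.FLOAT, str: TypeCode.STR}


def pack_dynamic(*args):
    """Erase types from a dynamic caller: build (type_code, value) pairs."""
    packed = []
    for arg in args:
        code = _TYPE_TO_CODE.get(type(arg))
        if code is None:
            raise TypeError(f"unsupported argument type: {type(arg).__name__}")
        packed.append((code, arg))
    return packed


def pack_typed_int(value):
    """Strongly typed caller: the code is known up front, so no lookup."""
    return (TypeCode.INT, value)


def typed_callee(packed):
    """Strongly typed callee expecting (INT, INT): checking is a comparison."""
    (code_a, a), (code_b, b) = packed
    if code_a != TypeCode.INT or code_b != TypeCode.INT:
        raise TypeError("expected (INT, INT)")
    return a + b


# Dynamic callee: a second hash lookup selects a handler per type code.
_HANDLERS = {
    TypeCode.INT: lambda v: f"int:{v}",
    TypeCode.FLOAT: lambda v: f"float:{v}",
    TypeCode.STR: lambda v: f"str:{v}",
}


def dynamic_callee(packed):
    """Dynamically typed callee: dispatch on each argument's type code."""
    return [_HANDLERS[code](value) for code, value in packed]
```

For instance, typed_callee(pack_dynamic(2, 3)) goes through the dispatcher
on the caller side only, while typed_callee([pack_typed_int(2),
pack_typed_int(3)]) skips it entirely, which corresponds to forcing the
type code on the caller as described above.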
> Coming back to the tradeoff, the main tradeoff here is the engineering
> burden of keeping an hourglass design (with a fixed set of APIs) vs.
> efficiency. While my post did not suggest that TVM's FFI is a silver
> bullet, it does work pretty well for our use cases. Hope it helps.
