mxnet-dev mailing list archives

From Anirudh Subramanian <>
Subject Re: [apache/incubator-mxnet] [RFC] MXNet Multithreaded Inference Interface (#16431)
Date Thu, 10 Oct 2019 23:10:09 GMT
Thanks @marcoabreu ! 

> Will the new C-API functions be threadsafe in general? Speak, I can invoke them at any
> point in time from any thread without the need of a lock, sticky-thread or a thread hierarchy?
> (I'm thinking of the thread-safety being done on the backend level)

The issue I found with C API thread safety, especially for the cached op use case, was the
ThreadLocalStore. If we fix this issue, then the C APIs related to CreateCachedOp and
InvokeCachedOp should be threadsafe.
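
To illustrate the general kind of hazard a thread-local store can introduce, here is a minimal Python sketch. It is an analogy only: `_store`, `create_handle`, `has_handle`, and `lookup_from_worker` are hypothetical names, not MXNet's ThreadLocalStore or its C API, and the exact failure mode in MXNet may differ.

```python
import threading

# Hypothetical stand-in for a per-thread store; NOT MXNet's ThreadLocalStore.
_store = threading.local()


def create_handle(name):
    """Record a handle in the calling thread's private store."""
    if not hasattr(_store, "handles"):
        _store.handles = {}
    _store.handles[name] = object()


def has_handle(name):
    """Check whether the calling thread's store holds the handle."""
    return name in getattr(_store, "handles", {})


def lookup_from_worker(name):
    """Run has_handle on a fresh thread and return what that thread saw."""
    result = []
    worker = threading.Thread(target=lambda: result.append(has_handle(name)))
    worker.start()
    worker.join()
    return result[0]


# The handle is created on the main thread ...
create_handle("cached_op")
seen_by_creator = has_handle("cached_op")
# ... but a different thread gets its own, empty copy of the store.
seen_by_worker = lookup_from_worker("cached_op")
```

State written through a thread-local store in one thread is simply invisible to another thread, so an API that creates something on one thread and uses it on another cannot rely on such a store for shared state.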

> Will this also support the GPU use-case? Speak, the parameters are only copied into
> GPU memory once in the same fashion as you're describing for the CPU?

This should still support the single-GPU use case for 1.6. The multi-GPU inference use case
requires more verification at the cached op level.

> Do you think there's a path forward to make all inference-related C-APIs threadsafe instead
> of splitting off another execution branch?

I don't think we have such a strict split between inference and training APIs at the C API
level. For example, for the gluon cached op we call InvokeCachedOp for both training and
inference.

But if I rephrase your question as:
"Will I be able to do multi-threaded inference from every frontend API that I can use for
inference today?"
Right now, I am targeting only gluon, since most users have been directed towards gluon. The
other ways are the module API, the symbolic API, and the C Predict API. Supporting these other
frontend APIs requires the graph executor to be thread safe. This would definitely be a great
addition to MXNet, since it would ensure that users can do multi-threaded inference from any
of these APIs, but it is not something I currently have planned.
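
As a rough picture of what multi-threaded inference means here (parameters loaded once and shared, with several threads invoking the same model), the following is a minimal sketch in plain Python. `SharedLinearModel` is a hypothetical toy class and does not use gluon or any MXNet API.

```python
from concurrent.futures import ThreadPoolExecutor


class SharedLinearModel:
    """Hypothetical toy model: weights are set once and never mutated."""

    def __init__(self, weights, bias):
        self._weights = tuple(weights)  # immutable, safe to share across threads
        self._bias = bias

    def predict(self, features):
        # Only reads shared state; every intermediate value is local to this call.
        return sum(w * x for w, x in zip(self._weights, features)) + self._bias


# One model instance, created once, shared by all worker threads.
model = SharedLinearModel(weights=[1.0, 2.0], bias=0.5)
inputs = [[1.0, 1.0], [2.0, 0.0], [0.0, 3.0]]

with ThreadPoolExecutor(max_workers=3) as pool:
    # pool.map returns results in input order, regardless of completion order.
    outputs = list(pool.map(model.predict, inputs))
```

The sketch is safe only because `predict` never writes to shared state; any frontend path that mutates shared structures during a forward pass would need that code made thread safe first.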
