mxnet-commits mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] DickJC123 commented on a change in pull request #14006: Dual stream cudnn Convolution backward() with MXNET_GPU_WORKER_NSTREAMS=2.
Date Mon, 18 Feb 2019 23:37:14 GMT
URL: https://github.com/apache/incubator-mxnet/pull/14006#discussion_r257851551
 
 

 ##########
 File path: docs/faq/env_var.md
 ##########
 @@ -174,6 +174,12 @@ When USE_PROFILER is enabled in Makefile or CMake, the following environments ca
 
 ## Other Environment Variables
 
+* MXNET_GPU_WORKER_NSTREAMS
 
 Review comment:
   The short answer is yes: an operator with 3 inputs might make use of 3 streams in Backward(), so I did not want to propose an environment variable name like MXNET_GPU_WORKER_USE_DUAL_STREAM=0/1 that might soon become obsolete. On the other hand, Convolution only needs 2 streams, and I did not want to burden this enhancement with more complexity than is needed at this time. I propose that when we have a use case for 3 or more streams, we expand the implementation then and use that use case to test it.
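
   To illustrate why a count-valued variable ages better than a 0/1 flag, here is a minimal sketch of how such a setting could be read. It is not the code in this PR: `ParseNumStreams`, `GetNumGpuWorkerStreams`, the default of 1, and the clamping of out-of-range values are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cstdlib>

// Hypothetical helper: turns the raw environment value into a stream count.
// Assumptions (not confirmed by the PR): an unset or non-numeric value means
// 1 stream, and values outside [1, max_supported] are clamped into range.
inline int ParseNumStreams(const char* raw, int max_supported = 2) {
  if (raw == nullptr) return 1;
  char* end = nullptr;
  const long value = std::strtol(raw, &end, 10);
  if (end == raw) return 1;  // no digits were parsed
  const long upper = static_cast<long>(max_supported);
  return static_cast<int>(std::max(1L, std::min(value, upper)));
}

// Hypothetical wrapper that reads the variable discussed in this PR.
inline int GetNumGpuWorkerStreams(int max_supported = 2) {
  return ParseNumStreams(std::getenv("MXNET_GPU_WORKER_NSTREAMS"),
                         max_supported);
}
```

   In this sketch, supporting a third stream later only means raising `max_supported`. The variable name and every existing setting of 1 or 2 keep their meaning.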
   
   At the end of every kernel execution, GPU utilization falls off as the last grid blocks complete. With two streams, work from the second stream can fill these utilization gaps. I would guess that a third stream would not enhance this effect. On the other hand, suppose you had 3 small independent kernels that would each occupy a third of the GPU. In that case, 3 streams would be a win over 2.
   
   So it's good that you ask how we might expand this to 3 or more streams. The MXNET_GPU_WORKER_NSTREAMS environment variable would remain unchanged, though the documentation would indicate that the framework supports a value greater than 2. Existing uses of the env var would be preserved, so I think this could happen as part of a minor release. At the RunContext level, the GPUAuxStream* would be replaced by a std::vector<GPUAuxStream*>. The RunContext method get_gpu_aux_stream() might then be changed to RunContext::get_gpu_aux_stream(int aux_stream_id = 0), which would not break operator code that had started using the simpler aux_stream API proposed by this PR.
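
   The API change described above could look roughly like the sketch below. It is a simplified stand-in, not MXNet's actual code: the `GPUAuxStream` and `RunContext` structs here are reduced to the one member under discussion, and the `id` field, the member name `aux_streams`, and the assert-based range check are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for MXNet's GPUAuxStream. The `id` field is hypothetical and
// exists only so the example can tell streams apart.
struct GPUAuxStream {
  int id;
};

// Stand-in for RunContext, showing only the aux-stream part.
struct RunContext {
  // Proposed: a vector replaces the single GPUAuxStream* member.
  std::vector<GPUAuxStream*> aux_streams;

  // The default argument keeps existing zero-argument calls compiling
  // and returning the first auxiliary stream.
  GPUAuxStream* get_gpu_aux_stream(int aux_stream_id = 0) const {
    assert(aux_stream_id >= 0 &&
           static_cast<std::size_t>(aux_stream_id) < aux_streams.size());
    return aux_streams[static_cast<std::size_t>(aux_stream_id)];
  }
};
```

   Operator code written as `ctx.get_gpu_aux_stream()` would continue to get stream 0, while an operator that needs another stream could call `ctx.get_gpu_aux_stream(1)`.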

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
