mxnet-dev mailing list archives

From "Skalicky, Sam" <sska...@amazon.com.INVALID>
Subject Re: Proposal to Drop Cuda 10 in MXNet 2.0
Date Thu, 15 Oct 2020 01:28:05 GMT
Thanks for starting this discussion Zhaoqi!

Previously we decided to support the two most recent major versions of CUDA; since we recently
dropped support for CUDA 9, we now support only CUDA 10 & 11. I'm not opposed to changing that
standard, but we should think about the impact on users. Upgrading is painful (especially when forced).


For your case, maybe the best solution would be to use #if guards to start using the new
CUDA 11 features where possible, and leave the existing code in place until we decide to
deprecate CUDA 10.

Sam

On 10/14/20, 5:49 PM, "Zhu Zhaoqi" <zhaoqizhu96@gmail.com> wrote:




    Hi MXNet Community,

    I would like to bring up the idea of dropping CUDA 10 support for MXNet
    2.0. Currently we support cu101, cu102, and cu110.

    Supporting only CUDA 11 and onward would greatly benefit future development,
    since the newer nvcc supports C++17 features such as *if constexpr*.
    We have already updated our build tools, so C++17 is unblocked there.
    Also, as Leonard pointed out to me offline, the newly added libcu++
    lets us use C++ standard library facilities in GPU code.

    Personally, I have a use case for C++17 compile-time branching. Over the
    past week I was optimizing the argmin/argmax operator and needed to tweak
    the GPU axis-reduction kernel to support a custom struct (which stores both
    the index and the value). Without *if constexpr*, the following code will
    not compile, since OP takes either one or two parameters:

    > if (use_index)
    >   data = OP(val, index)
    > else
    >   data = OP(val)
    >

    The workarounds would be either to 1. branch via templates and write two
    kernels, or 2. use a memory hack like this:

    > *(reinterpret_cast<int*>(&data)) = index; // here data.idx is the first
    > member variable
    >

    Neither is a good solution: the first bloats the code base and the second
    obfuscates it. With C++17 support, however, we can express this much more
    cleanly:

    > if constexpr (use_index)
    >   data = OP(val, index)
    > else
    >   data = OP(val) // discarded branch, never instantiated
    >

    In general, I think the added language features will help keep our code
    base clean and speed up both optimization and feature-extension work. With
    that said, let's use this thread to collect opinions on future CUDA support.

    Thanks,
    Zhaoqi

    ref: CUDA 11 release notes
    https://developer.nvidia.com/blog/cuda-11-features-revealed/
