mxnet-dev mailing list archives

From Dick Carter
Subject Re: [apache/incubator-mxnet] [RFC] v1.8.0 release (#18800)
Date Wed, 05 Aug 2020 04:16:36 GMT
A major feature of CUDA 11 and cuDNN 8.0 is support for the new A100 GPU and its TensorFloat-32
(TF32) mode of computation.  I would like to include the PR
"Unittest tolerance handling improvements", which allows MXNet to use TF32 effectively.  The
PR also makes sensible adjustments to the unittest tolerances based on device context and
dtype, ensuring A100 compatibility with our unittest suite.
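To illustrate why the tolerances need to depend on dtype and device context, here is a rough sketch in plain NumPy. It is not the code from the PR. TF32 keeps float32's 8-bit exponent but only 10 explicit mantissa bits (the same width as float16), so a float32 test that passes at a tight tolerance on older GPUs can fail on A100. The names `round_to_tf32` and `pick_rtol`, and the tolerance values themselves, are hypothetical and chosen only for illustration.

```python
import numpy as np

def round_to_tf32(x):
    """Roughly emulate TF32 input rounding on float32 data.

    float32 has 23 explicit mantissa bits and TF32 keeps 10, so the low
    13 bits are rounded away. This is an approximation for finite values,
    not a bit-exact model of the hardware.
    """
    bits = np.atleast_1d(np.asarray(x, dtype=np.float32)).view(np.uint32)
    # Add half of the dropped range (2**12), then clear the low 13 bits.
    rounded = (bits + np.uint32(0x1000)) & np.uint32(0xFFFFE000)
    return rounded.view(np.float32)

# Hypothetical default relative tolerances per dtype (illustrative values only).
_DEFAULT_RTOL = {
    np.dtype(np.float16): 1e-2,
    np.dtype(np.float32): 1e-4,
    np.dtype(np.float64): 1e-7,
}

def pick_rtol(dtype, tf32_in_effect=False):
    """Pick a relative tolerance from the dtype and whether TF32 math is active."""
    dtype = np.dtype(dtype)
    if tf32_in_effect and dtype == np.dtype(np.float32):
        # TF32 has float16's mantissa width, so borrow float16's tolerance.
        return _DEFAULT_RTOL[np.dtype(np.float16)]
    return _DEFAULT_RTOL[dtype]

rng = np.random.default_rng(0)
x = rng.uniform(1.0, 2.0, size=1000).astype(np.float32)
x_tf32 = round_to_tf32(x)

# The float32 default is too tight once inputs are rounded to TF32 precision,
# while the TF32-aware tolerance accepts the same data.
passes_tight = np.allclose(x_tf32, x, rtol=pick_rtol(np.float32), atol=0.0)
passes_tf32 = np.allclose(x_tf32, x, rtol=pick_rtol(np.float32, tf32_in_effect=True), atol=0.0)
print("float32 tolerance:", passes_tight, "| TF32-aware tolerance:", passes_tf32)
```

In a real test suite the `tf32_in_effect` flag would come from the device context (for example, whether the test runs on a GPU with TF32 enabled) rather than being passed by hand.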

With cuDNN 8.0 also comes compatibility with CUDA Graph Capture.  I would like to include a
PR (near complete, but not yet submitted) that enables CUDA Graph use.  This will permit MXNet
to bypass much of the CPU preparation for launching identical kernel sequences, which are
common in deep learning training and inference workloads.
