tvm-commits mailing list archives

From GitBox <>
Subject [GitHub] [incubator-tvm] SXM-inspur opened a new pull request #5450: Optimizations of global_ave_pool for NHWC layout
Date Mon, 27 Apr 2020 07:33:08 GMT

SXM-inspur opened a new pull request #5450:

       global_ave_pool accounted for about 14.8% of the runtime of Resnet50_v2 with a batch size of 32
when Tensor Core is enabled on a Tesla T4 GPU. After the NHWC-layout optimizations in this PR,
its share of the runtime dropped to 0.134%. The unit test results are listed below, with latency
reported in ms. As the table shows, the optimizations improve performance significantly.
   <table border="1">
      <tr><th>After optimization</th></tr>
   </table>
   Table 1. Shape of input feature maps is batch x 7 x 7 x 2048.
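
For readers unfamiliar with the operator, the following is a minimal NumPy sketch of what global average pooling computes on an NHWC tensor: the mean over the H and W axes for every (batch, channel) pair. It is only a reference for the operator's semantics, not the optimized GPU schedule from this PR; the function name `global_avg_pool_nhwc` is hypothetical, and the batch size of 32 is taken from the description above.

```python
import numpy as np


def global_avg_pool_nhwc(x: np.ndarray) -> np.ndarray:
    """Reference global average pooling for an NHWC tensor (hypothetical helper).

    Averages over the spatial axes H and W and keeps them as size-1
    dimensions, so an input of shape (N, H, W, C) yields (N, 1, 1, C).
    """
    if x.ndim != 4:
        raise ValueError(f"expected a 4-D NHWC tensor, got shape {x.shape}")
    # Axes 1 and 2 are H and W in NHWC; the channel axis (3) is innermost.
    return x.mean(axis=(1, 2), keepdims=True)


# Input shape from Table 1 with batch = 32; the values are random placeholders.
x = np.random.rand(32, 7, 7, 2048).astype("float32")
y = global_avg_pool_nhwc(x)
print(y.shape)  # (32, 1, 1, 2048)
```

Because the channel axis is innermost in NHWC, the values for neighbouring channels at a given pixel sit next to each other in memory, which is why the layout matters for how the reduction is scheduled.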

This is an automated message from the Apache Git Service.