tvm-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From GitBox <...@apache.org>
Subject [GitHub] [incubator-tvm] roastduck commented on a change in pull request #5382: [TE] Fix MakeLoopNest for warp memory
Date Wed, 06 May 2020 06:24:44 GMT

roastduck commented on a change in pull request #5382:
URL: https://github.com/apache/incubator-tvm/pull/5382#discussion_r420568421



##########
File path: src/te/operation/op_util.cc
##########
@@ -164,9 +164,21 @@ MakeLoopNest(const Stage& stage,
         value_map[iv] = dom->min;
       } else {
         runtime::ThreadScope ts = runtime::ThreadScope::make(bind_iv->thread_tag);
-        if (stage->scope == "" || stage->scope == "warp" ||
+        if (stage->scope == "" ||
             static_cast<int>(runtime::StorageScope::make(stage->scope).rank) <=
ts.rank) {
           value_map[iv] = var;
+        } else if (stage->scope == "warp" && ts.rank == 1) {
+          // To determine whether a thread index is inside or outside a warp, we need
+          // to know the thread extent. We leave a warning for now.
+          if (ts.dim_index == 0) {
+            value_map[iv] = var;
+          } else {

Review comment:
       > threadIdx.x means a lower level of hierarchy than threadIdx.y and threadIdx.z
for "warp" scope -- "warp" memory is local to threadIdx.y and threadIdx.z, but shared by threadIdx.x.
One idea for you to consider is to change threadIdx.x's rank to 2 and keep threadIdx.y and
threadIdx.z's rank 1. It will avoid special casing "warp" here, but may cause other regression.
   
   I agree. Currently there are some hard-coded rank. It may be difficult to modify the rank
definition.
   
   > Does dom and/or bind_iv->dom have the extent for the checks?
   
   Yes, `dom` does. But where can we get the warp size? Warp size is not a constant value,
it is 32 for NVIDIA GPU and 64 for AMD GPU. The warp size information is stored in `Target`
class. Since `MakeLoopNest` is designed to be a target-irrelevant function, I don't think
it a good idea to add an argument to specify the target.
   
   > This example clarifies a lot. Is it possible to make it a test to cover the "warp"
specific code here?
   
   Actually I did try to run this example, but TVM somehow generated a correct code. It only
led to a wrong boundary checking bug in a very complex program.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



Mime
View raw message