tvm-commits mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [incubator-tvm] roastduck commented on issue #5307: [TIR] Make lower_warp_memory support extent(threadIdx.x) < warp_size
Date Sun, 12 Apr 2020 05:38:07 GMT
roastduck commented on issue #5307: [TIR] Make lower_warp_memory support extent(threadIdx.x) < warp_size
URL: https://github.com/apache/incubator-tvm/pull/5307#issuecomment-612567383
 
 
   One way to implement the 2nd approach is like this:
   
   1. At the beginning of the build process, we first detect whether there is any `extent(threadIdx.x) < warp_size`. If so, we modify the schedule. For example, if we get `extent(threadIdx.x) == 16`, we first split `threadIdx.y` with factor 2, and then fuse that `2` with `threadIdx.x`, so the extent of `threadIdx.x` becomes 32.
   2. Then we perform all lowering procedures under the assumption that `extent(threadIdx.x) == 32`.
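
The index arithmetic behind Step 1 can be sketched in plain Python. This is only an illustration, not TVM code: the function names are hypothetical, and the fused layout `ty_inner * extent_x + tx` is an assumption about the order in which the two axes are fused.

```python
WARP_SIZE = 32  # assumed warp size


def remap_thread_index(tx, ty, extent_x, warp_size=WARP_SIZE):
    """Map an original (threadIdx.x, threadIdx.y) pair to the pair seen after
    splitting threadIdx.y and fusing its inner part into threadIdx.x."""
    assert warp_size % extent_x == 0, "extent must divide the warp size"
    factor = warp_size // extent_x            # 2 when extent_x == 16
    ty_outer, ty_inner = divmod(ty, factor)   # split threadIdx.y by `factor`
    fused_x = ty_inner * extent_x + tx        # fuse the inner part with threadIdx.x
    return fused_x, ty_outer


def recover_thread_index(fused_x, ty_outer, extent_x, warp_size=WARP_SIZE):
    """Inverse mapping: recover the indices the user originally wrote."""
    factor = warp_size // extent_x
    ty_inner, tx = divmod(fused_x, extent_x)
    return tx, ty_outer * factor + ty_inner
```

Under these assumptions, every original occurrence of `threadIdx.x` would have to be replaced by the recovered `tx`, which is why the rewrite must be applied in all scopes.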
   
   The good thing is that this approach may reduce the complexity of every warp-related lowering procedure, not only `lower_warp_memory`.
   
   The bad thing is that it may be less intuitive to users, since the schedule they wrote is modified. This may hinder debugging, and debugging in TVM is already difficult. Note that in Step 1 we have to modify `threadIdx.x` in all scopes to keep the thread index consistent. Here is a more complex example, simplified from the algorithm I am currently working on.
   
   ```
   // a is shaped (n)
   // b is shaped (16)
   // c is shaped (n, 16)
   // extent of threadIdx.x == 16
   for (i.outer = 0; i.outer < n; i.outer += 16) {
       if (i.outer + threadIdx.x < n) {
           a.warp[i.outer + threadIdx.x] = a[i.outer + threadIdx.x]; // (1)
       }
       for (i.inner = 0; i.inner < min(16, n - i.outer); i.inner++) {
           c[i.outer + i.inner, threadIdx.x] += a.warp[i.outer + i.inner] * b[threadIdx.x]; // (2)
       }
   }
   ```
   
   The extent of `threadIdx.x` in both statements `(1)` and `(2)` will be fused to 32, which is a major modification to the schedule. Users may find the resulting program difficult to debug.
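
To illustrate why the rewritten program is harder to read, here is a small Python sketch of how the index in statement `(1)` might look after the rewrite. It is not output from TVM: the helper names are hypothetical, and the fused layout (original `threadIdx.x` in the low 16 values of the fused index) is an assumption.

```python
ORIG_EXTENT_X = 16  # extent of threadIdx.x in the user's schedule
WARP_SIZE = 32      # extent of threadIdx.x after the automatic fuse


def original_index(i_outer, tx):
    # Index used by statement (1) as the user wrote it.
    return i_outer + tx


def rewritten_index(i_outer, fused_tx):
    # Same access after the rewrite: the user's threadIdx.x has to be
    # recovered from the fused index (assumed layout: ty_inner * 16 + tx).
    return i_outer + fused_tx % ORIG_EXTENT_X


def lanes_touching(i_outer, element):
    # All lanes of the fused warp whose rewritten index hits `element`.
    return [lane for lane in range(WARP_SIZE)
            if rewritten_index(i_outer, lane) == element]
```

In this sketch, `lanes_touching(0, 3)` yields two lanes rather than one, and the index expression gains a `% 16` that the user never wrote, which is the kind of mismatch between the written and the lowered schedule described above.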

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
