tvm-commits mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [incubator-tvm] cbalint13 opened a new pull request #5805: [QUANTIZE] Add nn.batch_flatten as quantizable.
Date Sun, 14 Jun 2020 07:14:48 GMT

cbalint13 opened a new pull request #5805:
URL: https://github.com/apache/incubator-tvm/pull/5805


   This PR adds ```nn.batch_flatten``` as a quantizable layer.
   
   **Description**
   * ```nn.batch_flatten``` is commonly used before ```nn.dense``` in the final layers of a network.
   * This PR allows it to be included in the quantization process, which avoids casting back to ```float32``` before the flatten.
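
Why the flatten can stay in the integer domain can be illustrated with a minimal sketch. It assumes that `nn.batch_flatten` only rearranges elements, so it commutes with the per-tensor scale multiply that dequantization applies. The helpers below are hypothetical pure-Python stand-ins, not TVM APIs; the scale `0.0625` is the one that appears in the "Before" dump.

```python
# Hypothetical stand-ins for illustration only; these are not TVM functions.

def batch_flatten(x):
    """Collapse every dimension after the batch dimension into one."""
    def flat(v):
        if isinstance(v, list):
            out = []
            for item in v:
                out.extend(flat(item))
            return out
        return [v]
    return [flat(sample) for sample in x]


def dequantize(x, scale):
    """Multiply every element of a nested list by a per-tensor scale."""
    if isinstance(x, list):
        return [dequantize(v, scale) for v in x]
    return x * scale


# A small tensor of shape (1, 2, 2, 2) holding int8-range values.
q = [[[[1, -2], [3, 4]], [[-5, 6], [7, -8]]]]
scale = 0.0625

# "Before" ordering: dequantize to float first, then flatten.
float_path = batch_flatten(dequantize(q, scale))

# "After" ordering: flatten the integers, dequantize later (or never).
int_path = dequantize(batch_flatten(q), scale)

print(float_path == int_path)
```

Because both orderings produce the same values, the cast to ```float32``` ahead of the flatten is not needed for correctness, and the following ```nn.dense``` can consume ```int8``` input directly.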
   
   **Outcome**
   * Before
   ```
     %19 = nn.max_pool2d(%18, pool_size=[2, 2], strides=[2, 2], padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 50, 4, 4), int8] */;
     %20 = cast(%19, dtype="int8") /* ty=Tensor[(1, 50, 4, 4), int8] */;
     %21 = annotation.stop_fusion(%20) /* ty=Tensor[(1, 50, 4, 4), int8] */;
     %22 = cast(%21, dtype="float32") /* ty=Tensor[(1, 50, 4, 4), float32] */;
     %23 = multiply(%22, 0.0625f /* ty=float32 */) /* ty=Tensor[(1, 50, 4, 4), float32] */;
     %24 = nn.batch_flatten(%23) /* ty=Tensor[(1, 800), float32] */;
     %25 = nn.batch_flatten(%24) /* ty=Tensor[(1, 800), float32] */;
     %26 = nn.batch_flatten(%25) /* ty=Tensor[(1, 800), float32] */;
     %27 = nn.dense(%26, meta[relay.Constant][2] /* ty=Tensor[(512, 800), float32] */ /* ty=Tensor[(512, 800), float32] */, units=512) /* ty=Tensor[(1, 512), float32] */;
     %28 = nn.relu(%27) /* ty=Tensor[(1, 512), float32] */;
     %29 = nn.batch_flatten(%28) /* ty=Tensor[(1, 512), float32] */;
     %30 = nn.batch_flatten(%29) /* ty=Tensor[(1, 512), float32] */;
     nn.dense(%30, meta[relay.Constant][3] /* ty=Tensor[(10, 512), float32] */ /* ty=Tensor[(10, 512), float32] */, units=10) /* ty=Tensor[(1, 10), float32] */
   ```
   * After
   ```
     %19 = nn.max_pool2d(%18, pool_size=[2, 2], strides=[2, 2], padding=[0, 0, 0, 0]) /* ty=Tensor[(1, 50, 4, 4), int8] */;
     %20 = cast(%19, dtype="int8") /* ty=Tensor[(1, 50, 4, 4), int8] */;
     %21 = annotation.stop_fusion(%20) /* ty=Tensor[(1, 50, 4, 4), int8] */;
     %22 = nn.batch_flatten(%21) /* ty=Tensor[(1, 800), int8] */;
     %23 = nn.batch_flatten(%22) /* ty=Tensor[(1, 800), int8] */;
     %24 = nn.batch_flatten(%23) /* ty=Tensor[(1, 800), int8] */;
     %25 = clip(%24, a_min=-127f, a_max=127f) /* ty=Tensor[(1, 800), int8] */;
     %26 = nn.dense(%25, meta[relay.Constant][2] /* ty=Tensor[(512, 800), int8] */ /* ty=Tensor[(512, 800), int8] */, units=512, out_dtype="int32") /* ty=Tensor[(1, 512), int32] */;
     %27 = nn.relu(%26) /* ty=Tensor[(1, 512), int32] */;
     %28 = nn.batch_flatten(%27) /* ty=Tensor[(1, 512), int32] */;
     %29 = nn.batch_flatten(%28) /* ty=Tensor[(1, 512), int32] */;
     %30 = add(%29, 512 /* ty=int32 */) /* ty=Tensor[(1, 512), int32] */;
     %31 = right_shift(%30, 10 /* ty=int32 */) /* ty=Tensor[(1, 512), int32] */;
     %32 = clip(%31, a_min=-127f, a_max=127f) /* ty=Tensor[(1, 512), int32] */;
     %33 = cast(%32, dtype="int8") /* ty=Tensor[(1, 512), int8] */;
     %34 = nn.dense(%33, meta[relay.Constant][3] /* ty=Tensor[(10, 512), int8] */ /* ty=Tensor[(10, 512), int8] */, units=10, out_dtype="int32") /* ty=Tensor[(1, 10), int32] */;
   ```
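
The `add` / `right_shift` / `clip` / `cast` sequence at `%30` to `%33` in the "After" dump reads like a fixed-point requantization of the `int32` accumulator back to the `int8` range. A minimal sketch of that arithmetic on a single value follows, assuming the constant `512` is the rounding term for a shift of `10` (that is, half of `1 << 10`). The function name `requantize` is hypothetical and not a TVM API.

```python
def requantize(acc, shift=10, qmin=-127, qmax=127):
    """Scale an int32 accumulator down by 2**shift and clamp it.

    Mirrors the add -> right_shift -> clip steps from the dump; the
    final cast to int8 is implicit because the result fits in [-127, 127].
    """
    rounding = 1 << (shift - 1)          # 512 when shift == 10
    shifted = (acc + rounding) >> shift  # arithmetic shift, rounds half up
    return max(qmin, min(qmax, shifted))


for acc in (0, 511, 512, 1024, -513, 200000, -200000):
    print(acc, "->", requantize(acc))
```

Adding half of the divisor before shifting turns the truncating shift into round-to-nearest, and the clip keeps the value inside the symmetric range used elsewhere in the dump.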
   @vinx13, @siju-samuel @masahi @FrozenGene @ZihengJiang please help with the review.
   
   Thank you!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


