tvm-commits mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [incubator-tvm] sewardto commented on issue #4972: Performance regression of quantization on CUDA after [Relay][AutoTVM] Relay op strategy (#4644)
Date Sat, 07 Mar 2020 02:13:49 GMT
sewardto commented on issue #4972: Performance regression of quantization on CUDA after [Relay][AutoTVM]
Relay op strategy (#4644) 
URL: https://github.com/apache/incubator-tvm/issues/4972#issuecomment-596034110
 
 
   After auto-tuning on a 1070 Max-Q, inference is much faster:
   ```
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:92: Iteration: 0
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #0 fused_nn_conv2d_multiply_add_nn_relu: 180.928 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #1 fused_nn_max_pool2d_1: 29.4735 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #2 fused_multiply_round_clip_cast: 14.2788 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #3 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036_: 78.2669 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #4 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__3: 83.7044 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #5 fused_multiply_round_clip_cast_cast_left_shift_multiply_add_right_shift_cast_add_2320814265661055830_: 16.1587 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #6 fused_cast_25: 12.3677 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #7 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__1: 80.4706 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #8 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__4: 83.8792 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #9 fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multip_3103932645001264948__2: 15.3279 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #10 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__2: 27.4534 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #11 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__2: 90.9006 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #12 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__5: 101.824 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #13 fused_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multiply_ad_12564017943341662089__2: 12.3551 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #14 fused_cast_24: 10.8766 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #15 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__3: 100.764 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #16 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__6: 101.819 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #17 fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multip_3103932645001264948__1: 12.1372 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #18 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__1: 37.6922 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #19 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__4: 115.165 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #20 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__7: 150.852 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #21 fused_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multiply_ad_12564017943341662089__1: 10.9095 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #22 fused_cast_23: 9.8539 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #23 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__5: 150.405 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #24 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__8: 150.738 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #25 fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multip_3103932645001264948_: 11.0084 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #26 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588_: 36.0865 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #27 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__6: 161.054 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #28 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__9: 252.2 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #29 fused_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multiply_ad_12564017943341662089_: 9.9324 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #30 fused_cast_22: 9.2837 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #31 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__7: 252.951 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #32 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__10: 252.633 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #33 fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast: 9.8974 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #34 fused_nn_global_avg_pool2d_cast_multiply: 12.1224 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #35 fused_nn_batch_flatten_nn_batch_flatten_multiply: 9.1246 us/iter
   [10:09:15] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #36 fused_nn_dense_nn_bias_add: 22.1244 us/iter
   Node Name  Ops  Time(us)  Time(%)  Shape              Inputs  Outputs
   ---------  ---  --------  -------  -----              ------  -------
   fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__7  fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__7  252.951   9.31     (1, 512, 7, 7)     4       1
   fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__10  fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__10  252.633   9.298    (1, 512, 7, 7)     4       1
   fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__9  fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__9  252.2     9.282    (1, 512, 7, 7)     4       1
   fused_nn_conv2d_multiply_add_nn_relu  fused_nn_conv2d_multiply_add_nn_relu  180.928   6.659    (1, 64, 112, 112)  4       1
   fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__6  fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__6  161.054   5.928    (1, 512, 7, 7)     4       1
   fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__7  fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__7  150.852   5.552    (1, 256, 14, 14)   4       1
   fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__8  fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__8  150.738   5.548    (1, 256, 14, 14)   4       1
   fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__5  fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__5  150.405   5.536    (1, 256, 14, 14)   4       1
   fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__4  fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__4  115.165   4.239    (1, 256, 14, 14)   4       1
   fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__5  fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__5  101.824   3.748    (1, 128, 28, 28)   4       1
   fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__6  fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__6  101.819   3.747    (1, 128, 28, 28)   4       1
   fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__3  fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__3  100.764   3.709    (1, 128, 28, 28)   4       1
   fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__2  fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__2  90.901    3.346    (1, 128, 28, 28)   4       1
   fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__4  fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__4  83.879    3.087    (1, 64, 56, 56)    4       1
   fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__3  fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__3  83.704    3.081    (1, 64, 56, 56)    4       1
   fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__1  fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__1  80.471    2.962    (1, 64, 56, 56)    4       1
   fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036_  fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036_  78.267    2.881    (1, 64, 56, 56)    4       1
   fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__1  fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__1  37.692    1.387    (1, 256, 14, 14)   4       1
   fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588_  fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588_  36.087    1.328    (1, 512, 7, 7)     4       1
   fused_nn_max_pool2d_1  fused_nn_max_pool2d_1  29.474    1.085    (1, 64, 56, 56)    1       1
   fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__2  fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__2  27.453    1.01     (1, 128, 28, 28)   4       1
   fused_nn_dense_nn_bias_add  fused_nn_dense_nn_bias_add  22.124    0.814    (1, 1000)          3       1
   fused_multiply_round_clip_cast_cast_left_shift_multiply_add_right_shift_cast_add_2320814265661055830_  fused_multiply_round_clip_cast_cast_left_shift_multiply_add_right_shift_cast_add_2320814265661055830_  16.159    0.595    (1, 64, 56, 56)    2       1
   fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multip_3103932645001264948__2  fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multip_3103932645001264948__2  15.328    0.564    (1, 64, 56, 56)    2       1
   fused_multiply_round_clip_cast  fused_multiply_round_clip_cast  14.279    0.526    (1, 64, 56, 56)    1       1
   fused_cast_25  fused_cast_25  12.368    0.455    (1, 64, 56, 56)    1       1
   fused_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multiply_ad_12564017943341662089__2  fused_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multiply_ad_12564017943341662089__2  12.355    0.455    (1, 128, 28, 28)   2       1
   fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multip_3103932645001264948__1  fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multip_3103932645001264948__1  12.137    0.447    (1, 128, 28, 28)   2       1
   fused_nn_global_avg_pool2d_cast_multiply  fused_nn_global_avg_pool2d_cast_multiply  12.122    0.446    (1, 512, 1, 1)     1       1
   fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multip_3103932645001264948_  fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multip_3103932645001264948_  11.008    0.405    (1, 256, 14, 14)   2       1
   fused_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multiply_ad_12564017943341662089__1  fused_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multiply_ad_12564017943341662089__1  10.909    0.402    (1, 256, 14, 14)   2       1
   fused_cast_24  fused_cast_24  10.877    0.4      (1, 128, 28, 28)   1       1
   fused_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multiply_ad_12564017943341662089_  fused_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multiply_ad_12564017943341662089_  9.932     0.366    (1, 512, 7, 7)     2       1
   fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast  fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast  9.897     0.364    (1, 512, 7, 7)     2       1
   fused_cast_23  fused_cast_23  9.854     0.363    (1, 256, 14, 14)   1       1
   fused_cast_22  fused_cast_22  9.284     0.342    (1, 512, 7, 7)     1       1
   fused_nn_batch_flatten_nn_batch_flatten_multiply  fused_nn_batch_flatten_nn_batch_flatten_multiply  9.125     0.336    (1, 512)           1       1
   Total_time  -  2717.019  -        -                  -       -
   ```
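
To see which kind of op dominates the profile, the per-op lines of the debug output above can be aggregated with a short script. This is only a sketch: the regular expression is inferred from the log lines shown in this comment, the helper names `parse_op_times` and `time_by_kind` are made up for illustration, and the sample text is a three-line excerpt of the real log.

```python
import re
from collections import defaultdict

# Pattern inferred from the debug output above: "Op #<idx> <name>: <time> us/iter".
# \s+ also matches a newline, so entries wrapped by the mail archive still parse.
OP_LINE = re.compile(r"Op #(\d+) (\S+):\s+([\d.]+) us/iter")


def parse_op_times(log_text):
    """Return a list of (index, op_name, microseconds) tuples."""
    return [(int(i), name, float(t)) for i, name, t in OP_LINE.findall(log_text)]


def time_by_kind(ops):
    """Sum the time per coarse op kind: conv2d, dense, or other."""
    totals = defaultdict(float)
    for _, name, us in ops:
        if name.startswith("fused_nn_conv2d"):
            kind = "conv2d"
        elif name.startswith("fused_nn_dense"):
            kind = "dense"
        else:
            kind = "other"
        totals[kind] += us
    return dict(totals)


# Three entries excerpted from the log above; paste the full log here instead.
sample = """
Op #0 fused_nn_conv2d_multiply_add_nn_relu: 180.928 us/iter
Op #1 fused_nn_max_pool2d_1: 29.4735 us/iter
Op #36 fused_nn_dense_nn_bias_add: 22.1244 us/iter
"""

ops = parse_op_times(sample)
totals = time_by_kind(ops)
grand_total = sum(totals.values())
for kind, us in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{kind:8s} {us:10.3f} us  {100 * us / grand_total:5.1f}%")
```

Run on the full log, the same grouping shows how much of the 2717.019 us total is spent in the quantized conv2d kernels compared with the cast and requantize ops around them.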
   The log file follows.
   ```
   {"input": ["cuda -model=unknown", "conv2d_nchw_winograd.cuda", [["TENSOR", [1, 512, 7, 7], "int8"], ["TENSOR", [512, 512, 3, 3], "int8"], [1, 1], [1, 1, 1, 1], [1, 1], "int32"], {}], "config": {"index": 267206, "code_hash": null, "entity": [["tile_b", "sp", [-1, 1, 1, 1]], ["tile_y", "sp", [-1, 1, 16, 4]], ["tile_x", "sp", [-1, 1, 8, 2]], ["tile_rc", "sp", [-1, 16]], ["auto_unroll_max_step", "ot", 0], ["unroll_explicit", "ot", 1]]}, "result": [[0.00010122895650355499], 0, 2.20457124710083, 1583490003.2625034], "version": 0.2, "tvm_version": "0.7.dev1"}
   {"input": ["cuda -model=unknown", "conv2d_nchw.cuda", [["TENSOR", [1, 512, 7, 7], "int8"], ["TENSOR", [512, 512, 3, 3], "int8"], [1, 1], [1, 1, 1, 1], [1, 1], "int32"], {}], "config": {"index": 684894, "code_hash": null, "entity": [["tile_f", "sp", [-1, 1, 16, 1]], ["tile_y", "sp", [-1, 7, 1, 1]], ["tile_x", "sp", [-1, 1, 7, 1]], ["tile_rc", "sp", [-1, 16]], ["tile_ry", "sp", [-1, 3]], ["tile_rx", "sp", [-1, 3]], ["auto_unroll_max_step", "ot", 512], ["unroll_explicit", "ot", 1]]}, "result": [[0.00016264779663299663], 0, 2.149959087371826, 1583493321.5959847], "version": 0.2, "tvm_version": "0.7.dev1"}
   {"input": ["cuda -model=unknown", "conv2d_nchw.cuda", [["TENSOR", [1, 256, 14, 14], "int8"], ["TENSOR", [512, 256, 3, 3], "int8"], [2, 2], [1, 1, 1, 1], [1, 1], "int32"], {}], "config": {"index": 618454, "code_hash": null, "entity": [["tile_f", "sp", [-1, 1, 16, 1]], ["tile_y", "sp", [-1, 1, 1, 7]], ["tile_x", "sp", [-1, 1, 7, 1]], ["tile_rc", "sp", [-1, 16]], ["tile_ry", "sp", [-1, 3]], ["tile_rx", "sp", [-1, 3]], ["auto_unroll_max_step", "ot", 512], ["unroll_explicit", "ot", 1]]}, "result": [[0.00011399263832785345], 0, 2.7174792289733887, 1583495309.8544536], "version": 0.2, "tvm_version": "0.7.dev1"}
   {"input": ["cuda -model=unknown", "conv2d_nchw.cuda", [["TENSOR", [1, 256, 14, 14], "int8"], ["TENSOR", [512, 256, 1, 1], "int8"], [2, 2], [0, 0, 0, 0], [1, 1], "int32"], {}], "config": {"index": 142595, "code_hash": null, "entity": [["tile_f", "sp", [-1, 2, 16, 1]], ["tile_y", "sp", [-1, 1, 1, 1]], ["tile_x", "sp", [-1, 1, 7, 1]], ["tile_rc", "sp", [-1, 16]], ["tile_ry", "sp", [-1, 1]], ["tile_rx", "sp", [-1, 1]], ["auto_unroll_max_step", "ot", 512], ["unroll_explicit", "ot", 1]]}, "result": [[2.1068503167097867e-05], 0, 1.9730663299560547, 1583497423.7057633], "version": 0.2, "tvm_version": "0.7.dev1"}
   {"input": ["cuda -model=unknown", "conv2d_nchw_winograd.cuda", [["TENSOR", [1, 256, 14, 14], "int8"], ["TENSOR", [256, 256, 3, 3], "int8"], [1, 1], [1, 1, 1, 1], [1, 1], "int32"], {}], "config": {"index": 38688, "code_hash": null, "entity": [["tile_b", "sp", [-1, 1, 1, 1]], ["tile_y", "sp", [-1, 1, 64, 2]], ["tile_x", "sp", [-1, 7, 7, 1]], ["tile_rc", "sp", [-1, 32]], ["auto_unroll_max_step", "ot", 1500], ["unroll_explicit", "ot", 0]]}, "result": [[8.167179043743642e-05], 0, 2.713115692138672, 1583499212.5806458], "version": 0.2, "tvm_version": "0.7.dev1"}
   {"input": ["cuda -model=unknown", "conv2d_nchw.cuda", [["TENSOR", [1, 256, 14, 14], "int8"], ["TENSOR", [256, 256, 3, 3], "int8"], [1, 1], [1, 1, 1, 1], [1, 1], "int32"], {}], "config": {"index": 8973055, "code_hash": null, "entity": [["tile_f", "sp", [-1, 2, 8, 1]], ["tile_y", "sp", [-1, 1, 2, 7]], ["tile_x", "sp", [-1, 1, 7, 1]], ["tile_rc", "sp", [-1, 32]], ["tile_ry", "sp", [-1, 3]], ["tile_rx", "sp", [-1, 3]], ["auto_unroll_max_step", "ot", 1500], ["unroll_explicit", "ot", 1]]}, "result": [[0.00010543543769129865], 0, 2.6481783390045166, 1583501043.3432517], "version": 0.2, "tvm_version": "0.7.dev1"}
   {"input": ["cuda -model=unknown", "conv2d_nchw.cuda", [["TENSOR", [1, 128, 28, 28], "int8"], ["TENSOR", [256, 128, 3, 3], "int8"], [2, 2], [1, 1, 1, 1], [1, 1], "int32"], {}], "config": {"index": 8001541, "code_hash": null, "entity": [["tile_f", "sp", [-1, 2, 16, 1]], ["tile_y", "sp", [-1, 1, 2, 7]], ["tile_x", "sp", [-1, 1, 7, 1]], ["tile_rc", "sp", [-1, 32]], ["tile_ry", "sp", [-1, 3]], ["tile_rx", "sp", [-1, 3]], ["auto_unroll_max_step", "ot", 1500], ["unroll_explicit", "ot", 1]]}, "result": [[7.353533925290876e-05], 0, 2.8866312503814697, 1583503869.7059126], "version": 0.2, "tvm_version": "0.7.dev1"}
   {"input": ["cuda -model=unknown", "conv2d_nchw.cuda", [["TENSOR", [1, 128, 28, 28], "int8"], ["TENSOR", [256, 128, 1, 1], "int8"], [2, 2], [0, 0, 0, 0], [1, 1], "int32"], {}], "config": {"index": 1584197, "code_hash": null, "entity": [["tile_f", "sp", [-1, 4, 16, 1]], ["tile_y", "sp", [-1, 2, 1, 1]], ["tile_x", "sp", [-1, 1, 14, 1]], ["tile_rc", "sp", [-1, 32]], ["tile_ry", "sp", [-1, 1]], ["tile_rx", "sp", [-1, 1]], ["auto_unroll_max_step", "ot", 512], ["unroll_explicit", "ot", 1]]}, "result": [[1.4255146899458157e-05], 0, 2.4680140018463135, 1583507510.7302575], "version": 0.2, "tvm_version": "0.7.dev1"}
   {"input": ["cuda -model=unknown", "conv2d_nchw_winograd.cuda", [["TENSOR", [1, 128, 28, 28], "int8"], ["TENSOR", [128, 128, 3, 3], "int8"], [1, 1], [1, 1, 1, 1], [1, 1], "int32"], {}], "config": {"index": 543511, "code_hash": null, "entity": [["tile_b", "sp", [-1, 1, 1, 1]], ["tile_y", "sp", [-1, 2, 32, 1]], ["tile_x", "sp", [-1, 7, 28, 1]], ["tile_rc", "sp", [-1, 32]], ["auto_unroll_max_step", "ot", 1500], ["unroll_explicit", "ot", 1]]}, "result": [[8.007428490878938e-05], 0, 2.5169312953948975, 1583511425.53789], "version": 0.2, "tvm_version": "0.7.dev1"}
   {"input": ["cuda -model=unknown", "conv2d_nchw.cuda", [["TENSOR", [1, 128, 28, 28], "int8"], ["TENSOR", [128, 128, 3, 3], "int8"], [1, 1], [1, 1, 1, 1], [1, 1], "int32"], {}], "config": {"index": 36148587, "code_hash": null, "entity": [["tile_f", "sp", [-1, 2, 16, 1]], ["tile_y", "sp", [-1, 1, 2, 14]], ["tile_x", "sp", [-1, 1, 4, 1]], ["tile_rc", "sp", [-1, 16]], ["tile_ry", "sp", [-1, 3]], ["tile_rx", "sp", [-1, 3]], ["auto_unroll_max_step", "ot", 1500], ["unroll_explicit", "ot", 1]]}, "result": [[6.772174803149606e-05], 0, 2.3003950119018555, 1583516359.102587], "version": 0.2, "tvm_version": "0.7.dev1"}
   {"input": ["cuda -model=unknown", "conv2d_nchw.cuda", [["TENSOR", [1, 64, 56, 56], "int8"], ["TENSOR", [128, 64, 3, 3], "int8"], [2, 2], [1, 1, 1, 1], [1, 1], "int32"], {}], "config": {"index": 31741619, "code_hash": null, "entity": [["tile_f", "sp", [-1, 2, 16, 2]], ["tile_y", "sp", [-1, 1, 2, 7]], ["tile_x", "sp", [-1, 1, 7, 1]], ["tile_rc", "sp", [-1, 16]], ["tile_ry", "sp", [-1, 3]], ["tile_rx", "sp", [-1, 3]], ["auto_unroll_max_step", "ot", 1500], ["unroll_explicit", "ot", 1]]}, "result": [[5.370661902625084e-05], 0, 3.261442184448242, 1583518624.2030337], "version": 0.2, "tvm_version": "0.7.dev1"}
   {"input": ["cuda -model=unknown", "conv2d_nchw.cuda", [["TENSOR", [1, 64, 56, 56], "int8"], ["TENSOR", [128, 64, 1, 1], "int8"], [2, 2], [0, 0, 0, 0], [1, 1], "int32"], {}], "config": {"index": 2195842, "code_hash": null, "entity": [["tile_f", "sp", [-1, 1, 16, 4]], ["tile_y", "sp", [-1, 1, 1, 2]], ["tile_x", "sp", [-1, 1, 28, 1]], ["tile_rc", "sp", [-1, 16]], ["tile_ry", "sp", [-1, 1]], ["tile_rx", "sp", [-1, 1]], ["auto_unroll_max_step", "ot", 512], ["unroll_explicit", "ot", 0]]}, "result": [[1.1885200473884825e-05], 0, 2.094285726547241, 1583521724.6021314], "version": 0.2, "tvm_version": "0.7.dev1"}
   {"input": ["cuda -model=unknown", "conv2d_nchw_winograd.cuda", [["TENSOR", [1, 64, 56, 56], "int8"], ["TENSOR", [64, 64, 3, 3], "int8"], [1, 1], [1, 1, 1, 1], [1, 1], "int32"], {}], "config": {"index": 94814, "code_hash": null, "entity": [["tile_b", "sp", [-1, 1, 1, 1]], ["tile_y", "sp", [-1, 2, 8, 4]], ["tile_x", "sp", [-1, 1, 28, 1]], ["tile_rc", "sp", [-1, 16]], ["auto_unroll_max_step", "ot", 128], ["unroll_explicit", "ot", 0]]}, "result": [[6.607622816593886e-05], 0, 3.667783737182617, 1583526061.1939218], "version": 0.2, "tvm_version": "0.7.dev1"}
   {"input": ["cuda -model=unknown", "conv2d_nchw.cuda", [["TENSOR", [1, 64, 56, 56], "int8"], ["TENSOR", [64, 64, 3, 3], "int8"], [1, 1], [1, 1, 1, 1], [1, 1], "int32"], {}], "config": {"index": 88977971, "code_hash": null, "entity": [["tile_f", "sp", [-1, 2, 16, 2]], ["tile_y", "sp", [-1, 1, 2, 7]], ["tile_x", "sp", [-1, 1, 4, 2]], ["tile_rc", "sp", [-1, 16]], ["tile_ry", "sp", [-1, 3]], ["tile_rx", "sp", [-1, 3]], ["auto_unroll_max_step", "ot", 1500], ["unroll_explicit", "ot", 1]]}, "result": [[4.877310412853366e-05], 0, 2.9656577110290527, 1583528602.3562064], "version": 0.2, "tvm_version": "0.7.dev1"}
   {"input": ["cuda -model=unknown", "conv2d_nchw.cuda", [["TENSOR", [1, 3, 224, 224], "float32"], ["TENSOR", [64, 3, 7, 7], "float32"], [2, 2], [3, 3, 3, 3], [1, 1], "float32"], {}], "config": {"index": 36609153, "code_hash": null, "entity": [["tile_f", "sp", [-1, 8, 8, 1]], ["tile_y", "sp", [-1, 7, 1, 1]], ["tile_x", "sp", [-1, 1, 14, 1]], ["tile_rc", "sp", [-1, 1]], ["tile_ry", "sp", [-1, 7]], ["tile_rx", "sp", [-1, 7]], ["auto_unroll_max_step", "ot", 1500], ["unroll_explicit", "ot", 0]]}, "result": [[7.900877518104015e-05], 0, 2.1976771354675293, 1583533339.5779808], "version": 0.2, "tvm_version": "0.7.dev1"}
   
   ```
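
For anyone who wants to inspect the tuning log without loading TVM, each record is plain JSON and can be summarized with the standard library. The sketch below uses one record copied from the log above. The field interpretation is an assumption read off the records themselves: `input` holds the target, the task name and the task arguments, and `result` holds a list of measured costs (taken here to be seconds) followed by an error code. The helper name `summarize` is made up for illustration.

```python
import json

# One record copied from the log above, rejoined onto a single line.
record_line = (
    '{"input": ["cuda -model=unknown", "conv2d_nchw.cuda", '
    '[["TENSOR", [1, 256, 14, 14], "int8"], ["TENSOR", [512, 256, 1, 1], "int8"], '
    '[2, 2], [0, 0, 0, 0], [1, 1], "int32"], {}], '
    '"config": {"index": 142595, "code_hash": null, "entity": '
    '[["tile_f", "sp", [-1, 2, 16, 1]], ["tile_y", "sp", [-1, 1, 1, 1]], '
    '["tile_x", "sp", [-1, 1, 7, 1]], ["tile_rc", "sp", [-1, 16]], '
    '["tile_ry", "sp", [-1, 1]], ["tile_rx", "sp", [-1, 1]], '
    '["auto_unroll_max_step", "ot", 512], ["unroll_explicit", "ot", 1]]}, '
    '"result": [[2.1068503167097867e-05], 0, 1.9730663299560547, 1583497423.7057633], '
    '"version": 0.2, "tvm_version": "0.7.dev1"}'
)


def summarize(line):
    """Extract task name, tensor shapes, dtype and mean cost from one record."""
    rec = json.loads(line)
    target, task_name, args = rec["input"][0], rec["input"][1], rec["input"][2]
    costs, error_no = rec["result"][0], rec["result"][1]
    data, kernel = args[0], args[1]  # each is ["TENSOR", shape, dtype]
    return {
        "target": target,
        "task": task_name,
        "data_shape": data[1],
        "kernel_shape": kernel[1],
        "dtype": data[2],
        # Assumption: costs are in seconds, so scale to microseconds.
        "mean_us": 1e6 * sum(costs) / len(costs),
        "error_no": error_no,
    }


info = summarize(record_line)
print(f'{info["task"]} {info["data_shape"]} x {info["kernel_shape"]} '
      f'({info["dtype"]}): {info["mean_us"]:.2f} us, error_no={info["error_no"]}')
```

Applying `summarize` to every line of the log file gives a compact list of which template (`conv2d_nchw.cuda` or `conv2d_nchw_winograd.cuda`) was tuned for each workload and the best cost recorded for it.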
   
   However, the accuracy is still close to zero:
   ```
   Top1 Acc: 0.0026109660574412533, 1/383
   Top5 Acc: 0.010443864229765013, 4/383
   ```
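
As a sanity check on how low these numbers are, they can be compared with random guessing. The sketch below assumes a balanced 1000-class label set, which is only inferred from the `(1, 1000)` output shape of `fused_nn_dense_nn_bias_add` in the profile above. The counts 1, 4 and 383 are the ones reported in this comment.

```python
def accuracy(correct, total):
    """Fraction of correctly classified samples."""
    return correct / total


# Assumption: 1000 balanced classes, inferred from the (1, 1000) dense output.
num_classes = 1000
total = 383

top1 = accuracy(1, total)  # 1 of 383 samples correct at top-1
top5 = accuracy(4, total)  # 4 of 383 samples correct at top-5

# Expected accuracy of uniform random guessing over num_classes.
chance_top1 = 1 / num_classes
chance_top5 = 5 / num_classes

print(f"top1 {top1:.4%} vs chance {chance_top1:.4%}")
print(f"top5 {top5:.4%} vs chance {chance_top5:.4%}")
```

Under that assumption both figures are within a small factor of chance level, which is consistent with the quantized model producing essentially uninformative predictions rather than slightly degraded ones.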

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
