From: GitBox
To: commits@tvm.apache.org
Reply-To: dev@tvm.apache.org
Subject: [GitHub] [incubator-tvm] huochaitiantang opened a new pull request #5226: [CODEGEN][CUDA] Fix vector load
Date: Fri, 03 Apr 2020 09:28:42 -0000

huochaitiantang opened a new pull request #5226: [CODEGEN][CUDA] Fix vector load
URL: https://github.com/apache/incubator-tvm/pull/5226

* Fix high-low bit bug in `__pack_half2`.
* Do not emit a vector load by introducing extra statements that store into a temporary variable:

```
int _1;
int4 _2 = (make_int4)(
    ((((((int)blockIdx.x) * 64) + ((int)threadIdx.x)) - 3))+(16*0),
    ((((((int)blockIdx.x) * 64) + ((int)threadIdx.x)) - 3))+(16*1),
    ((((((int)blockIdx.x) * 64) + ((int)threadIdx.x)) - 3))+(16*2),
    ((((((int)blockIdx.x) * 64) + ((int)threadIdx.x)) - 3))+(16*3));
_1=(((signed char*)A)[_2.x] << 0);
_1=_1 & ~(0x000000ff << 8) |(((signed char*)A)[_2.y] << 8);
_1=_1 & ~(0x000000ff << 16) |(((signed char*)A)[_2.z] << 16);
_1=_1 & ~(0x000000ff << 24) |(((signed char*)A)[_2.w] << 24);
(( int*)(( signed char*)B + (((((int)blockIdx.x) * 88) + (((int)threadIdx.x) * 4)))))[0] =
    (((((int)threadIdx.x) < 3) || (19 <= ((int)threadIdx.x))) ? (int)0 : _1);
```

The code above is a padding kernel. The introduced variable `_1` is computed unconditionally, regardless of whether `_2.x`, `_2.y`, `_2.z`, and `_2.w` are valid indexes into `A`, so the loads from `A` also run for threads that only write padding and may use invalid indexes. Emit the following code instead, which keeps the loads inside the guarded branch of the conditional:

```
int4 _1 = (make_int4)(
    ((((((int)blockIdx.x) * 64) + ((int)threadIdx.x)) - 3))+(16*0),
    ((((((int)blockIdx.x) * 64) + ((int)threadIdx.x)) - 3))+(16*1),
    ((((((int)blockIdx.x) * 64) + ((int)threadIdx.x)) - 3))+(16*2),
    ((((((int)blockIdx.x) * 64) + ((int)threadIdx.x)) - 3))+(16*3));
(( int*)(( signed char*)B + (((((int)blockIdx.x) * 88) + (((int)threadIdx.x) * 4)))))[0] =
    (((((int)threadIdx.x) < 3) || (19 <= ((int)threadIdx.x))) ? (int)0 :
        ((0x000000ff << 0) & (((signed char*)A)[_1.x] << 0))|
        ((0x000000ff << 8) & (((signed char*)A)[_1.y] << 8))|
        ((0x000000ff << 16) & (((signed char*)A)[_1.z] << 16))|
        ((0x000000ff << 24) & (((signed char*)A)[_1.w] << 24)));
```

@vinx13, could you please help review? Thanks!

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org

With regards,
Apache Git Services
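
As an illustration of the packing expression in the pull request description above, here is a minimal host-side C++ sketch of the two ideas involved: each 8-bit lane is masked to its own byte before the lanes are OR-ed together, and the loads sit inside the conditional so they are skipped for padding elements. The helper names `pack_int8x4` and `padded_load` are hypothetical and are not part of TVM or of the emitted code. The sketch also masks through unsigned casts instead of the `0x000000ff << k` masks used in the emitted code, an assumption made here to avoid left-shifting negative values in portable C++.

```cpp
#include <cstdint>

// Hypothetical helper (not TVM code): pack four signed 8-bit lanes into
// one 32-bit word, lane 0 in the lowest byte. Each lane is reduced to
// 8 bits first, so the sign extension of a negative lane cannot spill
// into the bytes that belong to the other lanes.
static uint32_t pack_int8x4(const signed char* a, int i0, int i1, int i2, int i3) {
    return (static_cast<uint32_t>(static_cast<uint8_t>(a[i0])) << 0)  |
           (static_cast<uint32_t>(static_cast<uint8_t>(a[i1])) << 8)  |
           (static_cast<uint32_t>(static_cast<uint8_t>(a[i2])) << 16) |
           (static_cast<uint32_t>(static_cast<uint8_t>(a[i3])) << 24);
}

// Hypothetical helper mirroring the shape of the emitted ternary: the
// conditional operator evaluates only the selected branch, so `a` is
// never dereferenced when the element lies in the padding region.
static uint32_t padded_load(const signed char* a, bool in_padding,
                            int i0, int i1, int i2, int i3) {
    return in_padding ? 0u : pack_int8x4(a, i0, i1, i2, i3);
}
```

With the lanes `{1, -1, 2, -128}`, `pack_int8x4` yields `0x8002ff01`. Calling `padded_load` with `in_padding` set returns `0` without reading `a`, even when the indexes passed in would be invalid.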