From: mmccline@apache.org To: commits@hive.apache.org Date: Fri, 16 Feb 2018 15:52:31 -0000 In-Reply-To: <6e59fe8488f04b3984d2c68fc19487b2@git.apache.org> References: <6e59fe8488f04b3984d2c68fc19487b2@git.apache.org> Subject: [17/32] hive git commit: HIVE-18622: Vectorization: IF Statements, Comparisons, and more do not handle NULLs correctly (Matt McCline, reviewed by Sergey Shelukhin, Deepak Jaiswal, Vihang Karajgaonkar) http://git-wip-us.apache.org/repos/asf/hive/blob/a4689020/ql/src/test/results/clientpositive/llap/vector_decimal_aggregate.q.out ---------------------------------------------------------------------- diff --git a/ql/src/test/results/clientpositive/llap/vector_decimal_aggregate.q.out b/ql/src/test/results/clientpositive/llap/vector_decimal_aggregate.q.out index 32e2088..0a72b3f
100644 --- a/ql/src/test/results/clientpositive/llap/vector_decimal_aggregate.q.out +++ b/ql/src/test/results/clientpositive/llap/vector_decimal_aggregate.q.out @@ -20,6 +20,18 @@ POSTHOOK: Lineage: decimal_vgby.cdecimal1 EXPRESSION [(alltypesorc)alltypesorc.F POSTHOOK: Lineage: decimal_vgby.cdecimal2 EXPRESSION [(alltypesorc)alltypesorc.FieldSchema(name:cdouble, type:double, comment:null), ] POSTHOOK: Lineage: decimal_vgby.cdouble SIMPLE [(alltypesorc)alltypesorc.FieldSchema(name:cdouble, type:double, comment:null), ] POSTHOOK: Lineage: decimal_vgby.cint SIMPLE [(alltypesorc)alltypesorc.FieldSchema(name:cint, type:int, comment:null), ] +PREHOOK: query: insert into decimal_vgby values (NULL, NULL, NULL, NULL) +PREHOOK: type: QUERY +PREHOOK: Input: _dummy_database@_dummy_table +PREHOOK: Output: default@decimal_vgby +POSTHOOK: query: insert into decimal_vgby values (NULL, NULL, NULL, NULL) +POSTHOOK: type: QUERY +POSTHOOK: Input: _dummy_database@_dummy_table +POSTHOOK: Output: default@decimal_vgby +POSTHOOK: Lineage: decimal_vgby.cdecimal1 EXPRESSION [] +POSTHOOK: Lineage: decimal_vgby.cdecimal2 EXPRESSION [] +POSTHOOK: Lineage: decimal_vgby.cdouble EXPRESSION [] +POSTHOOK: Lineage: decimal_vgby.cint EXPRESSION [] PREHOOK: query: EXPLAIN VECTORIZATION DETAIL SELECT cint, COUNT(cdecimal1), MAX(cdecimal1), MIN(cdecimal1), SUM(cdecimal1), @@ -56,7 +68,7 @@ STAGE PLANS: Map Operator Tree: TableScan alias: decimal_vgby - Statistics: Num rows: 12288 Data size: 2661900 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 12289 Data size: 2662128 Basic stats: COMPLETE Column stats: NONE TableScan Vectorization: native: true vectorizationSchemaColumns: [0:cdouble:double, 1:cdecimal1:decimal(20,10), 2:cdecimal2:decimal(23,14), 3:cint:int, 4:ROW__ID:struct] @@ -67,7 +79,7 @@ STAGE PLANS: className: VectorSelectOperator native: true projectedOutputColumnNums: [1, 2, 3] - Statistics: Num rows: 12288 Data size: 2661900 Basic stats: COMPLETE Column stats: NONE + 
Statistics: Num rows: 12289 Data size: 2662128 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: count(cdecimal1), max(cdecimal1), min(cdecimal1), sum(cdecimal1), count(cdecimal2), max(cdecimal2), min(cdecimal2), sum(cdecimal2), count() Group By Vectorization: @@ -81,7 +93,7 @@ STAGE PLANS: keys: cint (type: int) mode: hash outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9 - Statistics: Num rows: 12288 Data size: 2661900 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 12289 Data size: 2662128 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: int) sort order: + @@ -92,7 +104,7 @@ STAGE PLANS: native: true nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true valueColumnNums: [1, 2, 3, 4, 5, 6, 7, 8, 9] - Statistics: Num rows: 12288 Data size: 2661900 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 12289 Data size: 2662128 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: bigint), _col2 (type: decimal(20,10)), _col3 (type: decimal(20,10)), _col4 (type: decimal(30,10)), _col5 (type: bigint), _col6 (type: decimal(23,14)), _col7 (type: decimal(23,14)), _col8 (type: decimal(33,14)), _col9 (type: bigint) Execution mode: vectorized, llap LLAP IO: all inputs @@ -140,14 +152,14 @@ STAGE PLANS: keys: KEY._col0 (type: int) mode: mergepartial outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9 - Statistics: Num rows: 6144 Data size: 1330950 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 6144 Data size: 1330955 Basic stats: COMPLETE Column stats: NONE Filter Operator Filter Vectorization: className: VectorFilterOperator native: true predicateExpression: 
FilterLongColGreaterLongScalar(col 9:bigint, val 1) predicate: (_col9 > 1) (type: boolean) - Statistics: Num rows: 2048 Data size: 443650 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2048 Data size: 443651 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col0 (type: int), _col1 (type: bigint), _col2 (type: decimal(20,10)), _col3 (type: decimal(20,10)), _col4 (type: decimal(30,10)), _col5 (type: bigint), _col6 (type: decimal(23,14)), _col7 (type: decimal(23,14)), _col8 (type: decimal(33,14)) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8 @@ -155,13 +167,13 @@ STAGE PLANS: className: VectorSelectOperator native: true projectedOutputColumnNums: [0, 1, 2, 3, 4, 5, 6, 7, 8] - Statistics: Num rows: 2048 Data size: 443650 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2048 Data size: 443651 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false File Sink Vectorization: className: VectorFileSinkOperator native: false - Statistics: Num rows: 2048 Data size: 443650 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2048 Data size: 443651 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat @@ -235,7 +247,7 @@ STAGE PLANS: Map Operator Tree: TableScan alias: decimal_vgby - Statistics: Num rows: 12288 Data size: 2661900 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 12289 Data size: 2662128 Basic stats: COMPLETE Column stats: NONE TableScan Vectorization: native: true vectorizationSchemaColumns: [0:cdouble:double, 1:cdecimal1:decimal(20,10), 2:cdecimal2:decimal(23,14), 3:cint:int, 4:ROW__ID:struct] @@ -246,7 +258,7 @@ STAGE PLANS: className: VectorSelectOperator native: true projectedOutputColumnNums: [1, 2, 3] - Statistics: Num rows: 12288 Data size: 2661900 Basic stats: COMPLETE Column stats: NONE 
+ Statistics: Num rows: 12289 Data size: 2662128 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: count(cdecimal1), max(cdecimal1), min(cdecimal1), sum(cdecimal1), avg(cdecimal1), stddev_pop(cdecimal1), stddev_samp(cdecimal1), count(cdecimal2), max(cdecimal2), min(cdecimal2), sum(cdecimal2), avg(cdecimal2), stddev_pop(cdecimal2), stddev_samp(cdecimal2), count() Group By Vectorization: @@ -260,7 +272,7 @@ STAGE PLANS: keys: cint (type: int) mode: hash outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15 - Statistics: Num rows: 12288 Data size: 2661900 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 12289 Data size: 2662128 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: int) sort order: + @@ -271,7 +283,7 @@ STAGE PLANS: native: true nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true valueColumnNums: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15] - Statistics: Num rows: 12288 Data size: 2661900 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 12289 Data size: 2662128 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: bigint), _col2 (type: decimal(20,10)), _col3 (type: decimal(20,10)), _col4 (type: decimal(30,10)), _col5 (type: struct), _col6 (type: struct), _col7 (type: struct), _col8 (type: bigint), _col9 (type: decimal(23,14)), _col10 (type: decimal(23,14)), _col11 (type: decimal(33,14)), _col12 (type: struct), _col13 (type: struct), _col14 (type: struct), _col15 (type: bigint) Execution mode: vectorized, llap LLAP IO: all inputs @@ -319,14 +331,14 @@ STAGE PLANS: keys: KEY._col0 (type: int) mode: mergepartial outputColumnNames: _col0, _col1, _col2, _col3, 
_col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15 - Statistics: Num rows: 6144 Data size: 1330950 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 6144 Data size: 1330955 Basic stats: COMPLETE Column stats: NONE Filter Operator Filter Vectorization: className: VectorFilterOperator native: true predicateExpression: FilterLongColGreaterLongScalar(col 15:bigint, val 1) predicate: (_col15 > 1) (type: boolean) - Statistics: Num rows: 2048 Data size: 443650 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2048 Data size: 443651 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col0 (type: int), _col1 (type: bigint), _col2 (type: decimal(20,10)), _col3 (type: decimal(20,10)), _col4 (type: decimal(30,10)), _col5 (type: decimal(24,14)), _col6 (type: double), _col7 (type: double), _col8 (type: bigint), _col9 (type: decimal(23,14)), _col10 (type: decimal(23,14)), _col11 (type: decimal(33,14)), _col12 (type: decimal(27,18)), _col13 (type: double), _col14 (type: double) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14 @@ -334,13 +346,13 @@ STAGE PLANS: className: VectorSelectOperator native: true projectedOutputColumnNums: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14] - Statistics: Num rows: 2048 Data size: 443650 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2048 Data size: 443651 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false File Sink Vectorization: className: VectorFileSinkOperator native: false - Statistics: Num rows: 2048 Data size: 443650 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2048 Data size: 443651 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat @@ -400,6 +412,18 @@ POSTHOOK: Lineage: 
decimal_vgby_small.cdecimal1 EXPRESSION [(alltypesorc)alltype POSTHOOK: Lineage: decimal_vgby_small.cdecimal2 EXPRESSION [(alltypesorc)alltypesorc.FieldSchema(name:cdouble, type:double, comment:null), ] POSTHOOK: Lineage: decimal_vgby_small.cdouble SIMPLE [(alltypesorc)alltypesorc.FieldSchema(name:cdouble, type:double, comment:null), ] POSTHOOK: Lineage: decimal_vgby_small.cint SIMPLE [(alltypesorc)alltypesorc.FieldSchema(name:cint, type:int, comment:null), ] +PREHOOK: query: insert into decimal_vgby_small values (NULL, NULL, NULL, NULL) +PREHOOK: type: QUERY +PREHOOK: Input: _dummy_database@_dummy_table +PREHOOK: Output: default@decimal_vgby_small +POSTHOOK: query: insert into decimal_vgby_small values (NULL, NULL, NULL, NULL) +POSTHOOK: type: QUERY +POSTHOOK: Input: _dummy_database@_dummy_table +POSTHOOK: Output: default@decimal_vgby_small +POSTHOOK: Lineage: decimal_vgby_small.cdecimal1 EXPRESSION [] +POSTHOOK: Lineage: decimal_vgby_small.cdecimal2 EXPRESSION [] +POSTHOOK: Lineage: decimal_vgby_small.cdouble EXPRESSION [] +POSTHOOK: Lineage: decimal_vgby_small.cint EXPRESSION [] PREHOOK: query: EXPLAIN VECTORIZATION DETAIL SELECT cint, COUNT(cdecimal1), MAX(cdecimal1), MIN(cdecimal1), SUM(cdecimal1), @@ -436,7 +460,7 @@ STAGE PLANS: Map Operator Tree: TableScan alias: decimal_vgby_small - Statistics: Num rows: 12288 Data size: 2661900 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 12289 Data size: 2662128 Basic stats: COMPLETE Column stats: NONE TableScan Vectorization: native: true vectorizationSchemaColumns: [0:cdouble:double, 1:cdecimal1:decimal(11,5), 2:cdecimal2:decimal(16,0), 3:cint:int, 4:ROW__ID:struct] @@ -447,7 +471,7 @@ STAGE PLANS: className: VectorSelectOperator native: true projectedOutputColumnNums: [1, 2, 3] - Statistics: Num rows: 12288 Data size: 2661900 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 12289 Data size: 2662128 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: 
count(cdecimal1), max(cdecimal1), min(cdecimal1), sum(cdecimal1), count(cdecimal2), max(cdecimal2), min(cdecimal2), sum(cdecimal2), count() Group By Vectorization: @@ -461,7 +485,7 @@ STAGE PLANS: keys: cint (type: int) mode: hash outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9 - Statistics: Num rows: 12288 Data size: 2661900 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 12289 Data size: 2662128 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: int) sort order: + @@ -472,7 +496,7 @@ STAGE PLANS: native: true nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true valueColumnNums: [1, 2, 3, 4, 5, 6, 7, 8, 9] - Statistics: Num rows: 12288 Data size: 2661900 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 12289 Data size: 2662128 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: bigint), _col2 (type: decimal(11,5)), _col3 (type: decimal(11,5)), _col4 (type: decimal(21,5)), _col5 (type: bigint), _col6 (type: decimal(16,0)), _col7 (type: decimal(16,0)), _col8 (type: decimal(26,0)), _col9 (type: bigint) Execution mode: vectorized, llap LLAP IO: no inputs @@ -521,14 +545,14 @@ STAGE PLANS: keys: KEY._col0 (type: int) mode: mergepartial outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9 - Statistics: Num rows: 6144 Data size: 1330950 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 6144 Data size: 1330955 Basic stats: COMPLETE Column stats: NONE Filter Operator Filter Vectorization: className: VectorFilterOperator native: true predicateExpression: FilterLongColGreaterLongScalar(col 9:bigint, val 1) predicate: (_col9 > 1) (type: boolean) - Statistics: Num rows: 2048 Data size: 443650 
Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2048 Data size: 443651 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col0 (type: int), _col1 (type: bigint), _col2 (type: decimal(11,5)), _col3 (type: decimal(11,5)), _col4 (type: decimal(21,5)), _col5 (type: bigint), _col6 (type: decimal(16,0)), _col7 (type: decimal(16,0)), _col8 (type: decimal(26,0)) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8 @@ -536,13 +560,13 @@ STAGE PLANS: className: VectorSelectOperator native: true projectedOutputColumnNums: [0, 1, 2, 3, 4, 5, 6, 7, 8] - Statistics: Num rows: 2048 Data size: 443650 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2048 Data size: 443651 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false File Sink Vectorization: className: VectorFileSinkOperator native: false - Statistics: Num rows: 2048 Data size: 443650 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2048 Data size: 443651 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat @@ -580,6 +604,25 @@ POSTHOOK: Input: default@decimal_vgby_small 6981 2 -515.62107 -515.62107 -1031.24214 3 6984454 -618 6983218 762 1 1531.21941 1531.21941 1531.21941 2 6984454 1834 6986288 NULL 3072 9318.43514 -4298.15135 5018444.11392 3072 11161 -5148 6010880 +PREHOOK: query: SELECT SUM(HASH(*)) +FROM (SELECT cint, + COUNT(cdecimal1), MAX(cdecimal1), MIN(cdecimal1), SUM(cdecimal1), + COUNT(cdecimal2), MAX(cdecimal2), MIN(cdecimal2), SUM(cdecimal2) + FROM decimal_vgby_small + GROUP BY cint) q +PREHOOK: type: QUERY +PREHOOK: Input: default@decimal_vgby_small +#### A masked pattern was here #### +POSTHOOK: query: SELECT SUM(HASH(*)) +FROM (SELECT cint, + COUNT(cdecimal1), MAX(cdecimal1), MIN(cdecimal1), SUM(cdecimal1), + COUNT(cdecimal2), MAX(cdecimal2), MIN(cdecimal2), 
SUM(cdecimal2) + FROM decimal_vgby_small + GROUP BY cint) q +POSTHOOK: type: QUERY +POSTHOOK: Input: default@decimal_vgby_small +#### A masked pattern was here #### +-18663521580 PREHOOK: query: EXPLAIN VECTORIZATION DETAIL SELECT cint, COUNT(cdecimal1), MAX(cdecimal1), MIN(cdecimal1), SUM(cdecimal1), AVG(cdecimal1), STDDEV_POP(cdecimal1), STDDEV_SAMP(cdecimal1), @@ -616,7 +659,7 @@ STAGE PLANS: Map Operator Tree: TableScan alias: decimal_vgby_small - Statistics: Num rows: 12288 Data size: 2661900 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 12289 Data size: 2662128 Basic stats: COMPLETE Column stats: NONE TableScan Vectorization: native: true vectorizationSchemaColumns: [0:cdouble:double, 1:cdecimal1:decimal(11,5), 2:cdecimal2:decimal(16,0), 3:cint:int, 4:ROW__ID:struct] @@ -627,7 +670,7 @@ STAGE PLANS: className: VectorSelectOperator native: true projectedOutputColumnNums: [1, 2, 3] - Statistics: Num rows: 12288 Data size: 2661900 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 12289 Data size: 2662128 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: count(cdecimal1), max(cdecimal1), min(cdecimal1), sum(cdecimal1), avg(cdecimal1), stddev_pop(cdecimal1), stddev_samp(cdecimal1), count(cdecimal2), max(cdecimal2), min(cdecimal2), sum(cdecimal2), avg(cdecimal2), stddev_pop(cdecimal2), stddev_samp(cdecimal2), count() Group By Vectorization: @@ -641,7 +684,7 @@ STAGE PLANS: keys: cint (type: int) mode: hash outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15 - Statistics: Num rows: 12288 Data size: 2661900 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 12289 Data size: 2662128 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: int) sort order: + @@ -652,7 +695,7 @@ STAGE PLANS: native: true nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, 
hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true valueColumnNums: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15] - Statistics: Num rows: 12288 Data size: 2661900 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 12289 Data size: 2662128 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: bigint), _col2 (type: decimal(11,5)), _col3 (type: decimal(11,5)), _col4 (type: decimal(21,5)), _col5 (type: struct), _col6 (type: struct), _col7 (type: struct), _col8 (type: bigint), _col9 (type: decimal(16,0)), _col10 (type: decimal(16,0)), _col11 (type: decimal(26,0)), _col12 (type: struct), _col13 (type: struct), _col14 (type: struct), _col15 (type: bigint) Execution mode: vectorized, llap LLAP IO: no inputs @@ -701,14 +744,14 @@ STAGE PLANS: keys: KEY._col0 (type: int) mode: mergepartial outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15 - Statistics: Num rows: 6144 Data size: 1330950 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 6144 Data size: 1330955 Basic stats: COMPLETE Column stats: NONE Filter Operator Filter Vectorization: className: VectorFilterOperator native: true predicateExpression: FilterLongColGreaterLongScalar(col 15:bigint, val 1) predicate: (_col15 > 1) (type: boolean) - Statistics: Num rows: 2048 Data size: 443650 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2048 Data size: 443651 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col0 (type: int), _col1 (type: bigint), _col2 (type: decimal(11,5)), _col3 (type: decimal(11,5)), _col4 (type: decimal(21,5)), _col5 (type: decimal(15,9)), _col6 (type: double), _col7 (type: double), _col8 (type: bigint), _col9 (type: decimal(16,0)), _col10 (type: decimal(16,0)), _col11 (type: decimal(26,0)), _col12 (type: 
decimal(20,4)), _col13 (type: double), _col14 (type: double) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14 @@ -716,13 +759,13 @@ STAGE PLANS: className: VectorSelectOperator native: true projectedOutputColumnNums: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14] - Statistics: Num rows: 2048 Data size: 443650 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2048 Data size: 443651 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false File Sink Vectorization: className: VectorFileSinkOperator native: false - Statistics: Num rows: 2048 Data size: 443650 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 2048 Data size: 443651 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat @@ -760,3 +803,22 @@ POSTHOOK: Input: default@decimal_vgby_small 6981 2 -515.62107 -515.62107 -1031.24214 -515.621070000 0.0 0.0 3 6984454 -618 6983218 2327739.3333 3292794.518850853 4032833.1995089175 762 1 1531.21941 1531.21941 1531.21941 1531.219410000 0.0 NULL 2 6984454 1834 6986288 3493144.0000 3491310.0 4937457.95244881 NULL 3072 9318.43514 -4298.15135 5018444.11392 1633.608110000 5695.483083909642 5696.410309489072 3072 11161 -5148 6010880 1956.6667 6821.647911041892 6822.758476439734 +PREHOOK: query: SELECT SUM(HASH(*)) +FROM (SELECT cint, + COUNT(cdecimal1), MAX(cdecimal1), MIN(cdecimal1), SUM(cdecimal1), AVG(cdecimal1), STDDEV_POP(cdecimal1), STDDEV_SAMP(cdecimal1), + COUNT(cdecimal2), MAX(cdecimal2), MIN(cdecimal2), SUM(cdecimal2), AVG(cdecimal2), STDDEV_POP(cdecimal2), STDDEV_SAMP(cdecimal2) + FROM decimal_vgby_small + GROUP BY cint) q +PREHOOK: type: QUERY +PREHOOK: Input: default@decimal_vgby_small +#### A masked pattern was here #### +POSTHOOK: query: SELECT SUM(HASH(*)) +FROM (SELECT cint, + COUNT(cdecimal1), 
MAX(cdecimal1), MIN(cdecimal1), SUM(cdecimal1), AVG(cdecimal1), STDDEV_POP(cdecimal1), STDDEV_SAMP(cdecimal1), + COUNT(cdecimal2), MAX(cdecimal2), MIN(cdecimal2), SUM(cdecimal2), AVG(cdecimal2), STDDEV_POP(cdecimal2), STDDEV_SAMP(cdecimal2) + FROM decimal_vgby_small + GROUP BY cint) q +POSTHOOK: type: QUERY +POSTHOOK: Input: default@decimal_vgby_small +#### A masked pattern was here #### +91757235680 http://git-wip-us.apache.org/repos/asf/hive/blob/a4689020/ql/src/test/results/clientpositive/llap/vector_decimal_expressions.q.out ---------------------------------------------------------------------- diff --git a/ql/src/test/results/clientpositive/llap/vector_decimal_expressions.q.out b/ql/src/test/results/clientpositive/llap/vector_decimal_expressions.q.out index d63eeb7..7dbe584 100644 --- a/ql/src/test/results/clientpositive/llap/vector_decimal_expressions.q.out +++ b/ql/src/test/results/clientpositive/llap/vector_decimal_expressions.q.out @@ -1,13 +1,30 @@ -PREHOOK: query: CREATE TABLE decimal_test STORED AS ORC AS SELECT cdouble, CAST (((cdouble*22.1)/37) AS DECIMAL(20,10)) AS cdecimal1, CAST (((cdouble*9.3)/13) AS DECIMAL(23,14)) AS cdecimal2 FROM alltypesorc -PREHOOK: type: CREATETABLE_AS_SELECT -PREHOOK: Input: default@alltypesorc +PREHOOK: query: CREATE TABLE decimal_test (cdouble double,cdecimal1 DECIMAL(20,10), cdecimal2 DECIMAL(23,14)) STORED AS ORC +PREHOOK: type: CREATETABLE PREHOOK: Output: database:default PREHOOK: Output: default@decimal_test -POSTHOOK: query: CREATE TABLE decimal_test STORED AS ORC AS SELECT cdouble, CAST (((cdouble*22.1)/37) AS DECIMAL(20,10)) AS cdecimal1, CAST (((cdouble*9.3)/13) AS DECIMAL(23,14)) AS cdecimal2 FROM alltypesorc -POSTHOOK: type: CREATETABLE_AS_SELECT -POSTHOOK: Input: default@alltypesorc +POSTHOOK: query: CREATE TABLE decimal_test (cdouble double,cdecimal1 DECIMAL(20,10), cdecimal2 DECIMAL(23,14)) STORED AS ORC +POSTHOOK: type: CREATETABLE POSTHOOK: Output: database:default POSTHOOK: Output: default@decimal_test 
+PREHOOK: query: insert into decimal_test values (NULL, NULL, NULL) +PREHOOK: type: QUERY +PREHOOK: Input: _dummy_database@_dummy_table +PREHOOK: Output: default@decimal_test +POSTHOOK: query: insert into decimal_test values (NULL, NULL, NULL) +POSTHOOK: type: QUERY +POSTHOOK: Input: _dummy_database@_dummy_table +POSTHOOK: Output: default@decimal_test +POSTHOOK: Lineage: decimal_test.cdecimal1 EXPRESSION [] +POSTHOOK: Lineage: decimal_test.cdecimal2 EXPRESSION [] +POSTHOOK: Lineage: decimal_test.cdouble EXPRESSION [] +PREHOOK: query: INSERT INTO TABLE decimal_test SELECT cdouble, CAST (((cdouble*22.1)/37) AS DECIMAL(20,10)) AS cdecimal1, CAST (((cdouble*9.3)/13) AS DECIMAL(23,14)) AS cdecimal2 FROM alltypesorc +PREHOOK: type: QUERY +PREHOOK: Input: default@alltypesorc +PREHOOK: Output: default@decimal_test +POSTHOOK: query: INSERT INTO TABLE decimal_test SELECT cdouble, CAST (((cdouble*22.1)/37) AS DECIMAL(20,10)) AS cdecimal1, CAST (((cdouble*9.3)/13) AS DECIMAL(23,14)) AS cdecimal2 FROM alltypesorc +POSTHOOK: type: QUERY +POSTHOOK: Input: default@alltypesorc +POSTHOOK: Output: default@decimal_test POSTHOOK: Lineage: decimal_test.cdecimal1 EXPRESSION [(alltypesorc)alltypesorc.FieldSchema(name:cdouble, type:double, comment:null), ] POSTHOOK: Lineage: decimal_test.cdecimal2 EXPRESSION [(alltypesorc)alltypesorc.FieldSchema(name:cdouble, type:double, comment:null), ] POSTHOOK: Lineage: decimal_test.cdouble SIMPLE [(alltypesorc)alltypesorc.FieldSchema(name:cdouble, type:double, comment:null), ] @@ -41,7 +58,7 @@ STAGE PLANS: Map Operator Tree: TableScan alias: decimal_test - Statistics: Num rows: 12288 Data size: 2708600 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 12289 Data size: 2708832 Basic stats: COMPLETE Column stats: NONE TableScan Vectorization: native: true vectorizationSchemaColumns: [0:cdouble:double, 1:cdecimal1:decimal(20,10), 2:cdecimal2:decimal(23,14), 3:ROW__ID:struct] @@ -158,6 +175,19 @@ POSTHOOK: Input: default@decimal_test 
1895.51268191268460 -1203.53347193346920 0.8371969190171 262050.87567567649292835 2.4972972973 862 1033 NULL 862 true 1033.0153846153846 862.4973 1033.0153846153846 1969-12-31 16:14:22.497297297 1909.95218295221550 -1212.70166320163100 0.8371797936946 266058.54729730725574014 9.0675675676 869 1040 NULL 869 true 1040.8846153846155 869.06757 1040.8846153846155 1969-12-31 16:14:29.067567567 1913.89022869026920 -1215.20207900203840 0.8371751679996 267156.82702703945592392 0.8594594595 870 1043 NULL 870 true 1043.0307692307692 870.85944 1043.0307692307692 1969-12-31 16:14:30.859459459 +PREHOOK: query: SELECT SUM(HASH(*)) +FROM (SELECT cdecimal1 + cdecimal2 as c1, cdecimal1 - (2*cdecimal2) as c2, ((cdecimal1+2.34)/cdecimal2) as c3, (cdecimal1 * (cdecimal2/3.4)) as c4, cdecimal1 % 10 as c5, CAST(cdecimal1 AS INT) as c6, CAST(cdecimal2 AS SMALLINT) as c7, CAST(cdecimal2 AS TINYINT) as c8, CAST(cdecimal1 AS BIGINT) as c9, CAST (cdecimal1 AS BOOLEAN) as c10, CAST(cdecimal2 AS DOUBLE) as c11, CAST(cdecimal1 AS FLOAT) as c12, CAST(cdecimal2 AS STRING) as c13, CAST(cdecimal1 AS TIMESTAMP) as c14 FROM decimal_test WHERE cdecimal1 > 0 AND cdecimal1 < 12345.5678 AND cdecimal2 != 0 AND cdecimal2 > 1000 AND cdouble IS NOT NULL +ORDER BY c1, c2, c3, c4, c5, c6, c7, c8, c9, c10, c11, c12, c13, c14) q +PREHOOK: type: QUERY +PREHOOK: Input: default@decimal_test +#### A masked pattern was here #### +POSTHOOK: query: SELECT SUM(HASH(*)) +FROM (SELECT cdecimal1 + cdecimal2 as c1, cdecimal1 - (2*cdecimal2) as c2, ((cdecimal1+2.34)/cdecimal2) as c3, (cdecimal1 * (cdecimal2/3.4)) as c4, cdecimal1 % 10 as c5, CAST(cdecimal1 AS INT) as c6, CAST(cdecimal2 AS SMALLINT) as c7, CAST(cdecimal2 AS TINYINT) as c8, CAST(cdecimal1 AS BIGINT) as c9, CAST (cdecimal1 AS BOOLEAN) as c10, CAST(cdecimal2 AS DOUBLE) as c11, CAST(cdecimal1 AS FLOAT) as c12, CAST(cdecimal2 AS STRING) as c13, CAST(cdecimal1 AS TIMESTAMP) as c14 FROM decimal_test WHERE cdecimal1 > 0 AND cdecimal1 < 12345.5678 AND cdecimal2 != 0 
AND cdecimal2 > 1000 AND cdouble IS NOT NULL +ORDER BY c1, c2, c3, c4, c5, c6, c7, c8, c9, c10, c11, c12, c13, c14) q +POSTHOOK: type: QUERY +POSTHOOK: Input: default@decimal_test +#### A masked pattern was here #### +-1300490595129 PREHOOK: query: CREATE TABLE decimal_test_small STORED AS ORC AS SELECT cdouble, CAST (((cdouble*22.1)/37) AS DECIMAL(10,3)) AS cdecimal1, CAST (((cdouble*9.3)/13) AS DECIMAL(7,2)) AS cdecimal2 FROM alltypesorc PREHOOK: type: CREATETABLE_AS_SELECT PREHOOK: Input: default@alltypesorc @@ -318,3 +348,16 @@ POSTHOOK: Input: default@decimal_test_small 1895.517 -1203.543 0.83719289075 262051.956361764 2.497 862 1033 NULL 862 true 1033.02 862.497 1033.02 1969-12-31 16:14:22.497 1909.948 -1212.692 0.83718392130 266057.499543968 9.068 869 1040 NULL 869 true 1040.88 869.068 1040.88 1969-12-31 16:14:29.068 1913.889 -1215.201 0.83717534491 267156.488691411 0.859 870 1043 NULL 870 true 1043.03 870.859 1043.03 1969-12-31 16:14:30.859 +PREHOOK: query: SELECT SUM(HASH(*)) +FROM (SELECT cdecimal1 + cdecimal2 as c1, cdecimal1 - (2*cdecimal2) as c2, ((cdecimal1+2.34)/cdecimal2) as c3, (cdecimal1 * (cdecimal2/3.4)) as c4, cdecimal1 % 10 as c5, CAST(cdecimal1 AS INT) as c6, CAST(cdecimal2 AS SMALLINT) as c7, CAST(cdecimal2 AS TINYINT) as c8, CAST(cdecimal1 AS BIGINT) as c9, CAST (cdecimal1 AS BOOLEAN) as c10, CAST(cdecimal2 AS DOUBLE) as c11, CAST(cdecimal1 AS FLOAT) as c12, CAST(cdecimal2 AS STRING) as c13, CAST(cdecimal1 AS TIMESTAMP) as c14 FROM decimal_test_small WHERE cdecimal1 > 0 AND cdecimal1 < 12345.5678 AND cdecimal2 != 0 AND cdecimal2 > 1000 AND cdouble IS NOT NULL +ORDER BY c1, c2, c3, c4, c5, c6, c7, c8, c9, c10, c11, c12, c13, c14) q +PREHOOK: type: QUERY +PREHOOK: Input: default@decimal_test_small +#### A masked pattern was here #### +POSTHOOK: query: SELECT SUM(HASH(*)) +FROM (SELECT cdecimal1 + cdecimal2 as c1, cdecimal1 - (2*cdecimal2) as c2, ((cdecimal1+2.34)/cdecimal2) as c3, (cdecimal1 * (cdecimal2/3.4)) as c4, cdecimal1 % 10 as c5, 
CAST(cdecimal1 AS INT) as c6, CAST(cdecimal2 AS SMALLINT) as c7, CAST(cdecimal2 AS TINYINT) as c8, CAST(cdecimal1 AS BIGINT) as c9, CAST (cdecimal1 AS BOOLEAN) as c10, CAST(cdecimal2 AS DOUBLE) as c11, CAST(cdecimal1 AS FLOAT) as c12, CAST(cdecimal2 AS STRING) as c13, CAST(cdecimal1 AS TIMESTAMP) as c14 FROM decimal_test_small WHERE cdecimal1 > 0 AND cdecimal1 < 12345.5678 AND cdecimal2 != 0 AND cdecimal2 > 1000 AND cdouble IS NOT NULL +ORDER BY c1, c2, c3, c4, c5, c6, c7, c8, c9, c10, c11, c12, c13, c14) q +POSTHOOK: type: QUERY +POSTHOOK: Input: default@decimal_test_small +#### A masked pattern was here #### +774841630076 http://git-wip-us.apache.org/repos/asf/hive/blob/a4689020/ql/src/test/results/clientpositive/llap/vector_decimal_math_funcs.q.out ---------------------------------------------------------------------- diff --git a/ql/src/test/results/clientpositive/llap/vector_decimal_math_funcs.q.out b/ql/src/test/results/clientpositive/llap/vector_decimal_math_funcs.q.out index 270b634..e9023a4 100644 --- a/ql/src/test/results/clientpositive/llap/vector_decimal_math_funcs.q.out +++ b/ql/src/test/results/clientpositive/llap/vector_decimal_math_funcs.q.out @@ -12,6 +12,18 @@ POSTHOOK: Lineage: decimal_test.cbigint SIMPLE [(alltypesorc)alltypesorc.FieldSc POSTHOOK: Lineage: decimal_test.cdecimal1 EXPRESSION [(alltypesorc)alltypesorc.FieldSchema(name:cdouble, type:double, comment:null), ] POSTHOOK: Lineage: decimal_test.cdecimal2 EXPRESSION [(alltypesorc)alltypesorc.FieldSchema(name:cdouble, type:double, comment:null), ] POSTHOOK: Lineage: decimal_test.cdouble SIMPLE [(alltypesorc)alltypesorc.FieldSchema(name:cdouble, type:double, comment:null), ] +PREHOOK: query: insert into decimal_test values (NULL, NULL, NULL, NULL) +PREHOOK: type: QUERY +PREHOOK: Input: _dummy_database@_dummy_table +PREHOOK: Output: default@decimal_test +POSTHOOK: query: insert into decimal_test values (NULL, NULL, NULL, NULL) +POSTHOOK: type: QUERY +POSTHOOK: Input: 
_dummy_database@_dummy_table +POSTHOOK: Output: default@decimal_test +POSTHOOK: Lineage: decimal_test.cbigint EXPRESSION [] +POSTHOOK: Lineage: decimal_test.cdecimal1 EXPRESSION [] +POSTHOOK: Lineage: decimal_test.cdecimal2 EXPRESSION [] +POSTHOOK: Lineage: decimal_test.cdouble EXPRESSION [] PREHOOK: query: explain vectorization detail select cdecimal1 @@ -103,7 +115,7 @@ STAGE PLANS: Map Operator Tree: TableScan alias: decimal_test - Statistics: Num rows: 12288 Data size: 1401000 Basic stats: COMPLETE Column stats: NONE + Statistics: Num rows: 12289 Data size: 1401120 Basic stats: COMPLETE Column stats: NONE TableScan Vectorization: native: true vectorizationSchemaColumns: [0:cbigint:bigint, 1:cdouble:double, 2:cdecimal1:decimal(20,10), 3:cdecimal2:decimal(23,14), 4:ROW__ID:struct]