From: mmccline@apache.org
To: commits@hive.apache.org
Date: Thu, 22 Jun 2017 23:40:49 -0000
Subject: [13/34] hive git commit: HIVE-16589: Vectorization: Support Complex Types and GroupBy modes PARTIAL2, FINAL, and COMPLETE for AVG, VARIANCE (Matt McCline, reviewed by Jason Dere)

http://git-wip-us.apache.org/repos/asf/hive/blob/92fbe256/ql/src/test/results/clientpositive/llap/vectorized_timestamp.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/llap/vectorized_timestamp.q.out b/ql/src/test/results/clientpositive/llap/vectorized_timestamp.q.out
index 82d5518..24f8d36 100644
--- a/ql/src/test/results/clientpositive/llap/vectorized_timestamp.q.out
+++ b/ql/src/test/results/clientpositive/llap/vectorized_timestamp.q.out
@@ -17,24 +17,49 @@ POSTHOOK: query: INSERT INTO TABLE test VALUES ('0001-01-01 00:00:00.000000000')
 POSTHOOK: type: QUERY
 POSTHOOK: Output: default@test
 POSTHOOK: Lineage: test.ts EXPRESSION [(values__tmp__table__1)values__tmp__table__1.FieldSchema(name:tmp_values_col1, type:string, comment:), ]
-PREHOOK: query: EXPLAIN VECTORIZATION EXPRESSION
+PREHOOK: query: EXPLAIN VECTORIZATION DETAIL
 SELECT ts FROM test
 PREHOOK: type: QUERY
-POSTHOOK: query: EXPLAIN VECTORIZATION EXPRESSION
+POSTHOOK: query: EXPLAIN VECTORIZATION DETAIL
 SELECT ts FROM test
 POSTHOOK: type: QUERY
-Plan optimized by CBO.
- -Stage-0 - Fetch Operator - limit:-1 - Stage-1 - Map 1 llap - File Output Operator [FS_2] - Select Operator [SEL_1] (rows=2 width=40) - Output:["_col0"] - TableScan [TS_0] (rows=2 width=40) - default@test,test,Tbl:COMPLETE,Col:NONE,Output:["ts"] +PLAN VECTORIZATION: + enabled: false + enabledConditionsNotMet: [hive.vectorized.execution.enabled IS false] + +STAGE DEPENDENCIES: + Stage-1 is a root stage + Stage-0 depends on stages: Stage-1 + +STAGE PLANS: + Stage: Stage-1 + Tez +#### A masked pattern was here #### + Vertices: + Map 1 + Map Operator Tree: + TableScan + alias: test + Statistics: Num rows: 2 Data size: 80 Basic stats: COMPLETE Column stats: NONE + Select Operator + expressions: ts (type: timestamp) + outputColumnNames: _col0 + Statistics: Num rows: 2 Data size: 80 Basic stats: COMPLETE Column stats: NONE + File Output Operator + compressed: false + Statistics: Num rows: 2 Data size: 80 Basic stats: COMPLETE Column stats: NONE + table: + input format: org.apache.hadoop.mapred.SequenceFileInputFormat + output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat + serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe + Execution mode: llap + LLAP IO: all inputs + + Stage: Stage-0 + Fetch Operator + limit: -1 + Processor Tree: + ListSink PREHOOK: query: SELECT ts FROM test PREHOOK: type: QUERY @@ -46,36 +71,6 @@ POSTHOOK: Input: default@test #### A masked pattern was here #### 0001-01-01 00:00:00 9999-12-31 23:59:59.999999999 -PREHOOK: query: EXPLAIN VECTORIZATION EXPRESSION -SELECT MIN(ts), MAX(ts), MAX(ts) - MIN(ts) FROM test -PREHOOK: type: QUERY -POSTHOOK: query: EXPLAIN VECTORIZATION EXPRESSION -SELECT MIN(ts), MAX(ts), MAX(ts) - MIN(ts) FROM test -POSTHOOK: type: QUERY -Plan optimized by CBO. - -Vertex dependency in root stage -Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE) - -Stage-0 - Fetch Operator - limit:-1 - Stage-1 - Reducer 2 llap - File Output Operator [FS_6] - Select Operator [SEL_5] (rows=1 width=80) - Output:["_col0","_col1","_col2"] - Group By Operator [GBY_4] (rows=1 width=80) - Output:["_col0","_col1"],aggregations:["min(VALUE._col0)","max(VALUE._col1)"] - <-Map 1 [CUSTOM_SIMPLE_EDGE] llap - PARTITION_ONLY_SHUFFLE [RS_3] - Group By Operator [GBY_2] (rows=1 width=80) - Output:["_col0","_col1"],aggregations:["min(ts)","max(ts)"] - Select Operator [SEL_1] (rows=2 width=40) - Output:["ts"] - TableScan [TS_0] (rows=2 width=40) - default@test,test,Tbl:COMPLETE,Col:NONE,Output:["ts"] - PREHOOK: query: SELECT MIN(ts), MAX(ts), MAX(ts) - MIN(ts) FROM test PREHOOK: type: QUERY PREHOOK: Input: default@test @@ -85,27 +80,6 @@ POSTHOOK: type: QUERY POSTHOOK: Input: default@test #### A masked pattern was here #### 0001-01-01 00:00:00 9999-12-31 23:59:59.999999999 3652060 23:59:59.999999999 -PREHOOK: query: EXPLAIN VECTORIZATION EXPRESSION -SELECT ts FROM test WHERE ts IN (timestamp '0001-01-01 00:00:00.000000000', timestamp '0002-02-02 00:00:00.000000000') -PREHOOK: type: QUERY -POSTHOOK: query: EXPLAIN VECTORIZATION EXPRESSION -SELECT ts FROM test WHERE ts IN (timestamp '0001-01-01 00:00:00.000000000', timestamp '0002-02-02 00:00:00.000000000') -POSTHOOK: type: QUERY -Plan optimized by CBO. 
- -Stage-0 - Fetch Operator - limit:-1 - Stage-1 - Map 1 llap - File Output Operator [FS_3] - Select Operator [SEL_2] (rows=1 width=40) - Output:["_col0"] - Filter Operator [FIL_4] (rows=1 width=40) - predicate:(ts) IN (0001-01-01 00:00:00.0, 0002-02-02 00:00:00.0) - TableScan [TS_0] (rows=2 width=40) - default@test,test,Tbl:COMPLETE,Col:NONE,Output:["ts"] - PREHOOK: query: SELECT ts FROM test WHERE ts IN (timestamp '0001-01-01 00:00:00.000000000', timestamp '0002-02-02 00:00:00.000000000') PREHOOK: type: QUERY PREHOOK: Input: default@test @@ -115,25 +89,6 @@ POSTHOOK: type: QUERY POSTHOOK: Input: default@test #### A masked pattern was here #### 0001-01-01 00:00:00 -PREHOOK: query: EXPLAIN VECTORIZATION EXPRESSION -SELECT ts FROM test -PREHOOK: type: QUERY -POSTHOOK: query: EXPLAIN VECTORIZATION EXPRESSION -SELECT ts FROM test -POSTHOOK: type: QUERY -Plan optimized by CBO. - -Stage-0 - Fetch Operator - limit:-1 - Stage-1 - Map 1 vectorized, llap - File Output Operator [FS_4] - Select Operator [SEL_3] (rows=2 width=40) - Output:["_col0"] - TableScan [TS_0] (rows=2 width=40) - default@test,test,Tbl:COMPLETE,Col:NONE,Output:["ts"] - PREHOOK: query: SELECT ts FROM test PREHOOK: type: QUERY PREHOOK: Input: default@test @@ -144,35 +99,136 @@ POSTHOOK: Input: default@test #### A masked pattern was here #### 0001-01-01 00:00:00 9999-12-31 23:59:59.999999999 -PREHOOK: query: EXPLAIN VECTORIZATION EXPRESSION +PREHOOK: query: EXPLAIN VECTORIZATION DETAIL SELECT MIN(ts), MAX(ts), MAX(ts) - MIN(ts) FROM test PREHOOK: type: QUERY -POSTHOOK: query: EXPLAIN VECTORIZATION EXPRESSION +POSTHOOK: query: EXPLAIN VECTORIZATION DETAIL SELECT MIN(ts), MAX(ts), MAX(ts) - MIN(ts) FROM test POSTHOOK: type: QUERY -Plan optimized by CBO. - -Vertex dependency in root stage -Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE) - -Stage-0 - Fetch Operator - limit:-1 - Stage-1 - Reducer 2 vectorized, llap - File Output Operator [FS_12] - Select Operator [SEL_11] (rows=1 width=80) - Output:["_col0","_col1","_col2"] - Group By Operator [GBY_10] (rows=1 width=80) - Output:["_col0","_col1"],aggregations:["min(VALUE._col0)","max(VALUE._col1)"] - <-Map 1 [CUSTOM_SIMPLE_EDGE] vectorized, llap - PARTITION_ONLY_SHUFFLE [RS_9] - Group By Operator [GBY_8] (rows=1 width=80) - Output:["_col0","_col1"],aggregations:["min(ts)","max(ts)"] - Select Operator [SEL_7] (rows=2 width=40) - Output:["ts"] - TableScan [TS_0] (rows=2 width=40) - default@test,test,Tbl:COMPLETE,Col:NONE,Output:["ts"] +PLAN VECTORIZATION: + enabled: true + enabledConditionsMet: [hive.vectorized.execution.enabled IS true] + +STAGE DEPENDENCIES: + Stage-1 is a root stage + Stage-0 depends on stages: Stage-1 + +STAGE PLANS: + Stage: Stage-1 + Tez +#### A masked pattern was here #### + Edges: + Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE) +#### A masked pattern was here #### + Vertices: + Map 1 + Map Operator Tree: + TableScan + alias: test + Statistics: Num rows: 2 Data size: 80 Basic stats: COMPLETE Column stats: NONE + TableScan Vectorization: + native: true + projectedOutputColumns: [0] + Select Operator + expressions: ts (type: timestamp) + outputColumnNames: ts + Select Vectorization: + className: VectorSelectOperator + native: true + projectedOutputColumns: [0] + Statistics: Num rows: 2 Data size: 80 Basic stats: COMPLETE Column stats: NONE + Group By Operator + aggregations: min(ts), max(ts) + Group By Vectorization: + aggregators: VectorUDAFMinTimestamp(col 0) -> timestamp, VectorUDAFMaxTimestamp(col 0) -> timestamp + className: VectorGroupByOperator + groupByMode: HASH + 
vectorOutput: true + native: false + vectorProcessingMode: HASH + projectedOutputColumns: [0, 1] + mode: hash + outputColumnNames: _col0, _col1 + Statistics: Num rows: 1 Data size: 80 Basic stats: COMPLETE Column stats: NONE + Reduce Output Operator + sort order: + Reduce Sink Vectorization: + className: VectorReduceSinkEmptyKeyOperator + keyColumns: [] + native: true + nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true + valueColumns: [0, 1] + Statistics: Num rows: 1 Data size: 80 Basic stats: COMPLETE Column stats: NONE + value expressions: _col0 (type: timestamp), _col1 (type: timestamp) + Execution mode: vectorized, llap + LLAP IO: all inputs + Map Vectorization: + enabled: true + enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true + groupByVectorOutput: true + inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat + allNative: false + usesVectorUDFAdaptor: false + vectorized: true + rowBatchContext: + dataColumnCount: 1 + includeColumns: [0] + dataColumns: ts:timestamp + partitionColumnCount: 0 + Reducer 2 + Execution mode: vectorized, llap + Reduce Vectorization: + enabled: true + enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true + reduceColumnNullOrder: + reduceColumnSortOrder: + groupByVectorOutput: true + allNative: false + usesVectorUDFAdaptor: false + vectorized: true + rowBatchContext: + dataColumnCount: 2 + dataColumns: VALUE._col0:timestamp, VALUE._col1:timestamp + partitionColumnCount: 0 + Reduce Operator Tree: + Group By Operator + aggregations: min(VALUE._col0), max(VALUE._col1) + Group By Vectorization: + aggregators: VectorUDAFMinTimestamp(col 0) -> timestamp, VectorUDAFMaxTimestamp(col 1) -> timestamp + className: VectorGroupByOperator + groupByMode: MERGEPARTIAL + vectorOutput: true + native: false + vectorProcessingMode: GLOBAL + projectedOutputColumns: [0, 1] + mode: mergepartial + outputColumnNames: _col0, _col1 + Statistics: Num rows: 1 Data size: 80 Basic stats: COMPLETE Column stats: NONE + Select Operator + expressions: _col0 (type: timestamp), _col1 (type: timestamp), (_col1 - _col0) (type: interval_day_time) + outputColumnNames: _col0, _col1, _col2 + Select Vectorization: + className: VectorSelectOperator + native: true + projectedOutputColumns: [0, 1, 2] + selectExpressions: TimestampColSubtractTimestampColumn(col 1, col 0) -> 2:interval_day_time + Statistics: Num rows: 1 Data size: 80 Basic stats: COMPLETE Column stats: NONE + File Output Operator + compressed: false + File Sink Vectorization: + className: VectorFileSinkOperator + native: false + Statistics: Num rows: 1 Data size: 80 Basic stats: COMPLETE Column stats: NONE + table: + input format: org.apache.hadoop.mapred.SequenceFileInputFormat + output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat + serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe + + Stage: Stage-0 + Fetch Operator + limit: -1 + Processor Tree: + ListSink PREHOOK: query: SELECT MIN(ts), MAX(ts), MAX(ts) - MIN(ts) FROM test PREHOOK: type: QUERY @@ -183,26 +239,79 @@ POSTHOOK: type: QUERY POSTHOOK: Input: default@test #### A masked pattern was here #### 0001-01-01 00:00:00 9999-12-31 23:59:59.999999999 3652060 23:59:59.999999999 -PREHOOK: query: EXPLAIN VECTORIZATION EXPRESSION +PREHOOK: query: EXPLAIN 
VECTORIZATION DETAIL SELECT ts FROM test WHERE ts IN (timestamp '0001-01-01 00:00:00.000000000', timestamp '0002-02-02 00:00:00.000000000') PREHOOK: type: QUERY -POSTHOOK: query: EXPLAIN VECTORIZATION EXPRESSION +POSTHOOK: query: EXPLAIN VECTORIZATION DETAIL SELECT ts FROM test WHERE ts IN (timestamp '0001-01-01 00:00:00.000000000', timestamp '0002-02-02 00:00:00.000000000') POSTHOOK: type: QUERY -Plan optimized by CBO. - -Stage-0 - Fetch Operator - limit:-1 - Stage-1 - Map 1 vectorized, llap - File Output Operator [FS_7] - Select Operator [SEL_6] (rows=1 width=40) - Output:["_col0"] - Filter Operator [FIL_5] (rows=1 width=40) - predicate:(ts) IN (0001-01-01 00:00:00.0, 0002-02-02 00:00:00.0) - TableScan [TS_0] (rows=2 width=40) - default@test,test,Tbl:COMPLETE,Col:NONE,Output:["ts"] +PLAN VECTORIZATION: + enabled: true + enabledConditionsMet: [hive.vectorized.execution.enabled IS true] + +STAGE DEPENDENCIES: + Stage-1 is a root stage + Stage-0 depends on stages: Stage-1 + +STAGE PLANS: + Stage: Stage-1 + Tez +#### A masked pattern was here #### + Vertices: + Map 1 + Map Operator Tree: + TableScan + alias: test + Statistics: Num rows: 2 Data size: 80 Basic stats: COMPLETE Column stats: NONE + TableScan Vectorization: + native: true + projectedOutputColumns: [0] + Filter Operator + Filter Vectorization: + className: VectorFilterOperator + native: true + predicateExpression: FilterTimestampColumnInList(col 0, values [0001-01-01 00:00:00.0, 0002-02-02 00:00:00.0]) -> boolean + predicate: (ts) IN (0001-01-01 00:00:00.0, 0002-02-02 00:00:00.0) (type: boolean) + Statistics: Num rows: 1 Data size: 40 Basic stats: COMPLETE Column stats: NONE + Select Operator + expressions: ts (type: timestamp) + outputColumnNames: _col0 + Select Vectorization: + className: VectorSelectOperator + native: true + projectedOutputColumns: [0] + Statistics: Num rows: 1 Data size: 40 Basic stats: COMPLETE Column stats: NONE + File Output Operator + compressed: false + File Sink Vectorization: + className: VectorFileSinkOperator + native: false + Statistics: Num rows: 1 Data size: 40 Basic stats: COMPLETE Column stats: NONE + table: + input format: org.apache.hadoop.mapred.SequenceFileInputFormat + output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat + serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe + Execution mode: vectorized, llap + LLAP IO: all inputs + Map Vectorization: + enabled: true + enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true + groupByVectorOutput: true + inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat + allNative: false + usesVectorUDFAdaptor: false + vectorized: true + rowBatchContext: + dataColumnCount: 1 + includeColumns: [0] + dataColumns: ts:timestamp + partitionColumnCount: 0 + + Stage: Stage-0 + Fetch Operator + limit: -1 + Processor Tree: + ListSink PREHOOK: query: SELECT ts FROM test WHERE ts IN (timestamp '0001-01-01 00:00:00.000000000', timestamp '0002-02-02 00:00:00.000000000') PREHOOK: type: QUERY @@ -213,3 +322,274 @@ POSTHOOK: type: QUERY POSTHOOK: Input: default@test #### A masked pattern was here #### 0001-01-01 00:00:00 +PREHOOK: query: EXPLAIN VECTORIZATION DETAIL +SELECT AVG(ts), CAST(AVG(ts) AS TIMESTAMP) FROM test +PREHOOK: type: QUERY +POSTHOOK: query: EXPLAIN VECTORIZATION DETAIL +SELECT AVG(ts), CAST(AVG(ts) AS TIMESTAMP) FROM test +POSTHOOK: type: QUERY +PLAN VECTORIZATION: + enabled: true + enabledConditionsMet: [hive.vectorized.execution.enabled IS true] + +STAGE DEPENDENCIES: + Stage-1 is a root stage + 
Stage-0 depends on stages: Stage-1 + +STAGE PLANS: + Stage: Stage-1 + Tez +#### A masked pattern was here #### + Edges: + Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE) +#### A masked pattern was here #### + Vertices: + Map 1 + Map Operator Tree: + TableScan + alias: test + Statistics: Num rows: 2 Data size: 80 Basic stats: COMPLETE Column stats: NONE + TableScan Vectorization: + native: true + projectedOutputColumns: [0] + Select Operator + expressions: ts (type: timestamp) + outputColumnNames: ts + Select Vectorization: + className: VectorSelectOperator + native: true + projectedOutputColumns: [0] + Statistics: Num rows: 2 Data size: 80 Basic stats: COMPLETE Column stats: NONE + Group By Operator + aggregations: avg(ts) + Group By Vectorization: + aggregators: VectorUDAFAvgTimestamp(col 0) -> struct + className: VectorGroupByOperator + groupByMode: HASH + vectorOutput: true + native: false + vectorProcessingMode: HASH + projectedOutputColumns: [0] + mode: hash + outputColumnNames: _col0 + Statistics: Num rows: 1 Data size: 112 Basic stats: COMPLETE Column stats: NONE + Reduce Output Operator + sort order: + Reduce Sink Vectorization: + className: VectorReduceSinkEmptyKeyOperator + keyColumns: [] + native: true + nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true + valueColumns: [0] + Statistics: Num rows: 1 Data size: 112 Basic stats: COMPLETE Column stats: NONE + value expressions: _col0 (type: struct) + Execution mode: vectorized, llap + LLAP IO: all inputs + Map Vectorization: + enabled: true + enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true + groupByVectorOutput: true + inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat + allNative: false + usesVectorUDFAdaptor: false + vectorized: true + rowBatchContext: + dataColumnCount: 1 + includeColumns: [0] + dataColumns: ts:timestamp + partitionColumnCount: 0 + Reducer 2 + Execution mode: vectorized, llap + Reduce Vectorization: + enabled: true + enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true + reduceColumnNullOrder: + reduceColumnSortOrder: + groupByVectorOutput: true + allNative: false + usesVectorUDFAdaptor: false + vectorized: true + rowBatchContext: + dataColumnCount: 1 + dataColumns: VALUE._col0:struct + partitionColumnCount: 0 + Reduce Operator Tree: + Group By Operator + aggregations: avg(VALUE._col0) + Group By Vectorization: + aggregators: VectorUDAFAvgFinal(col 0) -> double + className: VectorGroupByOperator + groupByMode: MERGEPARTIAL + vectorOutput: true + native: false + vectorProcessingMode: GLOBAL + projectedOutputColumns: [0] + mode: mergepartial + outputColumnNames: _col0 + Statistics: Num rows: 1 Data size: 112 Basic stats: COMPLETE Column stats: NONE + Select Operator + expressions: _col0 (type: double), CAST( _col0 AS TIMESTAMP) (type: timestamp) + outputColumnNames: _col0, _col1 + Select Vectorization: + className: VectorSelectOperator + native: true + projectedOutputColumns: [0, 1] + selectExpressions: CastDoubleToTimestamp(col 0) -> 1:timestamp + Statistics: Num rows: 1 Data size: 112 Basic stats: COMPLETE Column stats: NONE + File Output Operator + compressed: false + File Sink Vectorization: + className: VectorFileSinkOperator + native: false + Statistics: Num rows: 1 Data size: 112 Basic stats: COMPLETE Column 
stats: NONE + table: + input format: org.apache.hadoop.mapred.SequenceFileInputFormat + output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat + serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe + + Stage: Stage-0 + Fetch Operator + limit: -1 + Processor Tree: + ListSink + +PREHOOK: query: SELECT AVG(ts), CAST(AVG(ts) AS TIMESTAMP) FROM test +PREHOOK: type: QUERY +PREHOOK: Input: default@test +#### A masked pattern was here #### +POSTHOOK: query: SELECT AVG(ts), CAST(AVG(ts) AS TIMESTAMP) FROM test +POSTHOOK: type: QUERY +POSTHOOK: Input: default@test +#### A masked pattern was here #### +9.56332944E10 5000-07-01 13:00:00 +PREHOOK: query: EXPLAIN VECTORIZATION DETAIL +SELECT variance(ts), var_pop(ts), var_samp(ts), std(ts), stddev(ts), stddev_pop(ts), stddev_samp(ts) FROM test +PREHOOK: type: QUERY +POSTHOOK: query: EXPLAIN VECTORIZATION DETAIL +SELECT variance(ts), var_pop(ts), var_samp(ts), std(ts), stddev(ts), stddev_pop(ts), stddev_samp(ts) FROM test +POSTHOOK: type: QUERY +PLAN VECTORIZATION: + enabled: true + enabledConditionsMet: [hive.vectorized.execution.enabled IS true] + +STAGE DEPENDENCIES: + Stage-1 is a root stage + Stage-0 depends on stages: Stage-1 + +STAGE PLANS: + Stage: Stage-1 + Tez +#### A masked pattern was here #### + Edges: + Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE) +#### A masked pattern was here #### + Vertices: + Map 1 + Map Operator Tree: + TableScan + alias: test + Statistics: Num rows: 2 Data size: 80 Basic stats: COMPLETE Column stats: NONE + TableScan Vectorization: + native: true + projectedOutputColumns: [0] + Select Operator + expressions: ts (type: timestamp) + outputColumnNames: ts + Select Vectorization: + className: VectorSelectOperator + native: true + projectedOutputColumns: [0] + Statistics: Num rows: 2 Data size: 80 Basic stats: COMPLETE Column stats: NONE + Group By Operator + aggregations: variance(ts), var_pop(ts), var_samp(ts), std(ts), stddev(ts), stddev_pop(ts), stddev_samp(ts) + Group By Vectorization: + aggregators: VectorUDAFVarPopTimestamp(col 0) -> struct, VectorUDAFVarPopTimestamp(col 0) -> struct, VectorUDAFVarSampTimestamp(col 0) -> struct, VectorUDAFStdPopTimestamp(col 0) -> struct, VectorUDAFStdPopTimestamp(col 0) -> struct, VectorUDAFStdPopTimestamp(col 0) -> struct, VectorUDAFStdSampTimestamp(col 0) -> struct + className: VectorGroupByOperator + groupByMode: HASH + vectorOutput: true + native: false + vectorProcessingMode: HASH + projectedOutputColumns: [0, 1, 2, 3, 4, 5, 6] + mode: hash + outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6 + Statistics: Num rows: 1 Data size: 560 Basic stats: COMPLETE Column stats: NONE + Reduce Output Operator + sort order: + Reduce Sink Vectorization: + className: VectorReduceSinkEmptyKeyOperator + keyColumns: [] + native: true + nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true + valueColumns: [0, 1, 2, 3, 4, 5, 6] + Statistics: Num rows: 1 Data size: 560 Basic stats: COMPLETE Column stats: NONE + value expressions: _col0 (type: struct), _col1 (type: struct), _col2 (type: struct), _col3 (type: struct), _col4 (type: struct), _col5 (type: struct), _col6 (type: struct) + Execution mode: vectorized, llap + LLAP IO: all inputs + Map Vectorization: + enabled: true + enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true + 
groupByVectorOutput: true + inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat + allNative: false + usesVectorUDFAdaptor: false + vectorized: true + rowBatchContext: + dataColumnCount: 1 + includeColumns: [0] + dataColumns: ts:timestamp + partitionColumnCount: 0 + Reducer 2 + Execution mode: vectorized, llap + Reduce Vectorization: + enabled: true + enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true + reduceColumnNullOrder: + reduceColumnSortOrder: + groupByVectorOutput: true + allNative: false + usesVectorUDFAdaptor: false + vectorized: true + rowBatchContext: + dataColumnCount: 7 + dataColumns: VALUE._col0:struct, VALUE._col1:struct, VALUE._col2:struct, VALUE._col3:struct, VALUE._col4:struct, VALUE._col5:struct, VALUE._col6:struct + partitionColumnCount: 0 + Reduce Operator Tree: + Group By Operator + aggregations: variance(VALUE._col0), var_pop(VALUE._col1), var_samp(VALUE._col2), std(VALUE._col3), stddev(VALUE._col4), stddev_pop(VALUE._col5), stddev_samp(VALUE._col6) + Group By Vectorization: + aggregators: VectorUDAFVarPopFinal(col 0) -> double, VectorUDAFVarPopFinal(col 1) -> double, VectorUDAFVarSampFinal(col 2) -> double, VectorUDAFStdPopFinal(col 3) -> double, VectorUDAFStdPopFinal(col 4) -> double, VectorUDAFStdPopFinal(col 5) -> double, VectorUDAFStdSampFinal(col 6) -> double + className: VectorGroupByOperator + groupByMode: MERGEPARTIAL + vectorOutput: true + native: false + vectorProcessingMode: GLOBAL + projectedOutputColumns: [0, 1, 2, 3, 4, 5, 6] + mode: mergepartial + outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6 + Statistics: Num rows: 1 Data size: 560 Basic stats: COMPLETE Column stats: NONE + File Output Operator + compressed: false + File Sink Vectorization: + className: VectorFileSinkOperator + native: false + Statistics: Num rows: 1 Data size: 560 Basic stats: COMPLETE Column stats: NONE + table: + input format: org.apache.hadoop.mapred.SequenceFileInputFormat + output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat + serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe + + Stage: Stage-0 + Fetch Operator + limit: -1 + Processor Tree: + ListSink + +PREHOOK: query: SELECT variance(ts), var_pop(ts), var_samp(ts), std(ts), stddev(ts), stddev_pop(ts), stddev_samp(ts) FROM test +PREHOOK: type: QUERY +PREHOOK: Input: default@test +#### A masked pattern was here #### +POSTHOOK: query: SELECT variance(ts), var_pop(ts), var_samp(ts), std(ts), stddev(ts), stddev_pop(ts), stddev_samp(ts) FROM test +POSTHOOK: type: QUERY +POSTHOOK: Input: default@test +#### A masked pattern was here #### +2.489106846793884E22 2.489106846793884E22 4.978213693587768E22 1.577690352E11 1.577690352E11 1.577690352E11 2.2311910930235822E11 http://git-wip-us.apache.org/repos/asf/hive/blob/92fbe256/ql/src/test/results/clientpositive/llap/vectorized_timestamp_funcs.q.out ---------------------------------------------------------------------- diff --git a/ql/src/test/results/clientpositive/llap/vectorized_timestamp_funcs.q.out b/ql/src/test/results/clientpositive/llap/vectorized_timestamp_funcs.q.out index e326f5f..f6dcb7c 100644 --- a/ql/src/test/results/clientpositive/llap/vectorized_timestamp_funcs.q.out +++ b/ql/src/test/results/clientpositive/llap/vectorized_timestamp_funcs.q.out @@ -809,8 +809,10 @@ STAGE PLANS: Group By Vectorization: aggregators: VectorUDAFMinTimestamp(col 0) -> timestamp, VectorUDAFMaxTimestamp(col 0) -> timestamp, VectorUDAFCount(col 0) -> bigint, 
VectorUDAFCountStar(*) -> bigint className: VectorGroupByOperator + groupByMode: HASH vectorOutput: true native: false + vectorProcessingMode: HASH projectedOutputColumns: [0, 1, 2, 3] mode: hash outputColumnNames: _col0, _col1, _col2, _col3 @@ -848,8 +850,10 @@ STAGE PLANS: Group By Vectorization: aggregators: VectorUDAFMinTimestamp(col 0) -> timestamp, VectorUDAFMaxTimestamp(col 1) -> timestamp, VectorUDAFCountMerge(col 2) -> bigint, VectorUDAFCountMerge(col 3) -> bigint className: VectorGroupByOperator + groupByMode: MERGEPARTIAL vectorOutput: true native: false + vectorProcessingMode: GLOBAL projectedOutputColumns: [0, 1, 2, 3] mode: mergepartial outputColumnNames: _col0, _col1, _col2, _col3 @@ -919,27 +923,48 @@ STAGE PLANS: TableScan alias: alltypesorc_string Statistics: Num rows: 40 Data size: 84 Basic stats: COMPLETE Column stats: NONE + TableScan Vectorization: + native: true + projectedOutputColumns: [0, 1] Select Operator expressions: ctimestamp1 (type: timestamp) outputColumnNames: ctimestamp1 + Select Vectorization: + className: VectorSelectOperator + native: true + projectedOutputColumns: [0] Statistics: Num rows: 40 Data size: 84 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: sum(ctimestamp1) + Group By Vectorization: + aggregators: VectorUDAFSumTimestamp(col 0) -> double + className: VectorGroupByOperator + groupByMode: HASH + vectorOutput: true + native: false + vectorProcessingMode: HASH + projectedOutputColumns: [0] mode: hash outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator sort order: + Reduce Sink Vectorization: + className: VectorReduceSinkEmptyKeyOperator + native: true + nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE value expressions: _col0 (type: double) - Execution mode: llap + Execution mode: vectorized, llap LLAP IO: all inputs Map Vectorization: enabled: true enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true + groupByVectorOutput: true inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat - notVectorizedReason: Aggregation Function expression for GROUPBY operator: Vectorization of aggreation should have succeeded org.apache.hadoop.hive.ql.metadata.HiveException: Vector aggregate not implemented: "sum" for type: "TIMESTAMP (UDAF evaluator mode = PARTIAL1) - vectorized: false + allNative: false + usesVectorUDFAdaptor: false + vectorized: true Reducer 2 Execution mode: vectorized, llap Reduce Vectorization: @@ -955,8 +980,10 @@ STAGE PLANS: Group By Vectorization: aggregators: VectorUDAFSumDouble(col 0) -> double className: VectorGroupByOperator + groupByMode: MERGEPARTIAL vectorOutput: true native: false + vectorProcessingMode: GLOBAL projectedOutputColumns: [0] mode: mergepartial outputColumnNames: _col0 @@ -1057,17 +1084,22 @@ STAGE PLANS: Group By Operator aggregations: avg(ctimestamp1), variance(ctimestamp1), var_pop(ctimestamp1), var_samp(ctimestamp1), std(ctimestamp1), stddev(ctimestamp1), stddev_pop(ctimestamp1), stddev_samp(ctimestamp1) Group By Vectorization: - aggregators: VectorUDAFAvgTimestamp(col 0) -> struct, VectorUDAFVarPopTimestamp(col 0) -> struct, VectorUDAFVarPopTimestamp(col 0) -> struct, VectorUDAFVarSampTimestamp(col 0) -> 
struct, VectorUDAFStdPopTimestamp(col 0) -> struct, VectorUDAFStdPopTimestamp(col 0) -> struct, VectorUDAFStdPopTimestamp(col 0) -> struct, VectorUDAFStdSampTimestamp(col 0) -> struct + aggregators: VectorUDAFAvgTimestamp(col 0) -> struct, VectorUDAFVarPopTimestamp(col 0) -> struct, VectorUDAFVarPopTimestamp(col 0) -> struct, VectorUDAFVarSampTimestamp(col 0) -> struct, VectorUDAFStdPopTimestamp(col 0) -> struct, VectorUDAFStdPopTimestamp(col 0) -> struct, VectorUDAFStdPopTimestamp(col 0) -> struct, VectorUDAFStdSampTimestamp(col 0) -> struct className: VectorGroupByOperator - vectorOutput: false + groupByMode: HASH + vectorOutput: true native: false + vectorProcessingMode: HASH projectedOutputColumns: [0, 1, 2, 3, 4, 5, 6, 7] - vectorOutputConditionsNotMet: Vector output of VectorUDAFAvgTimestamp(col 0) -> struct output type STRUCT requires PRIMITIVE IS false, Vector output of VectorUDAFVarPopTimestamp(col 0) -> struct output type STRUCT requires PRIMITIVE IS false, Vector output of VectorUDAFVarPopTimestamp(col 0) -> struct output type STRUCT requires PRIMITIVE IS false, Vector output of VectorUDAFVarSampTimestamp(col 0) -> struct output type STRUCT requires PRIMITIVE IS false, Vector output of VectorUDAFStdPopTimestamp(col 0) -> struct output type STRUCT requires PRIMITIVE IS false, Vector output of VectorUDAFStdPopTimestamp(col 0) -> struct output type STRUCT requires PRIMITIVE IS false, Vector output of VectorUDAFStdPopTimestamp(col 0) -> struct output type STRUCT requires PRIMITIVE IS false, Vector output of VectorUDAFStdSampTimestamp(col 0) -> struct output type STRUCT requires PRIMITIVE IS false mode: hash outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7 Statistics: Num rows: 1 Data size: 672 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator sort order: + Reduce Sink Vectorization: + className: VectorReduceSinkEmptyKeyOperator + native: true + nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true Statistics: Num rows: 1 Data size: 672 Basic stats: COMPLETE Column stats: NONE value expressions: _col0 (type: struct), _col1 (type: struct), _col2 (type: struct), _col3 (type: struct), _col4 (type: struct), _col5 (type: struct), _col6 (type: struct), _col7 (type: struct) Execution mode: vectorized, llap @@ -1075,30 +1107,48 @@ STAGE PLANS: Map Vectorization: enabled: true enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true - groupByVectorOutput: false + groupByVectorOutput: true inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat allNative: false usesVectorUDFAdaptor: false vectorized: true Reducer 2 - Execution mode: llap + Execution mode: vectorized, llap Reduce Vectorization: enabled: true enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true - notVectorizedReason: Aggregation Function UDF avg parameter expression for GROUPBY operator: Data type struct of Column[VALUE._col0] not supported - vectorized: false + groupByVectorOutput: true + allNative: false + usesVectorUDFAdaptor: true + vectorized: true Reduce Operator Tree: Group By Operator aggregations: avg(VALUE._col0), variance(VALUE._col1), var_pop(VALUE._col2), var_samp(VALUE._col3), std(VALUE._col4), stddev(VALUE._col5), stddev_pop(VALUE._col6), stddev_samp(VALUE._col7) + Group By 
Vectorization: + aggregators: VectorUDAFAvgFinal(col 0) -> double, VectorUDAFVarPopFinal(col 1) -> double, VectorUDAFVarPopFinal(col 2) -> double, VectorUDAFVarSampFinal(col 3) -> double, VectorUDAFStdPopFinal(col 4) -> double, VectorUDAFStdPopFinal(col 5) -> double, VectorUDAFStdPopFinal(col 6) -> double, VectorUDAFStdSampFinal(col 7) -> double + className: VectorGroupByOperator + groupByMode: MERGEPARTIAL + vectorOutput: true + native: false + vectorProcessingMode: GLOBAL + projectedOutputColumns: [0, 1, 2, 3, 4, 5, 6, 7] mode: mergepartial outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7 Statistics: Num rows: 1 Data size: 672 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: round(_col0, 0) (type: double), _col1 BETWEEN 8.97077295279421E19 AND 8.97077295279422E19 (type: boolean), _col2 BETWEEN 8.97077295279421E19 AND 8.97077295279422E19 (type: boolean), _col3 BETWEEN 9.20684592523616E19 AND 9.20684592523617E19 (type: boolean), round(_col4, 3) (type: double), round(_col5, 3) (type: double), round(_col6, 3) (type: double), round(_col7, 3) (type: double) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7 + Select Vectorization: + className: VectorSelectOperator + native: true + projectedOutputColumns: [8, 9, 10, 11, 12, 13, 14, 15] + selectExpressions: RoundWithNumDigitsDoubleToDouble(col 0, decimalPlaces 0) -> 8:double, VectorUDFAdaptor(_col1 BETWEEN 8.97077295279421E19 AND 8.97077295279422E19) -> 9:boolean, VectorUDFAdaptor(_col2 BETWEEN 8.97077295279421E19 AND 8.97077295279422E19) -> 10:boolean, VectorUDFAdaptor(_col3 BETWEEN 9.20684592523616E19 AND 9.20684592523617E19) -> 11:boolean, RoundWithNumDigitsDoubleToDouble(col 4, decimalPlaces 3) -> 12:double, RoundWithNumDigitsDoubleToDouble(col 5, decimalPlaces 3) -> 13:double, RoundWithNumDigitsDoubleToDouble(col 6, decimalPlaces 3) -> 14:double, RoundWithNumDigitsDoubleToDouble(col 7, decimalPlaces 3) -> 15:double Statistics: Num rows: 1 Data size: 672 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false + File Sink Vectorization: + className: VectorFileSinkOperator + native: false Statistics: Num rows: 1 Data size: 672 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat http://git-wip-us.apache.org/repos/asf/hive/blob/92fbe256/ql/src/test/results/clientpositive/spark/vector_between_in.q.out ---------------------------------------------------------------------- diff --git a/ql/src/test/results/clientpositive/spark/vector_between_in.q.out b/ql/src/test/results/clientpositive/spark/vector_between_in.q.out index 9329ba7..2f87841 100644 --- a/ql/src/test/results/clientpositive/spark/vector_between_in.q.out +++ b/ql/src/test/results/clientpositive/spark/vector_between_in.q.out @@ -151,8 +151,10 @@ STAGE PLANS: Group By Vectorization: aggregators: VectorUDAFCountStar(*) -> bigint className: VectorGroupByOperator + groupByMode: HASH vectorOutput: true native: false + vectorProcessingMode: HASH projectedOutputColumns: [0] mode: hash outputColumnNames: _col0 @@ -189,8 +191,10 @@ STAGE PLANS: Group By Vectorization: aggregators: VectorUDAFCountMerge(col 0) -> bigint className: VectorGroupByOperator + groupByMode: MERGEPARTIAL vectorOutput: true native: false + vectorProcessingMode: GLOBAL projectedOutputColumns: [0] mode: mergepartial outputColumnNames: _col0 @@ -351,8 +355,10 @@ STAGE PLANS: Group By Vectorization: aggregators: VectorUDAFCountStar(*) -> bigint className: VectorGroupByOperator + 
groupByMode: HASH vectorOutput: true native: false + vectorProcessingMode: HASH projectedOutputColumns: [0] mode: hash outputColumnNames: _col0 @@ -389,8 +395,10 @@ STAGE PLANS: Group By Vectorization: aggregators: VectorUDAFCountMerge(col 0) -> bigint className: VectorGroupByOperator + groupByMode: MERGEPARTIAL vectorOutput: true native: false + vectorProcessingMode: GLOBAL projectedOutputColumns: [0] mode: mergepartial outputColumnNames: _col0 @@ -739,8 +747,10 @@ STAGE PLANS: Group By Vectorization: aggregators: VectorUDAFCountStar(*) -> bigint className: VectorGroupByOperator + groupByMode: HASH vectorOutput: true native: false + vectorProcessingMode: HASH projectedOutputColumns: [0] mode: hash outputColumnNames: _col0 @@ -777,8 +787,10 @@ STAGE PLANS: Group By Vectorization: aggregators: VectorUDAFCountMerge(col 0) -> bigint className: VectorGroupByOperator + groupByMode: MERGEPARTIAL vectorOutput: true native: false + vectorProcessingMode: GLOBAL projectedOutputColumns: [0] mode: mergepartial outputColumnNames: _col0 @@ -1087,9 +1099,11 @@ STAGE PLANS: Group By Vectorization: aggregators: VectorUDAFCount(ConstantVectorExpression(val 1) -> 5:long) -> bigint className: VectorGroupByOperator + groupByMode: HASH vectorOutput: true keyExpressions: col 4 native: false + vectorProcessingMode: HASH projectedOutputColumns: [0] keys: _col0 (type: boolean) mode: hash @@ -1129,9 +1143,11 @@ STAGE PLANS: Group By Vectorization: aggregators: VectorUDAFCountMerge(col 1) -> bigint className: VectorGroupByOperator + groupByMode: MERGEPARTIAL vectorOutput: true keyExpressions: col 0 native: false + vectorProcessingMode: MERGE_PARTIAL projectedOutputColumns: [0] keys: KEY._col0 (type: boolean) mode: mergepartial @@ -1223,9 +1239,11 @@ STAGE PLANS: Group By Vectorization: aggregators: VectorUDAFCount(ConstantVectorExpression(val 1) -> 5:long) -> bigint className: VectorGroupByOperator + groupByMode: HASH vectorOutput: true keyExpressions: col 4 native: false + vectorProcessingMode: HASH projectedOutputColumns: [0] keys: _col0 (type: boolean) mode: hash @@ -1265,9 +1283,11 @@ STAGE PLANS: Group By Vectorization: aggregators: VectorUDAFCountMerge(col 1) -> bigint className: VectorGroupByOperator + groupByMode: MERGEPARTIAL vectorOutput: true keyExpressions: col 0 native: false + vectorProcessingMode: MERGE_PARTIAL projectedOutputColumns: [0] keys: KEY._col0 (type: boolean) mode: mergepartial @@ -1359,9 +1379,11 @@ STAGE PLANS: Group By Vectorization: aggregators: VectorUDAFCount(ConstantVectorExpression(val 1) -> 5:long) -> bigint className: VectorGroupByOperator + groupByMode: HASH vectorOutput: true keyExpressions: col 4 native: false + vectorProcessingMode: HASH projectedOutputColumns: [0] keys: _col0 (type: boolean) mode: hash @@ -1401,9 +1423,11 @@ STAGE PLANS: Group By Vectorization: aggregators: VectorUDAFCountMerge(col 1) -> bigint className: VectorGroupByOperator + groupByMode: MERGEPARTIAL vectorOutput: true keyExpressions: col 0 native: false + vectorProcessingMode: MERGE_PARTIAL projectedOutputColumns: [0] keys: KEY._col0 (type: boolean) mode: mergepartial @@ -1495,9 +1519,11 @@ STAGE PLANS: Group By Vectorization: aggregators: VectorUDAFCount(ConstantVectorExpression(val 1) -> 5:long) -> bigint className: VectorGroupByOperator + groupByMode: HASH vectorOutput: true keyExpressions: col 4 native: false + vectorProcessingMode: HASH projectedOutputColumns: [0] keys: _col0 (type: boolean) mode: hash @@ -1537,9 +1563,11 @@ STAGE PLANS: Group By Vectorization: aggregators: VectorUDAFCountMerge(col 
1) -> bigint className: VectorGroupByOperator + groupByMode: MERGEPARTIAL vectorOutput: true keyExpressions: col 0 native: false + vectorProcessingMode: MERGE_PARTIAL projectedOutputColumns: [0] keys: KEY._col0 (type: boolean) mode: mergepartial http://git-wip-us.apache.org/repos/asf/hive/blob/92fbe256/ql/src/test/results/clientpositive/spark/vector_cast_constant.q.out ---------------------------------------------------------------------- diff --git a/ql/src/test/results/clientpositive/spark/vector_cast_constant.q.out b/ql/src/test/results/clientpositive/spark/vector_cast_constant.q.out index 0aa347b..c69bc81 100644 --- a/ql/src/test/results/clientpositive/spark/vector_cast_constant.q.out +++ b/ql/src/test/results/clientpositive/spark/vector_cast_constant.q.out @@ -144,13 +144,14 @@ STAGE PLANS: Group By Operator aggregations: avg(50), avg(50.0), avg(50) Group By Vectorization: - aggregators: VectorUDAFAvgLong(ConstantVectorExpression(val 50) -> 11:long) -> struct, VectorUDAFAvgDouble(ConstantVectorExpression(val 50.0) -> 12:double) -> struct, VectorUDAFAvgDecimal(ConstantVectorExpression(val 50) -> 13:decimal(10,0)) -> struct + aggregators: VectorUDAFAvgLong(ConstantVectorExpression(val 50) -> 11:long) -> struct, VectorUDAFAvgDouble(ConstantVectorExpression(val 50.0) -> 12:double) -> struct, VectorUDAFAvgDecimal(ConstantVectorExpression(val 50) -> 13:decimal(10,0)) -> struct className: VectorGroupByOperator - vectorOutput: false + groupByMode: HASH + vectorOutput: true keyExpressions: col 2 native: false + vectorProcessingMode: HASH projectedOutputColumns: [0, 1, 2] - vectorOutputConditionsNotMet: Vector output of VectorUDAFAvgLong(ConstantVectorExpression(val 50) -> 11:long) -> struct output type STRUCT requires PRIMITIVE IS false, Vector output of VectorUDAFAvgDouble(ConstantVectorExpression(val 50.0) -> 12:double) -> struct output type STRUCT requires PRIMITIVE IS false, Vector output of VectorUDAFAvgDecimal(ConstantVectorExpression(val 50) -> 13:decimal(10,0)) -> struct output type STRUCT requires PRIMITIVE IS false keys: _col0 (type: int) mode: hash outputColumnNames: _col0, _col1, _col2, _col3 @@ -159,6 +160,10 @@ STAGE PLANS: key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) + Reduce Sink Vectorization: + className: VectorReduceSinkObjectHashOperator + native: true + nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true Statistics: Num rows: 1049 Data size: 311170 Basic stats: COMPLETE Column stats: NONE TopN Hash Memory Usage: 0.1 value expressions: _col1 (type: struct), _col2 (type: struct), _col3 (type: struct) @@ -166,20 +171,32 @@ STAGE PLANS: Map Vectorization: enabled: true enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true - groupByVectorOutput: false + groupByVectorOutput: true inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat allNative: false usesVectorUDFAdaptor: false vectorized: true Reducer 2 + Execution mode: vectorized Reduce Vectorization: enabled: true enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true - notVectorizedReason: Aggregation Function UDF avg parameter expression for GROUPBY operator: Data type struct of Column[VALUE._col0] not supported - vectorized: false + groupByVectorOutput: true + allNative: 
false + usesVectorUDFAdaptor: false + vectorized: true Reduce Operator Tree: Group By Operator aggregations: avg(VALUE._col0), avg(VALUE._col1), avg(VALUE._col2) + Group By Vectorization: + aggregators: VectorUDAFAvgFinal(col 1) -> double, VectorUDAFAvgFinal(col 2) -> double, VectorUDAFAvgDecimalFinal(col 3) -> decimal(16,4) + className: VectorGroupByOperator + groupByMode: MERGEPARTIAL + vectorOutput: true + keyExpressions: col 0 + native: false + vectorProcessingMode: MERGE_PARTIAL + projectedOutputColumns: [0, 1, 2] keys: KEY._col0 (type: int) mode: mergepartial outputColumnNames: _col0, _col1, _col2, _col3 @@ -187,6 +204,10 @@ STAGE PLANS: Reduce Output Operator key expressions: _col0 (type: int) sort order: + + Reduce Sink Vectorization: + className: VectorReduceSinkObjectHashOperator + native: true + nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true Statistics: Num rows: 524 Data size: 155436 Basic stats: COMPLETE Column stats: NONE TopN Hash Memory Usage: 0.1 value expressions: _col1 (type: double), _col2 (type: double), _col3 (type: decimal(14,4)) http://git-wip-us.apache.org/repos/asf/hive/blob/92fbe256/ql/src/test/results/clientpositive/spark/vector_count_distinct.q.out ---------------------------------------------------------------------- diff --git a/ql/src/test/results/clientpositive/spark/vector_count_distinct.q.out b/ql/src/test/results/clientpositive/spark/vector_count_distinct.q.out index b663831..9af0786 100644 --- a/ql/src/test/results/clientpositive/spark/vector_count_distinct.q.out +++ b/ql/src/test/results/clientpositive/spark/vector_count_distinct.q.out @@ -1266,9 +1266,11 @@ STAGE PLANS: Group By Operator Group By Vectorization: className: VectorGroupByOperator + groupByMode: HASH vectorOutput: true keyExpressions: col 16 native: false + vectorProcessingMode: HASH projectedOutputColumns: [] keys: ws_order_number (type: int) mode: hash @@ -1305,9 +1307,11 @@ STAGE PLANS: Group By Operator Group By Vectorization: className: VectorGroupByOperator + groupByMode: MERGEPARTIAL vectorOutput: true keyExpressions: col 0 native: false + vectorProcessingMode: MERGE_PARTIAL projectedOutputColumns: [] keys: KEY._col0 (type: int) mode: mergepartial @@ -1318,8 +1322,10 @@ STAGE PLANS: Group By Vectorization: aggregators: VectorUDAFCount(col 0) -> bigint className: VectorGroupByOperator + groupByMode: HASH vectorOutput: true native: false + vectorProcessingMode: HASH projectedOutputColumns: [0] mode: hash outputColumnNames: _col0 @@ -1347,8 +1353,10 @@ STAGE PLANS: Group By Vectorization: aggregators: VectorUDAFCountMerge(col 0) -> bigint className: VectorGroupByOperator + groupByMode: MERGEPARTIAL vectorOutput: true native: false + vectorProcessingMode: GLOBAL projectedOutputColumns: [0] mode: mergepartial outputColumnNames: _col0 http://git-wip-us.apache.org/repos/asf/hive/blob/92fbe256/ql/src/test/results/clientpositive/spark/vector_decimal_aggregate.q.out ---------------------------------------------------------------------- diff --git a/ql/src/test/results/clientpositive/spark/vector_decimal_aggregate.q.out b/ql/src/test/results/clientpositive/spark/vector_decimal_aggregate.q.out index edda919..9994f2b 100644 --- a/ql/src/test/results/clientpositive/spark/vector_decimal_aggregate.q.out +++ b/ql/src/test/results/clientpositive/spark/vector_decimal_aggregate.q.out @@ -70,9 
+70,11 @@ STAGE PLANS: Group By Vectorization: aggregators: VectorUDAFCount(col 1) -> bigint, VectorUDAFMaxDecimal(col 1) -> decimal(20,10), VectorUDAFMinDecimal(col 1) -> decimal(20,10), VectorUDAFSumDecimal(col 1) -> decimal(38,18), VectorUDAFCount(col 2) -> bigint, VectorUDAFMaxDecimal(col 2) -> decimal(23,14), VectorUDAFMinDecimal(col 2) -> decimal(23,14), VectorUDAFSumDecimal(col 2) -> decimal(38,18), VectorUDAFCountStar(*) -> bigint className: VectorGroupByOperator + groupByMode: HASH vectorOutput: true keyExpressions: col 3 native: false + vectorProcessingMode: HASH projectedOutputColumns: [0, 1, 2, 3, 4, 5, 6, 7, 8] keys: cint (type: int) mode: hash @@ -112,9 +114,11 @@ STAGE PLANS: Group By Vectorization: aggregators: VectorUDAFCountMerge(col 1) -> bigint, VectorUDAFMaxDecimal(col 2) -> decimal(20,10), VectorUDAFMinDecimal(col 3) -> decimal(20,10), VectorUDAFSumDecimal(col 4) -> decimal(38,18), VectorUDAFCountMerge(col 5) -> bigint, VectorUDAFMaxDecimal(col 6) -> decimal(23,14), VectorUDAFMinDecimal(col 7) -> decimal(23,14), VectorUDAFSumDecimal(col 8) -> decimal(38,18), VectorUDAFCountMerge(col 9) -> bigint className: VectorGroupByOperator + groupByMode: MERGEPARTIAL vectorOutput: true keyExpressions: col 0 native: false + vectorProcessingMode: MERGE_PARTIAL projectedOutputColumns: [0, 1, 2, 3, 4, 5, 6, 7, 8] keys: KEY._col0 (type: int) mode: mergepartial @@ -226,13 +230,14 @@ STAGE PLANS: Group By Operator aggregations: count(cdecimal1), max(cdecimal1), min(cdecimal1), sum(cdecimal1), avg(cdecimal1), stddev_pop(cdecimal1), stddev_samp(cdecimal1), count(cdecimal2), max(cdecimal2), min(cdecimal2), sum(cdecimal2), avg(cdecimal2), stddev_pop(cdecimal2), stddev_samp(cdecimal2), count() Group By Vectorization: - aggregators: VectorUDAFCount(col 1) -> bigint, VectorUDAFMaxDecimal(col 1) -> decimal(20,10), VectorUDAFMinDecimal(col 1) -> decimal(20,10), VectorUDAFSumDecimal(col 1) -> decimal(38,18), VectorUDAFAvgDecimal(col 1) -> struct, VectorUDAFStdPopDecimal(col 1) -> struct, VectorUDAFStdSampDecimal(col 1) -> struct, VectorUDAFCount(col 2) -> bigint, VectorUDAFMaxDecimal(col 2) -> decimal(23,14), VectorUDAFMinDecimal(col 2) -> decimal(23,14), VectorUDAFSumDecimal(col 2) -> decimal(38,18), VectorUDAFAvgDecimal(col 2) -> struct, VectorUDAFStdPopDecimal(col 2) -> struct, VectorUDAFStdSampDecimal(col 2) -> struct, VectorUDAFCountStar(*) -> bigint + aggregators: VectorUDAFCount(col 1) -> bigint, VectorUDAFMaxDecimal(col 1) -> decimal(20,10), VectorUDAFMinDecimal(col 1) -> decimal(20,10), VectorUDAFSumDecimal(col 1) -> decimal(38,18), VectorUDAFAvgDecimal(col 1) -> struct, VectorUDAFStdPopDecimal(col 1) -> struct, VectorUDAFStdSampDecimal(col 1) -> struct, VectorUDAFCount(col 2) -> bigint, VectorUDAFMaxDecimal(col 2) -> decimal(23,14), VectorUDAFMinDecimal(col 2) -> decimal(23,14), VectorUDAFSumDecimal(col 2) -> decimal(38,18), VectorUDAFAvgDecimal(col 2) -> struct, VectorUDAFStdPopDecimal(col 2) -> struct, VectorUDAFStdSampDecimal(col 2) -> struct, VectorUDAFCountStar(*) -> bigint className: VectorGroupByOperator - vectorOutput: false + groupByMode: HASH + vectorOutput: true keyExpressions: col 3 native: false + vectorProcessingMode: HASH projectedOutputColumns: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14] - vectorOutputConditionsNotMet: Vector output of VectorUDAFAvgDecimal(col 1) -> struct output type STRUCT requires PRIMITIVE IS false, Vector output of VectorUDAFStdPopDecimal(col 1) -> struct output type STRUCT requires PRIMITIVE IS false, Vector output of 
VectorUDAFStdSampDecimal(col 1) -> struct output type STRUCT requires PRIMITIVE IS false, Vector output of VectorUDAFAvgDecimal(col 2) -> struct output type STRUCT requires PRIMITIVE IS false, Vector output of VectorUDAFStdPopDecimal(col 2) -> struct output type STRUCT requires PRIMITIVE IS false, Vector output of VectorUDAFStdSampDecimal(col 2) -> struct output type STRUCT requires PRIMITIVE IS false keys: cint (type: int) mode: hash outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15 @@ -241,39 +246,66 @@ STAGE PLANS: key expressions: _col0 (type: int) sort order: + Map-reduce partition columns: _col0 (type: int) + Reduce Sink Vectorization: + className: VectorReduceSinkObjectHashOperator + native: true + nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true, No PTF TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true Statistics: Num rows: 12288 Data size: 2165060 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: bigint), _col2 (type: decimal(20,10)), _col3 (type: decimal(20,10)), _col4 (type: decimal(30,10)), _col5 (type: struct), _col6 (type: struct), _col7 (type: struct), _col8 (type: bigint), _col9 (type: decimal(23,14)), _col10 (type: decimal(23,14)), _col11 (type: decimal(33,14)), _col12 (type: struct), _col13 (type: struct), _col14 (type: struct), _col15 (type: bigint) Execution mode: vectorized Map Vectorization: enabled: true enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true - groupByVectorOutput: false + groupByVectorOutput: true inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat allNative: false usesVectorUDFAdaptor: false vectorized: true Reducer 2 + Execution mode: vectorized Reduce Vectorization: enabled: true enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine spark IN [tez, spark] IS true - notVectorizedReason: Aggregation Function UDF avg parameter expression for GROUPBY operator: Data type struct of Column[VALUE._col4] not supported - vectorized: false + groupByVectorOutput: true + allNative: false + usesVectorUDFAdaptor: false + vectorized: true Reduce Operator Tree: Group By Operator aggregations: count(VALUE._col0), max(VALUE._col1), min(VALUE._col2), sum(VALUE._col3), avg(VALUE._col4), stddev_pop(VALUE._col5), stddev_samp(VALUE._col6), count(VALUE._col7), max(VALUE._col8), min(VALUE._col9), sum(VALUE._col10), avg(VALUE._col11), stddev_pop(VALUE._col12), stddev_samp(VALUE._col13), count(VALUE._col14) + Group By Vectorization: + aggregators: VectorUDAFCountMerge(col 1) -> bigint, VectorUDAFMaxDecimal(col 2) -> decimal(20,10), VectorUDAFMinDecimal(col 3) -> decimal(20,10), VectorUDAFSumDecimal(col 4) -> decimal(38,18), VectorUDAFAvgDecimalFinal(col 5) -> decimal(34,14), VectorUDAFStdPopFinal(col 6) -> double, VectorUDAFStdSampFinal(col 7) -> double, VectorUDAFCountMerge(col 8) -> bigint, VectorUDAFMaxDecimal(col 9) -> decimal(23,14), VectorUDAFMinDecimal(col 10) -> decimal(23,14), VectorUDAFSumDecimal(col 11) -> decimal(38,18), VectorUDAFAvgDecimalFinal(col 12) -> decimal(37,18), VectorUDAFStdPopFinal(col 13) -> double, VectorUDAFStdSampFinal(col 14) -> double, VectorUDAFCountMerge(col 15) -> bigint + className: VectorGroupByOperator + groupByMode: MERGEPARTIAL + vectorOutput: true + keyExpressions: col 0 + native: false + vectorProcessingMode: 
MERGE_PARTIAL + projectedOutputColumns: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14] keys: KEY._col0 (type: int) mode: mergepartial outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15 Statistics: Num rows: 6144 Data size: 1082530 Basic stats: COMPLETE Column stats: NONE Filter Operator + Filter Vectorization: + className: VectorFilterOperator + native: true + predicateExpression: FilterLongColGreaterLongScalar(col 15, val 1) -> boolean predicate: (_col15 > 1) (type: boolean) Statistics: Num rows: 2048 Data size: 360843 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: _col0 (type: int), _col1 (type: bigint), _col2 (type: decimal(20,10)), _col3 (type: decimal(20,10)), _col4 (type: decimal(30,10)), _col5 (type: decimal(24,14)), _col6 (type: double), _col7 (type: double), _col8 (type: bigint), _col9 (type: decimal(23,14)), _col10 (type: decimal(23,14)), _col11 (type: decimal(33,14)), _col12 (type: decimal(27,18)), _col13 (type: double), _col14 (type: double) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14 + Select Vectorization: + className: VectorSelectOperator + native: true + projectedOutputColumns: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14] Statistics: Num rows: 2048 Data size: 360843 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false + File Sink Vectorization: + className: VectorFileSinkOperator + native: false Statistics: Num rows: 2048 Data size: 360843 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat http://git-wip-us.apache.org/repos/asf/hive/blob/92fbe256/ql/src/test/results/clientpositive/spark/vector_distinct_2.q.out ---------------------------------------------------------------------- diff --git a/ql/src/test/results/clientpositive/spark/vector_distinct_2.q.out b/ql/src/test/results/clientpositive/spark/vector_distinct_2.q.out index 59dcf7c..aff53a6 100644 --- a/ql/src/test/results/clientpositive/spark/vector_distinct_2.q.out +++ b/ql/src/test/results/clientpositive/spark/vector_distinct_2.q.out @@ -141,9 +141,11 @@ STAGE PLANS: Group By Operator Group By Vectorization: className: VectorGroupByOperator + groupByMode: HASH vectorOutput: true keyExpressions: col 0, col 8 native: false + vectorProcessingMode: HASH projectedOutputColumns: [] keys: t (type: tinyint), s (type: string) mode: hash @@ -180,9 +182,11 @@ STAGE PLANS: Group By Operator Group By Vectorization: className: VectorGroupByOperator + groupByMode: MERGEPARTIAL vectorOutput: true keyExpressions: col 0, col 1 native: false + vectorProcessingMode: MERGE_PARTIAL projectedOutputColumns: [] keys: KEY._col0 (type: tinyint), KEY._col1 (type: string) mode: mergepartial http://git-wip-us.apache.org/repos/asf/hive/blob/92fbe256/ql/src/test/results/clientpositive/spark/vector_groupby_3.q.out ---------------------------------------------------------------------- diff --git a/ql/src/test/results/clientpositive/spark/vector_groupby_3.q.out b/ql/src/test/results/clientpositive/spark/vector_groupby_3.q.out index 94b3ef6..83f8604 100644 --- a/ql/src/test/results/clientpositive/spark/vector_groupby_3.q.out +++ b/ql/src/test/results/clientpositive/spark/vector_groupby_3.q.out @@ -143,9 +143,11 @@ STAGE PLANS: Group By Vectorization: aggregators: VectorUDAFMaxLong(col 3) -> bigint className: VectorGroupByOperator + groupByMode: HASH vectorOutput: true 
http://git-wip-us.apache.org/repos/asf/hive/blob/92fbe256/ql/src/test/results/clientpositive/spark/vector_groupby_3.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/spark/vector_groupby_3.q.out b/ql/src/test/results/clientpositive/spark/vector_groupby_3.q.out
index 94b3ef6..83f8604 100644
--- a/ql/src/test/results/clientpositive/spark/vector_groupby_3.q.out
+++ b/ql/src/test/results/clientpositive/spark/vector_groupby_3.q.out
@@ -143,9 +143,11 @@ STAGE PLANS:
                   Group By Vectorization:
                       aggregators: VectorUDAFMaxLong(col 3) -> bigint
                       className: VectorGroupByOperator
+                      groupByMode: HASH
                       vectorOutput: true
                       keyExpressions: col 0, col 8
                       native: false
+                      vectorProcessingMode: HASH
                       projectedOutputColumns: [0]
                   keys: t (type: tinyint), s (type: string)
                   mode: hash
@@ -185,9 +187,11 @@ STAGE PLANS:
                   Group By Vectorization:
                       aggregators: VectorUDAFMaxLong(col 2) -> bigint
                       className: VectorGroupByOperator
+                      groupByMode: MERGEPARTIAL
                       vectorOutput: true
                       keyExpressions: col 0, col 1
                       native: false
+                      vectorProcessingMode: MERGE_PARTIAL
                       projectedOutputColumns: [0]
                   keys: KEY._col0 (type: tinyint), KEY._col1 (type: string)
                   mode: mergepartial

http://git-wip-us.apache.org/repos/asf/hive/blob/92fbe256/ql/src/test/results/clientpositive/spark/vector_inner_join.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/spark/vector_inner_join.q.out b/ql/src/test/results/clientpositive/spark/vector_inner_join.q.out
index 3a9f97b..62383c4 100644
--- a/ql/src/test/results/clientpositive/spark/vector_inner_join.q.out
+++ b/ql/src/test/results/clientpositive/spark/vector_inner_join.q.out
@@ -238,9 +238,11 @@ STAGE PLANS:
                 Group By Operator
                   Group By Vectorization:
                       className: VectorGroupByOperator
+                      groupByMode: HASH
                       vectorOutput: true
                       keyExpressions: col 0
                       native: false
+                      vectorProcessingMode: HASH
                       projectedOutputColumns: []
                   keys: _col0 (type: int)
                   mode: hash

http://git-wip-us.apache.org/repos/asf/hive/blob/92fbe256/ql/src/test/results/clientpositive/spark/vector_mapjoin_reduce.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/spark/vector_mapjoin_reduce.q.out b/ql/src/test/results/clientpositive/spark/vector_mapjoin_reduce.q.out
index 2f2609f..433b9a2 100644
--- a/ql/src/test/results/clientpositive/spark/vector_mapjoin_reduce.q.out
+++ b/ql/src/test/results/clientpositive/spark/vector_mapjoin_reduce.q.out
@@ -91,9 +91,11 @@ STAGE PLANS:
                 Group By Operator
                   Group By Vectorization:
                       className: VectorGroupByOperator
+                      groupByMode: HASH
                       vectorOutput: true
                       keyExpressions: col 0
                       native: false
+                      vectorProcessingMode: HASH
                       projectedOutputColumns: []
                   keys: _col0 (type: int)
                   mode: hash
@@ -142,9 +144,11 @@ STAGE PLANS:
                 Group By Operator
                   Group By Vectorization:
                       className: VectorGroupByOperator
+                      groupByMode: HASH
                       vectorOutput: true
                       keyExpressions: col 1
                       native: false
+                      vectorProcessingMode: HASH
                       projectedOutputColumns: []
                   keys: l_partkey (type: int)
                   mode: hash
@@ -183,9 +187,11 @@ STAGE PLANS:
                 Group By Operator
                   Group By Vectorization:
                       className: VectorGroupByOperator
+                      groupByMode: MERGEPARTIAL
                       vectorOutput: true
                       keyExpressions: col 0
                       native: false
+                      vectorProcessingMode: MERGE_PARTIAL
                       projectedOutputColumns: []
                   keys: KEY._col0 (type: int)
                   mode: mergepartial
@@ -362,9 +368,11 @@ STAGE PLANS:
                 Group By Operator
                   Group By Vectorization:
                       className: VectorGroupByOperator
+                      groupByMode: HASH
                       vectorOutput: true
                       keyExpressions: col 0, col 3
                       native: false
+                      vectorProcessingMode: HASH
                       projectedOutputColumns: []
                   keys: _col0 (type: int), _col1 (type: int)
                   mode: hash
@@ -413,9 +421,11 @@ STAGE PLANS:
                 Group By Operator
                   Group By Vectorization:
                       className: VectorGroupByOperator
+                      groupByMode: HASH
                       vectorOutput: true
                       keyExpressions: col 1
                       native: false
+                      vectorProcessingMode: HASH
                       projectedOutputColumns: []
                   keys: l_partkey (type: int)
                   mode: hash
@@ -454,9 +464,11 @@ STAGE PLANS:
                 Group By Operator
                   Group By Vectorization:
                       className: VectorGroupByOperator
+                      groupByMode: MERGEPARTIAL
                       vectorOutput: true
                       keyExpressions: col 0
                       native: false
+                      vectorProcessingMode: MERGE_PARTIAL
                       projectedOutputColumns: []
                   keys: KEY._col0 (type: int)
                   mode: mergepartial
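In the vector_mapjoin_reduce.q.out hunks, the same GROUP BY on l_partkey appears once per stage: the map-side hash aggregation is annotated HASH and the reduce-side merge is annotated MERGE_PARTIAL. A minimal sketch of the subquery shape that drives those operators, assuming a TPC-H style lineitem table like the one the test appears to use:

SET hive.vectorized.execution.enabled=true;
-- map side: mode hash, vectorProcessingMode HASH
-- reduce side: mode mergepartial, vectorProcessingMode MERGE_PARTIAL
EXPLAIN VECTORIZATION DETAIL
SELECT l_partkey
FROM lineitem
GROUP BY l_partkey;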
http://git-wip-us.apache.org/repos/asf/hive/blob/92fbe256/ql/src/test/results/clientpositive/spark/vector_orderby_5.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/spark/vector_orderby_5.q.out b/ql/src/test/results/clientpositive/spark/vector_orderby_5.q.out
index fd3469c..dc394c8 100644
--- a/ql/src/test/results/clientpositive/spark/vector_orderby_5.q.out
+++ b/ql/src/test/results/clientpositive/spark/vector_orderby_5.q.out
@@ -144,9 +144,11 @@ STAGE PLANS:
                   Group By Vectorization:
                       aggregators: VectorUDAFMaxLong(col 3) -> bigint
                       className: VectorGroupByOperator
+                      groupByMode: HASH
                       vectorOutput: true
                       keyExpressions: col 7
                       native: false
+                      vectorProcessingMode: HASH
                       projectedOutputColumns: [0]
                   keys: bo (type: boolean)
                   mode: hash
@@ -186,9 +188,11 @@ STAGE PLANS:
                   Group By Vectorization:
                       aggregators: VectorUDAFMaxLong(col 1) -> bigint
                       className: VectorGroupByOperator
+                      groupByMode: MERGEPARTIAL
                       vectorOutput: true
                       keyExpressions: col 0
                       native: false
+                      vectorProcessingMode: MERGE_PARTIAL
                       projectedOutputColumns: [0]
                   keys: KEY._col0 (type: boolean)
                   mode: mergepartial

http://git-wip-us.apache.org/repos/asf/hive/blob/92fbe256/ql/src/test/results/clientpositive/spark/vector_outer_join1.q.out
----------------------------------------------------------------------
diff --git a/ql/src/test/results/clientpositive/spark/vector_outer_join1.q.out b/ql/src/test/results/clientpositive/spark/vector_outer_join1.q.out
index 03e3a47..5554788 100644
--- a/ql/src/test/results/clientpositive/spark/vector_outer_join1.q.out
+++ b/ql/src/test/results/clientpositive/spark/vector_outer_join1.q.out
@@ -817,8 +817,10 @@ STAGE PLANS:
                   Group By Vectorization:
                       aggregators: VectorUDAFCountStar(*) -> bigint, VectorUDAFSumLong(col 0) -> bigint
                       className: VectorGroupByOperator
+                      groupByMode: HASH
                       vectorOutput: true
                       native: false
+                      vectorProcessingMode: HASH
                       projectedOutputColumns: [0, 1]
                   mode: hash
                   outputColumnNames: _col0, _col1
@@ -870,8 +872,10 @@ STAGE PLANS:
                   Group By Vectorization:
                       aggregators: VectorUDAFCountMerge(col 0) -> bigint, VectorUDAFSumLong(col 1) -> bigint
                       className: VectorGroupByOperator
+                      groupByMode: MERGEPARTIAL
                       vectorOutput: true
                       native: false
+                      vectorProcessingMode: GLOBAL
                       projectedOutputColumns: [0, 1]
                   mode: mergepartial
                   outputColumnNames: _col0, _col1
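The final vector_outer_join1.q.out hunk shows the one asymmetry in this mapping: the reduce-side operator still reports mode: mergepartial, but with no GROUP BY keys it is vectorized as vectorProcessingMode: GLOBAL rather than MERGE_PARTIAL. A minimal sketch of a keyless aggregation with that plan shape (the table name t and column c are hypothetical):

SET hive.vectorized.execution.enabled=true;
-- no keys: the reduce-side mergepartial is annotated vectorProcessingMode GLOBAL
EXPLAIN VECTORIZATION DETAIL
SELECT COUNT(*), SUM(c)
FROM t;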