From: mmccline@apache.org
To: commits@hive.apache.org
Date: Fri, 14 Oct 2016 22:16:36 -0000
Subject: [25/51] [partial] hive git commit: Revert "Revert "HIVE-11394: Enhance EXPLAIN display for vectorization (Matt McCline, reviewed by Gopal Vijayaraghavan)""

http://git-wip-us.apache.org/repos/asf/hive/blob/16d28b34/ql/src/test/results/clientpositive/llap/vector_interval_arithmetic.q.out ---------------------------------------------------------------------- diff --git 
a/ql/src/test/results/clientpositive/llap/vector_interval_arithmetic.q.out b/ql/src/test/results/clientpositive/llap/vector_interval_arithmetic.q.out index 13a8b35..ab7a103 100644 --- a/ql/src/test/results/clientpositive/llap/vector_interval_arithmetic.q.out +++ b/ql/src/test/results/clientpositive/llap/vector_interval_arithmetic.q.out @@ -36,7 +36,7 @@ POSTHOOK: Lineage: interval_arithmetic_1.dateval EXPRESSION [(unique_timestamps) POSTHOOK: Lineage: interval_arithmetic_1.tsval SIMPLE [(unique_timestamps)unique_timestamps.FieldSchema(name:tsval, type:timestamp, comment:null), ] tsval tsval PREHOOK: query: -- interval year-month arithmetic -explain +explain vectorization expression select dateval, dateval - interval '2-2' year to month, @@ -49,7 +49,7 @@ from interval_arithmetic_1 order by dateval PREHOOK: type: QUERY POSTHOOK: query: -- interval year-month arithmetic -explain +explain vectorization expression select dateval, dateval - interval '2-2' year to month, @@ -62,6 +62,10 @@ from interval_arithmetic_1 order by dateval POSTHOOK: type: QUERY Explain +PLAN VECTORIZATION: + enabled: true + enabledConditionsMet: [hive.vectorized.execution.enabled IS true] + STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 @@ -79,26 +83,61 @@ STAGE PLANS: TableScan alias: interval_arithmetic_1 Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE + TableScan Vectorization: + native: true + projectedOutputColumns: [0, 1] Select Operator expressions: dateval (type: date), (dateval - 2-2) (type: date), (dateval - -2-2) (type: date), (dateval + 2-2) (type: date), (dateval + -2-2) (type: date), (-2-2 + dateval) (type: date), (2-2 + dateval) (type: date) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6 + Select Vectorization: + className: VectorSelectOperator + native: true + projectedOutputColumns: [0, 2, 3, 4, 5, 6, 7] + selectExpressions: DateColSubtractIntervalYearMonthScalar(col 0, val 2-2) -> 2:long, 
DateColSubtractIntervalYearMonthScalar(col 0, val -2-2) -> 3:long, DateColAddIntervalYearMonthScalar(col 0, val 2-2) -> 4:long, DateColAddIntervalYearMonthScalar(col 0, val -2-2) -> 5:long, IntervalYearMonthScalarAddDateColumn(val -2-2, col 0) -> 6:long, IntervalYearMonthScalarAddDateColumn(val 2-2, col 0) -> 7:long Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: date) sort order: + + Reduce Sink Vectorization: + className: VectorReduceSinkOperator + native: false + nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE IS true, No buckets IS true, No TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true + nativeConditionsNotMet: Uniform Hash IS false Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: date), _col2 (type: date), _col3 (type: date), _col4 (type: date), _col5 (type: date), _col6 (type: date) Execution mode: vectorized, llap LLAP IO: all inputs + Map Vectorization: + enabled: true + enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true + groupByVectorOutput: true + inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat + allNative: false + usesVectorUDFAdaptor: false + vectorized: true Reducer 2 Execution mode: vectorized, llap + Reduce Vectorization: + enabled: true + enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true + groupByVectorOutput: true + allNative: false + usesVectorUDFAdaptor: false + vectorized: true Reduce Operator Tree: Select Operator expressions: KEY.reducesinkkey0 (type: date), VALUE._col0 (type: date), VALUE._col1 (type: date), VALUE._col2 (type: date), VALUE._col3 (type: date), VALUE._col4 (type: date), VALUE._col5 (type: 
date) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6 + Select Vectorization: + className: VectorSelectOperator + native: true + projectedOutputColumns: [0, 1, 2, 3, 4, 5, 6] Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false + File Sink Vectorization: + className: VectorFileSinkOperator + native: false Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat @@ -188,7 +227,7 @@ dateval c1 c2 c3 c4 c5 c6 9075-06-13 9073-04-13 9077-08-13 9077-08-13 9073-04-13 9073-04-13 9077-08-13 9209-11-11 9207-09-11 9212-01-11 9212-01-11 9207-09-11 9207-09-11 9212-01-11 9403-01-09 9400-11-09 9405-03-09 9405-03-09 9400-11-09 9400-11-09 9405-03-09 -PREHOOK: query: explain +PREHOOK: query: explain vectorization expression select dateval, dateval - date '1999-06-07', @@ -197,7 +236,7 @@ select from interval_arithmetic_1 order by dateval PREHOOK: type: QUERY -POSTHOOK: query: explain +POSTHOOK: query: explain vectorization expression select dateval, dateval - date '1999-06-07', @@ -207,6 +246,10 @@ from interval_arithmetic_1 order by dateval POSTHOOK: type: QUERY Explain +PLAN VECTORIZATION: + enabled: true + enabledConditionsMet: [hive.vectorized.execution.enabled IS true] + STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 @@ -224,26 +267,61 @@ STAGE PLANS: TableScan alias: interval_arithmetic_1 Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE + TableScan Vectorization: + native: true + projectedOutputColumns: [0, 1] Select Operator expressions: dateval (type: date), (dateval - 1999-06-07) (type: interval_day_time), (1999-06-07 - dateval) (type: interval_day_time), (dateval - dateval) (type: interval_day_time) outputColumnNames: _col0, _col1, _col2, _col3 + Select Vectorization: + className: VectorSelectOperator + native: true + 
projectedOutputColumns: [0, 2, 3, 4] + selectExpressions: DateColSubtractDateScalar(col 0, val 1999-06-07 00:00:00.0) -> 2:timestamp, DateScalarSubtractDateColumn(val 1999-06-07 00:00:00.0, col 0) -> 3:timestamp, DateColSubtractDateColumn(col 0, col 0) -> 4:timestamp Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: date) sort order: + + Reduce Sink Vectorization: + className: VectorReduceSinkOperator + native: false + nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE IS true, No buckets IS true, No TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true + nativeConditionsNotMet: Uniform Hash IS false Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: interval_day_time), _col2 (type: interval_day_time), _col3 (type: interval_day_time) Execution mode: vectorized, llap LLAP IO: all inputs + Map Vectorization: + enabled: true + enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true + groupByVectorOutput: true + inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat + allNative: false + usesVectorUDFAdaptor: false + vectorized: true Reducer 2 Execution mode: vectorized, llap + Reduce Vectorization: + enabled: true + enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true + groupByVectorOutput: true + allNative: false + usesVectorUDFAdaptor: false + vectorized: true Reduce Operator Tree: Select Operator expressions: KEY.reducesinkkey0 (type: date), VALUE._col0 (type: interval_day_time), VALUE._col1 (type: interval_day_time), VALUE._col2 (type: interval_day_time) outputColumnNames: _col0, _col1, _col2, _col3 + Select Vectorization: + className: VectorSelectOperator + 
native: true + projectedOutputColumns: [0, 1, 2, 3] Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false + File Sink Vectorization: + className: VectorFileSinkOperator + native: false Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat @@ -327,7 +405,7 @@ dateval c1 c2 c3 9075-06-13 2584462 00:00:00.000000000 -2584462 00:00:00.000000000 0 00:00:00.000000000 9209-11-11 2633556 01:00:00.000000000 -2633556 01:00:00.000000000 0 00:00:00.000000000 9403-01-09 2704106 01:00:00.000000000 -2704106 01:00:00.000000000 0 00:00:00.000000000 -PREHOOK: query: explain +PREHOOK: query: explain vectorization expression select tsval, tsval - interval '2-2' year to month, @@ -339,7 +417,7 @@ select from interval_arithmetic_1 order by tsval PREHOOK: type: QUERY -POSTHOOK: query: explain +POSTHOOK: query: explain vectorization expression select tsval, tsval - interval '2-2' year to month, @@ -352,6 +430,10 @@ from interval_arithmetic_1 order by tsval POSTHOOK: type: QUERY Explain +PLAN VECTORIZATION: + enabled: true + enabledConditionsMet: [hive.vectorized.execution.enabled IS true] + STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 @@ -369,26 +451,61 @@ STAGE PLANS: TableScan alias: interval_arithmetic_1 Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE + TableScan Vectorization: + native: true + projectedOutputColumns: [0, 1] Select Operator expressions: tsval (type: timestamp), (tsval - 2-2) (type: timestamp), (tsval - -2-2) (type: timestamp), (tsval + 2-2) (type: timestamp), (tsval + -2-2) (type: timestamp), (-2-2 + tsval) (type: timestamp), (2-2 + tsval) (type: timestamp) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6 + Select Vectorization: + className: VectorSelectOperator + native: true + projectedOutputColumns: [1, 2, 3, 4, 5, 6, 7] 
+ selectExpressions: TimestampColSubtractIntervalYearMonthScalar(col 1, val 2-2) -> 2:timestamp, TimestampColSubtractIntervalYearMonthScalar(col 1, val -2-2) -> 3:timestamp, TimestampColAddIntervalYearMonthScalar(col 1, val 2-2) -> 4:timestamp, TimestampColAddIntervalYearMonthScalar(col 1, val -2-2) -> 5:timestamp, IntervalYearMonthScalarAddTimestampColumn(val -2-2, col 1) -> 6:timestamp, IntervalYearMonthScalarAddTimestampColumn(val 2-2, col 1) -> 7:timestamp Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: timestamp) sort order: + + Reduce Sink Vectorization: + className: VectorReduceSinkOperator + native: false + nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE IS true, No buckets IS true, No TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true + nativeConditionsNotMet: Uniform Hash IS false Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: timestamp), _col2 (type: timestamp), _col3 (type: timestamp), _col4 (type: timestamp), _col5 (type: timestamp), _col6 (type: timestamp) Execution mode: vectorized, llap LLAP IO: all inputs + Map Vectorization: + enabled: true + enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true + groupByVectorOutput: true + inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat + allNative: false + usesVectorUDFAdaptor: false + vectorized: true Reducer 2 Execution mode: vectorized, llap + Reduce Vectorization: + enabled: true + enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true + groupByVectorOutput: true + allNative: false + usesVectorUDFAdaptor: false + vectorized: true Reduce Operator Tree: Select Operator expressions: 
KEY.reducesinkkey0 (type: timestamp), VALUE._col0 (type: timestamp), VALUE._col1 (type: timestamp), VALUE._col2 (type: timestamp), VALUE._col3 (type: timestamp), VALUE._col4 (type: timestamp), VALUE._col5 (type: timestamp) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6 + Select Vectorization: + className: VectorSelectOperator + native: true + projectedOutputColumns: [0, 1, 2, 3, 4, 5, 6] Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false + File Sink Vectorization: + className: VectorFileSinkOperator + native: false Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat @@ -478,7 +595,7 @@ tsval c1 c2 c3 c4 c5 c6 9075-06-13 16:20:09.218517797 9073-04-13 16:20:09.218517797 9077-08-13 16:20:09.218517797 9077-08-13 16:20:09.218517797 9073-04-13 16:20:09.218517797 9073-04-13 16:20:09.218517797 9077-08-13 16:20:09.218517797 9209-11-11 04:08:58.223768453 9207-09-11 05:08:58.223768453 9212-01-11 04:08:58.223768453 9212-01-11 04:08:58.223768453 9207-09-11 05:08:58.223768453 9207-09-11 05:08:58.223768453 9212-01-11 04:08:58.223768453 9403-01-09 18:12:33.547 9400-11-09 18:12:33.547 9405-03-09 18:12:33.547 9405-03-09 18:12:33.547 9400-11-09 18:12:33.547 9400-11-09 18:12:33.547 9405-03-09 18:12:33.547 -PREHOOK: query: explain +PREHOOK: query: explain vectorization expression select interval '2-2' year to month + interval '3-3' year to month, interval '2-2' year to month - interval '3-3' year to month @@ -486,7 +603,7 @@ from interval_arithmetic_1 order by interval '2-2' year to month + interval '3-3' year to month limit 2 PREHOOK: type: QUERY -POSTHOOK: query: explain +POSTHOOK: query: explain vectorization expression select interval '2-2' year to month + interval '3-3' year to month, interval '2-2' year to month - interval '3-3' year to month @@ -495,6 +612,10 @@ order by interval '2-2' year 
to month + interval '3-3' year to month limit 2 POSTHOOK: type: QUERY Explain +PLAN VECTORIZATION: + enabled: true + enabledConditionsMet: [hive.vectorized.execution.enabled IS true] + STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 @@ -512,26 +633,64 @@ STAGE PLANS: TableScan alias: interval_arithmetic_1 Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: COMPLETE + TableScan Vectorization: + native: true + projectedOutputColumns: [0, 1] Select Operator + Select Vectorization: + className: VectorSelectOperator + native: true + projectedOutputColumns: [] Statistics: Num rows: 50 Data size: 800 Basic stats: COMPLETE Column stats: COMPLETE Reduce Output Operator sort order: + Reduce Sink Vectorization: + className: VectorReduceSinkOperator + native: false + nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE IS true, No buckets IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true + nativeConditionsNotMet: No TopN IS false, Uniform Hash IS false Statistics: Num rows: 50 Data size: 800 Basic stats: COMPLETE Column stats: COMPLETE TopN Hash Memory Usage: 0.1 Execution mode: vectorized, llap LLAP IO: all inputs + Map Vectorization: + enabled: true + enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true + groupByVectorOutput: true + inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat + allNative: false + usesVectorUDFAdaptor: false + vectorized: true Reducer 2 Execution mode: vectorized, llap + Reduce Vectorization: + enabled: true + enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true + groupByVectorOutput: true + allNative: false + usesVectorUDFAdaptor: false + vectorized: true Reduce Operator Tree: Select Operator expressions: 5-5 (type: 
interval_year_month), -1-1 (type: interval_year_month) outputColumnNames: _col0, _col1 + Select Vectorization: + className: VectorSelectOperator + native: true + projectedOutputColumns: [0, 1] + selectExpressions: ConstantVectorExpression(val 65) -> 0:long, ConstantVectorExpression(val -13) -> 1:long Statistics: Num rows: 50 Data size: 800 Basic stats: COMPLETE Column stats: COMPLETE Limit Number of rows: 2 + Limit Vectorization: + className: VectorLimitOperator + native: true Statistics: Num rows: 2 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE File Output Operator compressed: false + File Sink Vectorization: + className: VectorFileSinkOperator + native: false Statistics: Num rows: 2 Data size: 32 Basic stats: COMPLETE Column stats: COMPLETE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat @@ -566,7 +725,7 @@ c0 c1 5-5 -1-1 5-5 -1-1 PREHOOK: query: -- interval day-time arithmetic -explain +explain vectorization expression select dateval, dateval - interval '99 11:22:33.123456789' day to second, @@ -579,7 +738,7 @@ from interval_arithmetic_1 order by dateval PREHOOK: type: QUERY POSTHOOK: query: -- interval day-time arithmetic -explain +explain vectorization expression select dateval, dateval - interval '99 11:22:33.123456789' day to second, @@ -592,6 +751,10 @@ from interval_arithmetic_1 order by dateval POSTHOOK: type: QUERY Explain +PLAN VECTORIZATION: + enabled: true + enabledConditionsMet: [hive.vectorized.execution.enabled IS true] + STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 @@ -609,26 +772,61 @@ STAGE PLANS: TableScan alias: interval_arithmetic_1 Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE + TableScan Vectorization: + native: true + projectedOutputColumns: [0, 1] Select Operator expressions: dateval (type: date), (dateval - 99 11:22:33.123456789) (type: timestamp), (dateval - -99 11:22:33.123456789) (type: timestamp), (dateval + 99 11:22:33.123456789) 
(type: timestamp), (dateval + -99 11:22:33.123456789) (type: timestamp), (-99 11:22:33.123456789 + dateval) (type: timestamp), (99 11:22:33.123456789 + dateval) (type: timestamp) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6 + Select Vectorization: + className: VectorSelectOperator + native: true + projectedOutputColumns: [0, 2, 3, 4, 5, 6, 7] + selectExpressions: DateColSubtractIntervalDayTimeScalar(col 0, val 99 11:22:33.123456789) -> 2:timestamp, DateColSubtractIntervalDayTimeScalar(col 0, val -99 11:22:33.123456789) -> 3:timestamp, DateColAddIntervalDayTimeScalar(col 0, val 99 11:22:33.123456789) -> 4:timestamp, DateColAddIntervalDayTimeScalar(col 0, val -99 11:22:33.123456789) -> 5:timestamp, IntervalDayTimeScalarAddDateColumn(val -99 11:22:33.123456789, col 0) -> 6:timestamp, IntervalDayTimeScalarAddDateColumn(val 99 11:22:33.123456789, col 0) -> 7:timestamp Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: date) sort order: + + Reduce Sink Vectorization: + className: VectorReduceSinkOperator + native: false + nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE IS true, No buckets IS true, No TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true + nativeConditionsNotMet: Uniform Hash IS false Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: timestamp), _col2 (type: timestamp), _col3 (type: timestamp), _col4 (type: timestamp), _col5 (type: timestamp), _col6 (type: timestamp) Execution mode: vectorized, llap LLAP IO: all inputs + Map Vectorization: + enabled: true + enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true + groupByVectorOutput: true + inputFileFormats: 
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat + allNative: false + usesVectorUDFAdaptor: false + vectorized: true Reducer 2 Execution mode: vectorized, llap + Reduce Vectorization: + enabled: true + enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true + groupByVectorOutput: true + allNative: false + usesVectorUDFAdaptor: false + vectorized: true Reduce Operator Tree: Select Operator expressions: KEY.reducesinkkey0 (type: date), VALUE._col0 (type: timestamp), VALUE._col1 (type: timestamp), VALUE._col2 (type: timestamp), VALUE._col3 (type: timestamp), VALUE._col4 (type: timestamp), VALUE._col5 (type: timestamp) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6 + Select Vectorization: + className: VectorSelectOperator + native: true + projectedOutputColumns: [0, 1, 2, 3, 4, 5, 6] Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false + File Sink Vectorization: + className: VectorFileSinkOperator + native: false Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat @@ -718,7 +916,7 @@ dateval _c1 _c2 _c3 _c4 _c5 _c6 9075-06-13 9075-03-05 11:37:26.876543211 9075-09-20 11:22:33.123456789 9075-09-20 11:22:33.123456789 9075-03-05 11:37:26.876543211 9075-03-05 11:37:26.876543211 9075-09-20 11:22:33.123456789 9209-11-11 9209-08-03 13:37:26.876543211 9210-02-18 11:22:33.123456789 9210-02-18 11:22:33.123456789 9209-08-03 13:37:26.876543211 9209-08-03 13:37:26.876543211 9210-02-18 11:22:33.123456789 9403-01-09 9402-10-01 13:37:26.876543211 9403-04-18 12:22:33.123456789 9403-04-18 12:22:33.123456789 9402-10-01 13:37:26.876543211 9402-10-01 13:37:26.876543211 9403-04-18 12:22:33.123456789 -PREHOOK: query: explain +PREHOOK: query: explain vectorization expression select dateval, tsval, @@ -728,7 +926,7 @@ select from 
interval_arithmetic_1 order by dateval PREHOOK: type: QUERY -POSTHOOK: query: explain +POSTHOOK: query: explain vectorization expression select dateval, tsval, @@ -739,6 +937,10 @@ from interval_arithmetic_1 order by dateval POSTHOOK: type: QUERY Explain +PLAN VECTORIZATION: + enabled: true + enabledConditionsMet: [hive.vectorized.execution.enabled IS true] + STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 @@ -756,26 +958,61 @@ STAGE PLANS: TableScan alias: interval_arithmetic_1 Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE + TableScan Vectorization: + native: true + projectedOutputColumns: [0, 1] Select Operator expressions: dateval (type: date), tsval (type: timestamp), (dateval - tsval) (type: interval_day_time), (tsval - dateval) (type: interval_day_time), (tsval - tsval) (type: interval_day_time) outputColumnNames: _col0, _col1, _col2, _col3, _col4 + Select Vectorization: + className: VectorSelectOperator + native: true + projectedOutputColumns: [0, 1, 2, 3, 4] + selectExpressions: DateColSubtractTimestampColumn(col 0, col 1) -> 2:interval_day_time, TimestampColSubtractDateColumn(col 1, col 0) -> 3:interval_day_time, TimestampColSubtractTimestampColumn(col 1, col 1) -> 4:interval_day_time Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: date) sort order: + + Reduce Sink Vectorization: + className: VectorReduceSinkOperator + native: false + nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE IS true, No buckets IS true, No TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true + nativeConditionsNotMet: Uniform Hash IS false Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: timestamp), 
_col2 (type: interval_day_time), _col3 (type: interval_day_time), _col4 (type: interval_day_time) Execution mode: vectorized, llap LLAP IO: all inputs + Map Vectorization: + enabled: true + enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true + groupByVectorOutput: true + inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat + allNative: false + usesVectorUDFAdaptor: false + vectorized: true Reducer 2 Execution mode: vectorized, llap + Reduce Vectorization: + enabled: true + enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true + groupByVectorOutput: true + allNative: false + usesVectorUDFAdaptor: false + vectorized: true Reduce Operator Tree: Select Operator expressions: KEY.reducesinkkey0 (type: date), VALUE._col0 (type: timestamp), VALUE._col1 (type: interval_day_time), VALUE._col2 (type: interval_day_time), VALUE._col3 (type: interval_day_time) outputColumnNames: _col0, _col1, _col2, _col3, _col4 + Select Vectorization: + className: VectorSelectOperator + native: true + projectedOutputColumns: [0, 1, 2, 3, 4] Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false + File Sink Vectorization: + className: VectorFileSinkOperator + native: false Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat @@ -861,7 +1098,7 @@ dateval tsval c2 c3 c4 9075-06-13 9075-06-13 16:20:09.218517797 -0 16:20:09.218517797 0 16:20:09.218517797 0 00:00:00.000000000 9209-11-11 9209-11-11 04:08:58.223768453 -0 04:08:58.223768453 0 04:08:58.223768453 0 00:00:00.000000000 9403-01-09 9403-01-09 18:12:33.547 -0 18:12:33.547000000 0 18:12:33.547000000 0 00:00:00.000000000 -PREHOOK: query: explain +PREHOOK: query: explain vectorization expression select tsval, tsval - interval '99 11:22:33.123456789' day to second, @@ -873,7 +1110,7 @@ 
select from interval_arithmetic_1 order by tsval PREHOOK: type: QUERY -POSTHOOK: query: explain +POSTHOOK: query: explain vectorization expression select tsval, tsval - interval '99 11:22:33.123456789' day to second, @@ -886,6 +1123,10 @@ from interval_arithmetic_1 order by tsval POSTHOOK: type: QUERY Explain +PLAN VECTORIZATION: + enabled: true + enabledConditionsMet: [hive.vectorized.execution.enabled IS true] + STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 @@ -903,26 +1144,61 @@ STAGE PLANS: TableScan alias: interval_arithmetic_1 Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE + TableScan Vectorization: + native: true + projectedOutputColumns: [0, 1] Select Operator expressions: tsval (type: timestamp), (tsval - 99 11:22:33.123456789) (type: timestamp), (tsval - -99 11:22:33.123456789) (type: timestamp), (tsval + 99 11:22:33.123456789) (type: timestamp), (tsval + -99 11:22:33.123456789) (type: timestamp), (-99 11:22:33.123456789 + tsval) (type: timestamp), (99 11:22:33.123456789 + tsval) (type: timestamp) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6 + Select Vectorization: + className: VectorSelectOperator + native: true + projectedOutputColumns: [1, 2, 3, 4, 5, 6, 7] + selectExpressions: TimestampColSubtractIntervalDayTimeScalar(col 1, val 99 11:22:33.123456789) -> 2:timestamp, TimestampColSubtractIntervalDayTimeScalar(col 1, val -99 11:22:33.123456789) -> 3:timestamp, TimestampColAddIntervalDayTimeScalar(col 1, val 99 11:22:33.123456789) -> 4:timestamp, TimestampColAddIntervalDayTimeScalar(col 1, val -99 11:22:33.123456789) -> 5:timestamp, IntervalDayTimeScalarAddTimestampColumn(val -99 11:22:33.123456789, col 1) -> 6:timestamp, IntervalDayTimeScalarAddTimestampColumn(val 99 11:22:33.123456789, col 1) -> 7:timestamp Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: timestamp) sort order: + 
+ Reduce Sink Vectorization: + className: VectorReduceSinkOperator + native: false + nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE IS true, No buckets IS true, No TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true + nativeConditionsNotMet: Uniform Hash IS false Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: timestamp), _col2 (type: timestamp), _col3 (type: timestamp), _col4 (type: timestamp), _col5 (type: timestamp), _col6 (type: timestamp) Execution mode: vectorized, llap LLAP IO: all inputs + Map Vectorization: + enabled: true + enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true + groupByVectorOutput: true + inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat + allNative: false + usesVectorUDFAdaptor: false + vectorized: true Reducer 2 Execution mode: vectorized, llap + Reduce Vectorization: + enabled: true + enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true + groupByVectorOutput: true + allNative: false + usesVectorUDFAdaptor: false + vectorized: true Reduce Operator Tree: Select Operator expressions: KEY.reducesinkkey0 (type: timestamp), VALUE._col0 (type: timestamp), VALUE._col1 (type: timestamp), VALUE._col2 (type: timestamp), VALUE._col3 (type: timestamp), VALUE._col4 (type: timestamp), VALUE._col5 (type: timestamp) outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6 + Select Vectorization: + className: VectorSelectOperator + native: true + projectedOutputColumns: [0, 1, 2, 3, 4, 5, 6] Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false + File Sink Vectorization: + className: VectorFileSinkOperator + native: false Statistics: Num 
rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat @@ -1012,14 +1288,14 @@ tsval _c1 _c2 _c3 _c4 _c5 _c6 9075-06-13 16:20:09.218517797 9075-03-06 03:57:36.095061008 9075-09-21 03:42:42.341974586 9075-09-21 03:42:42.341974586 9075-03-06 03:57:36.095061008 9075-03-06 03:57:36.095061008 9075-09-21 03:42:42.341974586 9209-11-11 04:08:58.223768453 9209-08-03 17:46:25.100311664 9210-02-18 15:31:31.347225242 9210-02-18 15:31:31.347225242 9209-08-03 17:46:25.100311664 9209-08-03 17:46:25.100311664 9210-02-18 15:31:31.347225242 9403-01-09 18:12:33.547 9402-10-02 07:50:00.423543211 9403-04-19 06:35:06.670456789 9403-04-19 06:35:06.670456789 9402-10-02 07:50:00.423543211 9402-10-02 07:50:00.423543211 9403-04-19 06:35:06.670456789 -PREHOOK: query: explain +PREHOOK: query: explain vectorization expression select interval '99 11:22:33.123456789' day to second + interval '10 9:8:7.123456789' day to second, interval '99 11:22:33.123456789' day to second - interval '10 9:8:7.123456789' day to second from interval_arithmetic_1 limit 2 PREHOOK: type: QUERY -POSTHOOK: query: explain +POSTHOOK: query: explain vectorization expression select interval '99 11:22:33.123456789' day to second + interval '10 9:8:7.123456789' day to second, interval '99 11:22:33.123456789' day to second - interval '10 9:8:7.123456789' day to second @@ -1027,6 +1303,10 @@ from interval_arithmetic_1 limit 2 POSTHOOK: type: QUERY Explain +PLAN VECTORIZATION: + enabled: true + enabledConditionsMet: [hive.vectorized.execution.enabled IS true] + STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 @@ -1041,15 +1321,29 @@ STAGE PLANS: TableScan alias: interval_arithmetic_1 Statistics: Num rows: 50 Data size: 4800 Basic stats: COMPLETE Column stats: COMPLETE + TableScan Vectorization: + native: true + projectedOutputColumns: [0, 1] Select Operator expressions: 109 20:30:40.246913578 (type: interval_day_time), 
89 02:14:26.000000000 (type: interval_day_time) outputColumnNames: _col0, _col1 + Select Vectorization: + className: VectorSelectOperator + native: true + projectedOutputColumns: [2, 3] + selectExpressions: ConstantVectorExpression(val 109 20:30:40.246913578) -> 2:interval_day_time, ConstantVectorExpression(val 89 02:14:26.000000000) -> 3:interval_day_time Statistics: Num rows: 50 Data size: 1200 Basic stats: COMPLETE Column stats: COMPLETE Limit Number of rows: 2 + Limit Vectorization: + className: VectorLimitOperator + native: true Statistics: Num rows: 2 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE File Output Operator compressed: false + File Sink Vectorization: + className: VectorFileSinkOperator + native: false Statistics: Num rows: 2 Data size: 48 Basic stats: COMPLETE Column stats: COMPLETE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat @@ -1057,6 +1351,14 @@ STAGE PLANS: serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Execution mode: vectorized, llap LLAP IO: all inputs + Map Vectorization: + enabled: true + enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true + groupByVectorOutput: true + inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat + allNative: false + usesVectorUDFAdaptor: false + vectorized: true Stage: Stage-0 Fetch Operator http://git-wip-us.apache.org/repos/asf/hive/blob/16d28b34/ql/src/test/results/clientpositive/llap/vector_interval_mapjoin.q.out ---------------------------------------------------------------------- diff --git a/ql/src/test/results/clientpositive/llap/vector_interval_mapjoin.q.out b/ql/src/test/results/clientpositive/llap/vector_interval_mapjoin.q.out index 0bc0e4c..002d011 100644 --- a/ql/src/test/results/clientpositive/llap/vector_interval_mapjoin.q.out +++ b/ql/src/test/results/clientpositive/llap/vector_interval_mapjoin.q.out @@ -136,7 +136,7 @@ POSTHOOK: Lineage: vectortab_b_1korc.si SIMPLE [(vectortab_b_1k)vectortab_b_1k.F POSTHOOK: 
Lineage: vectortab_b_1korc.t SIMPLE [(vectortab_b_1k)vectortab_b_1k.FieldSchema(name:t, type:tinyint, comment:null), ] POSTHOOK: Lineage: vectortab_b_1korc.ts SIMPLE [(vectortab_b_1k)vectortab_b_1k.FieldSchema(name:ts, type:timestamp, comment:null), ] POSTHOOK: Lineage: vectortab_b_1korc.ts2 SIMPLE [(vectortab_b_1k)vectortab_b_1k.FieldSchema(name:ts2, type:timestamp, comment:null), ] -PREHOOK: query: explain +PREHOOK: query: explain vectorization expression select v1.s, v2.s, @@ -158,7 +158,7 @@ join on v1.intrvl1 = v2.intrvl2 and v1.s = v2.s PREHOOK: type: QUERY -POSTHOOK: query: explain +POSTHOOK: query: explain vectorization expression select v1.s, v2.s, @@ -180,6 +180,10 @@ join on v1.intrvl1 = v2.intrvl2 and v1.s = v2.s POSTHOOK: type: QUERY +PLAN VECTORIZATION: + enabled: true + enabledConditionsMet: [hive.vectorized.execution.enabled IS true] + STAGE DEPENDENCIES: Stage-1 is a root stage Stage-0 depends on stages: Stage-1 @@ -197,12 +201,24 @@ STAGE PLANS: TableScan alias: vectortab_a_1korc Statistics: Num rows: 1000 Data size: 460264 Basic stats: COMPLETE Column stats: NONE + TableScan Vectorization: + native: true + projectedOutputColumns: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12] Filter Operator + Filter Vectorization: + className: VectorFilterOperator + native: true + predicateExpression: FilterExprAndExpr(children: SelectColumnIsNotNull(col 8) -> boolean, SelectColumnIsNotNull(col 14)(children: DateColSubtractDateColumn(col 12, col 13)(children: CastTimestampToDate(col 10) -> 13:date) -> 14:timestamp) -> boolean) -> boolean predicate: (s is not null and (dt - CAST( ts AS DATE)) is not null) (type: boolean) Statistics: Num rows: 1000 Data size: 460264 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: s (type: string), (dt - CAST( ts AS DATE)) (type: interval_day_time) outputColumnNames: _col0, _col1 + Select Vectorization: + className: VectorSelectOperator + native: true + projectedOutputColumns: [8, 14] + selectExpressions: 
DateColSubtractDateColumn(col 12, col 13)(children: CastTimestampToDate(col 10) -> 13:date) -> 14:timestamp Statistics: Num rows: 1000 Data size: 460264 Basic stats: COMPLETE Column stats: NONE Map Join Operator condition map: @@ -210,6 +226,10 @@ STAGE PLANS: keys: 0 _col0 (type: string), _col1 (type: interval_day_time) 1 _col0 (type: string), _col1 (type: interval_day_time) + Map Join Vectorization: + className: VectorMapJoinInnerBigOnlyMultiKeyOperator + native: true + nativeConditionsMet: hive.vectorized.execution.mapjoin.native.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, One MapJoin Condition IS true, No nullsafe IS true, Supports Key Types IS true, Not empty key IS true, When Fast Hash Table, then requires no Hybrid Hash Join IS true, Small table vectorizes IS true outputColumnNames: _col0, _col1, _col2 input vertices: 1 Map 2 @@ -217,9 +237,16 @@ STAGE PLANS: Select Operator expressions: _col0 (type: string), _col2 (type: string), _col1 (type: interval_day_time) outputColumnNames: _col0, _col1, _col2 + Select Vectorization: + className: VectorSelectOperator + native: true + projectedOutputColumns: [8, 8, 14] Statistics: Num rows: 1100 Data size: 506290 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false + File Sink Vectorization: + className: VectorFileSinkOperator + native: false Statistics: Num rows: 1100 Data size: 506290 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat @@ -227,25 +254,57 @@ STAGE PLANS: serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Execution mode: vectorized, llap LLAP IO: all inputs + Map Vectorization: + enabled: true + enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true + groupByVectorOutput: true + inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat + allNative: false + usesVectorUDFAdaptor: false + vectorized: true Map 2 Map Operator Tree: TableScan alias: vectortab_b_1korc 
Statistics: Num rows: 1000 Data size: 458448 Basic stats: COMPLETE Column stats: NONE + TableScan Vectorization: + native: true + projectedOutputColumns: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12] Filter Operator + Filter Vectorization: + className: VectorFilterOperator + native: true + predicateExpression: FilterExprAndExpr(children: SelectColumnIsNotNull(col 8) -> boolean, SelectColumnIsNotNull(col 14)(children: DateColSubtractDateColumn(col 12, col 13)(children: CastTimestampToDate(col 10) -> 13:date) -> 14:timestamp) -> boolean) -> boolean predicate: (s is not null and (dt - CAST( ts AS DATE)) is not null) (type: boolean) Statistics: Num rows: 1000 Data size: 458448 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: s (type: string), (dt - CAST( ts AS DATE)) (type: interval_day_time) outputColumnNames: _col0, _col1 + Select Vectorization: + className: VectorSelectOperator + native: true + projectedOutputColumns: [8, 14] + selectExpressions: DateColSubtractDateColumn(col 12, col 13)(children: CastTimestampToDate(col 10) -> 13:date) -> 14:timestamp Statistics: Num rows: 1000 Data size: 458448 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: string), _col1 (type: interval_day_time) sort order: ++ Map-reduce partition columns: _col0 (type: string), _col1 (type: interval_day_time) + Reduce Sink Vectorization: + className: VectorReduceSinkMultiKeyOperator + native: true + nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE IS true, No buckets IS true, No TopN IS true, Uniform Hash IS true, No DISTINCT columns IS true, BinarySortableSerDe for keys IS true, LazyBinarySerDe for values IS true Statistics: Num rows: 1000 Data size: 458448 Basic stats: COMPLETE Column stats: NONE Execution mode: vectorized, llap LLAP IO: all inputs + Map Vectorization: + enabled: true + enabledConditionsMet: 
hive.vectorized.use.vectorized.input.format IS true + groupByVectorOutput: true + inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat + allNative: true + usesVectorUDFAdaptor: false + vectorized: true Stage: Stage-0 Fetch Operator