hive-issues mailing list archives

From "Hive QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-11394) Enhance EXPLAIN display for vectorization
Date Thu, 13 Oct 2016 04:40:20 GMT

    [ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15570817#comment-15570817
] 

Hive QA commented on HIVE-11394:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12833021/HIVE-11394.092.patch

{color:green}SUCCESS:{color} +1 due to 162 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 10530 tests executed
*Failed tests:*
{noformat}
TestMiniLlapLocalCliDriver-orc_llap.q-delete_where_non_partitioned.q-vector_groupby_mapjoin.q-and-27-more - did not produce a TEST-*.xml file
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[reloadJar]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_complex_all]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_decimal_udf]
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_udf1]
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[vectorization_limit]
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/1518/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/1518/console
Test logs: http://ec2-204-236-174-241.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-Build-1518/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12833021 - PreCommit-HIVE-Build

> Enhance EXPLAIN display for vectorization
> -----------------------------------------
>
>                 Key: HIVE-11394
>                 URL: https://issues.apache.org/jira/browse/HIVE-11394
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>            Priority: Critical
>         Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, HIVE-11394.03.patch, HIVE-11394.04.patch,
HIVE-11394.05.patch, HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, HIVE-11394.09.patch,
HIVE-11394.091.patch, HIVE-11394.092.patch
>
>
> Add detail to the EXPLAIN output showing why Map and Reduce work is not vectorized.
> The new syntax is: EXPLAIN VECTORIZATION \[ONLY\] \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\]
> The ONLY option suppresses most non-vectorization elements.
> SUMMARY shows vectorization information for the PLAN (whether vectorization is enabled) and a summary of the Map and Reduce work.
> OPERATOR shows vectorization information for operators, e.g. Filter Vectorization.  It includes all of the SUMMARY information, too.
> EXPRESSION shows vectorization information for expressions, e.g. predicateExpression.  It includes all of the SUMMARY and OPERATOR information, too.
> DETAIL shows very detailed vectorization information.  It includes all of the SUMMARY, OPERATOR, and EXPRESSION information, too.
> When the optional clauses are omitted, the defaults are no ONLY and SUMMARY.
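> A minimal invocation sketch (the SELECT statements below are hypothetical placeholders; only the VECTORIZATION keywords come from the syntax above):
> {code}
> -- Default level: equivalent to EXPLAIN VECTORIZATION SUMMARY
> EXPLAIN VECTORIZATION
> SELECT cint FROM alltypesorc;
>
> -- Suppress most non-vectorization plan elements and show per-operator detail
> EXPLAIN VECTORIZATION ONLY OPERATOR
> SELECT cint FROM alltypesorc;
>
> -- Most verbose level
> EXPLAIN VECTORIZATION DETAIL
> SELECT cint FROM alltypesorc;
> {code}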
> ---------------------------------------------------------------------------------------------------
> Here are some examples:
> EXPLAIN VECTORIZATION example:
> (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization sections)
> Since SUMMARY is the default, this is also the output of EXPLAIN VECTORIZATION SUMMARY.
> Under Reducer 3’s "Reduce Vectorization:" you’ll see:
> notVectorizedReason: Aggregation Function UDF avg parameter expression for GROUPBY operator: Data type struct<count:bigint,sum:double,input:int> of Column\[VALUE._col2\] not supported
> For Reducer 2’s "Reduce Vectorization:" you’ll see "groupByVectorOutput:": "false", which means that node has a GROUP BY with an AVG or some other aggregator that outputs a non-PRIMITIVE type (e.g. STRUCT), so all downstream operators run in row mode, i.e. without vector output.
> If "usesVectorUDFAdaptor:": "false" were instead true, it would mean that at least one vectorized expression uses the VectorUDFAdaptor.
> And "allNative:": "false" would instead be true if all operators were native.  Today, GROUP BY and FILE SINK are not native, MAP JOIN and REDUCE SINK are conditionally native, and FILTER and SELECT are native.
> {code}
> PLAN VECTORIZATION:
>   enabled: true
>   enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
>     Tez
> ...
>       Edges:
>         Reducer 2 <- Map 1 (SIMPLE_EDGE)
>         Reducer 3 <- Reducer 2 (SIMPLE_EDGE)
> ...
>       Vertices:
>         Map 1 
>             Map Operator Tree:
>                 TableScan
>                   alias: alltypesorc
>                   Statistics: Num rows: 12288 Data size: 36696 Basic stats: COMPLETE
Column stats: COMPLETE
>                   Select Operator
>                     expressions: cint (type: int)
>                     outputColumnNames: cint
>                     Statistics: Num rows: 12288 Data size: 36696 Basic stats: COMPLETE
Column stats: COMPLETE
>                     Group By Operator
>                       keys: cint (type: int)
>                       mode: hash
>                       outputColumnNames: _col0
>                       Statistics: Num rows: 5775 Data size: 17248 Basic stats: COMPLETE
Column stats: COMPLETE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: int)
>                         sort order: +
>                         Map-reduce partition columns: _col0 (type: int)
>                         Statistics: Num rows: 5775 Data size: 17248 Basic stats: COMPLETE
Column stats: COMPLETE
>             Execution mode: vectorized, llap
>             LLAP IO: all inputs
>             Map Vectorization:
>                 enabled: true
>                 enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS
true
>                 groupByVectorOutput: true
>                 inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
>                 allNative: false
>                 usesVectorUDFAdaptor: false
>                 vectorized: true
>         Reducer 2 
>             Execution mode: vectorized, llap
>             Reduce Vectorization:
>                 enabled: true
>                 enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true,
hive.execution.engine tez IN [tez, spark] IS true
>                 groupByVectorOutput: false
>                 allNative: false
>                 usesVectorUDFAdaptor: false
>                 vectorized: true
>             Reduce Operator Tree:
>               Group By Operator
>                 keys: KEY._col0 (type: int)
>                 mode: mergepartial
>                 outputColumnNames: _col0
>                 Statistics: Num rows: 5775 Data size: 17248 Basic stats: COMPLETE Column
stats: COMPLETE
>                 Group By Operator
>                   aggregations: sum(_col0), count(_col0), avg(_col0), std(_col0)
>                   mode: hash
>                   outputColumnNames: _col0, _col1, _col2, _col3
>                   Statistics: Num rows: 1 Data size: 172 Basic stats: COMPLETE Column
stats: COMPLETE
>                   Reduce Output Operator
>                     sort order: 
>                     Statistics: Num rows: 1 Data size: 172 Basic stats: COMPLETE Column
stats: COMPLETE
>                     value expressions: _col0 (type: bigint), _col1 (type: bigint), _col2
(type: struct<count:bigint,sum:double,input:int>), _col3 (type: struct<count:bigint,sum:double,variance:double>)
>         Reducer 3 
>             Execution mode: llap
>             Reduce Vectorization:
>                 enabled: true
>                 enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true,
hive.execution.engine tez IN [tez, spark] IS true
>                 notVectorizedReason: Aggregation Function UDF avg parameter expression
for GROUPBY operator: Data type struct<count:bigint,sum:double,input:int> of Column[VALUE._col2]
not supported
>                 vectorized: false
>             Reduce Operator Tree:
>               Group By Operator
>                 aggregations: sum(VALUE._col0), count(VALUE._col1), avg(VALUE._col2),
std(VALUE._col3)
>                 mode: mergepartial
>                 outputColumnNames: _col0, _col1, _col2, _col3
>                 Statistics: Num rows: 1 Data size: 32 Basic stats: COMPLETE Column stats:
COMPLETE
>                 File Output Operator
>                   compressed: false
>                   Statistics: Num rows: 1 Data size: 32 Basic stats: COMPLETE Column
stats: COMPLETE
>                   table:
>                       input format: org.apache.hadoop.mapred.SequenceFileInputFormat
>                       output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>                       serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         ListSink 
> {code}
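> The enabledConditionsMet and nativeConditionsMet entries in these plans refer to ordinary Hive configuration properties. A hedged session-level sketch, using only property names that appear in the examples in this description (values are illustrative):
> {code}
> -- Settings referenced by the vectorization conditions in these plans.
> -- Values shown are illustrative only.
> SET hive.vectorized.execution.enabled=true;
> SET hive.vectorized.execution.reduce.enabled=true;
> SET hive.vectorized.use.vectorized.input.format=true;
> SET hive.vectorized.execution.reducesink.new.enabled=true;
> SET hive.vectorized.execution.mapjoin.native.enabled=true;
> {code}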
> EXPLAIN VECTORIZATION OPERATOR
> Notice the added TableScan Vectorization, Select Vectorization, Group By Vectorization, Map Join Vectorization, and Reduce Sink Vectorization sections in this example.
> Notice the nativeConditionsMet detail explaining why a Reduce Sink Vectorization is native (and the nativeConditionsNotMet detail when it is not).
> {code}
> PLAN VECTORIZATION:
>   enabled: true
>   enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
>     Tez
> #### A masked pattern was here ####
>       Edges:
>         Map 2 <- Map 1 (BROADCAST_EDGE)
>         Reducer 3 <- Map 2 (SIMPLE_EDGE)
> #### A masked pattern was here ####
>       Vertices:
>         Map 1 
>             Map Operator Tree:
>                 TableScan
>                   alias: a
>                   Statistics: Num rows: 3 Data size: 294 Basic stats: COMPLETE Column
stats: NONE
>                   TableScan Vectorization:
>                       native: true
>                       projectedOutputColumns: [0, 1]
>                   Filter Operator
>                     Filter Vectorization:
>                         className: VectorFilterOperator
>                         native: true
>                     predicate: c2 is not null (type: boolean)
>                     Statistics: Num rows: 3 Data size: 294 Basic stats: COMPLETE Column
stats: NONE
>                     Select Operator
>                       expressions: c1 (type: int), c2 (type: char(10))
>                       outputColumnNames: _col0, _col1
>                       Select Vectorization:
>                           className: VectorSelectOperator
>                           native: true
>                           projectedOutputColumns: [0, 1]
>                       Statistics: Num rows: 3 Data size: 294 Basic stats: COMPLETE Column
stats: NONE
>                       Reduce Output Operator
>                         key expressions: _col1 (type: char(20))
>                         sort order: +
>                         Map-reduce partition columns: _col1 (type: char(20))
>                         Reduce Sink Vectorization:
>                             className: VectorReduceSinkStringOperator
>                             native: true
>                             nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled
IS true, hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE IS true,
No buckets IS true, No TopN IS true, Uniform Hash IS true, No DISTINCT columns IS true, BinarySortableSerDe
for keys IS true, LazyBinarySerDe for values IS true
>                         Statistics: Num rows: 3 Data size: 294 Basic stats: COMPLETE
Column stats: NONE
>                         value expressions: _col0 (type: int)
>             Execution mode: vectorized, llap
>             LLAP IO: all inputs
>             Map Vectorization:
>                 enabled: true
>                 enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS
true
>                 groupByVectorOutput: true
>                 inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
>                 allNative: true
>                 usesVectorUDFAdaptor: false
>                 vectorized: true
>         Map 2 
>             Map Operator Tree:
>                 TableScan
>                   alias: b
>                   Statistics: Num rows: 3 Data size: 324 Basic stats: COMPLETE Column
stats: NONE
>                   TableScan Vectorization:
>                       native: true
>                       projectedOutputColumns: [0, 1]
>                   Filter Operator
>                     Filter Vectorization:
>                         className: VectorFilterOperator
>                         native: true
>                     predicate: c2 is not null (type: boolean)
>                     Statistics: Num rows: 3 Data size: 324 Basic stats: COMPLETE Column
stats: NONE
>                     Select Operator
>                       expressions: c1 (type: int), c2 (type: char(20))
>                       outputColumnNames: _col0, _col1
>                       Select Vectorization:
>                           className: VectorSelectOperator
>                           native: true
>                           projectedOutputColumns: [0, 1]
>                       Statistics: Num rows: 3 Data size: 324 Basic stats: COMPLETE Column
stats: NONE
>                       Map Join Operator
>                         condition map:
>                              Inner Join 0 to 1
>                         keys:
>                           0 _col1 (type: char(20))
>                           1 _col1 (type: char(20))
>                         Map Join Vectorization:
>                             className: VectorMapJoinInnerStringOperator
>                             native: true
>                             nativeConditionsMet: hive.vectorized.execution.mapjoin.native.enabled
IS true, hive.execution.engine tez IN [tez, spark] IS true, One MapJoin Condition IS true,
No nullsafe IS true, Supports Key Types IS true, Not empty key IS true, When Fast Hash Table,
then requires no Hybrid Hash Join IS true, Small table vectorizes IS true
>                         outputColumnNames: _col0, _col1, _col2, _col3
>                         input vertices:
>                           0 Map 1
>                         Statistics: Num rows: 3 Data size: 323 Basic stats: COMPLETE
Column stats: NONE
>                         Reduce Output Operator
>                           key expressions: _col0 (type: int)
>                           sort order: +
>                           Reduce Sink Vectorization:
>                               className: VectorReduceSinkOperator
>                               native: false
>                               nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled
IS true, hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE IS true,
No buckets IS true, No TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for
keys IS true, LazyBinarySerDe for values IS true
>                               nativeConditionsNotMet: Uniform Hash IS false
>                           Statistics: Num rows: 3 Data size: 323 Basic stats: COMPLETE
Column stats: NONE
>                           value expressions: _col1 (type: char(10)), _col2 (type: int),
_col3 (type: char(20))
>             Execution mode: vectorized, llap
>             LLAP IO: all inputs
>             Map Vectorization:
>                 enabled: true
>                 enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS
true
>                 groupByVectorOutput: true
>                 inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
>                 allNative: false
>                 usesVectorUDFAdaptor: false
>                 vectorized: true
>         Reducer 3 
>             Execution mode: vectorized, llap
>             Reduce Vectorization:
>                 enabled: true
>                 enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true,
hive.execution.engine tez IN [tez, spark] IS true
>                 groupByVectorOutput: true
>                 allNative: false
>                 usesVectorUDFAdaptor: false
>                 vectorized: true
>             Reduce Operator Tree:
>               Select Operator
>                 expressions: KEY.reducesinkkey0 (type: int), VALUE._col0 (type: char(10)),
VALUE._col1 (type: int), VALUE._col2 (type: char(20))
>                 outputColumnNames: _col0, _col1, _col2, _col3
>                 Select Vectorization:
>                     className: VectorSelectOperator
>                     native: true
>                     projectedOutputColumns: [0, 1, 2, 3]
>                 Statistics: Num rows: 3 Data size: 323 Basic stats: COMPLETE Column stats:
NONE
>                 File Output Operator
>                   compressed: false
>                   File Sink Vectorization:
>                       className: VectorFileSinkOperator
>                       native: false
>                   Statistics: Num rows: 3 Data size: 323 Basic stats: COMPLETE Column
stats: NONE
>                   table:
>                       input format: org.apache.hadoop.mapred.SequenceFileInputFormat
>                       output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>                       serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         ListSink
> {code}
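> For orientation, a hypothetical query shape that could produce a two-map, one-reducer plan like the one above (the table names a and b and columns c1/c2 come from the plan aliases; the exact statement is a guess for illustration only):
> {code}
> -- Hypothetical reconstruction: an equi-join on c2 followed by a sort,
> -- matching the BROADCAST_EDGE map join and the final SIMPLE_EDGE reducer above
> EXPLAIN VECTORIZATION OPERATOR
> SELECT a.c1, a.c2, b.c1, b.c2
> FROM a JOIN b ON (a.c2 = b.c2)
> ORDER BY a.c1;
> {code}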
> EXPLAIN VECTORIZATION EXPRESSION
> Notice the predicateExpression in this example.
> {code}
> PLAN VECTORIZATION:
>   enabled: true
>   enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
>     Tez
> #### A masked pattern was here ####
>       Edges:
>         Reducer 2 <- Map 1 (SIMPLE_EDGE)
> #### A masked pattern was here ####
>       Vertices:
>         Map 1 
>             Map Operator Tree:
>                 TableScan
>                   alias: vector_interval_2
>                   Statistics: Num rows: 2 Data size: 788 Basic stats: COMPLETE Column
stats: NONE
>                   TableScan Vectorization:
>                       native: true
>                       projectedOutputColumns: [0, 1, 2, 3, 4, 5]
>                   Filter Operator
>                     Filter Vectorization:
>                         className: VectorFilterOperator
>                         native: true
>                         predicateExpression: FilterExprAndExpr(children: FilterTimestampScalarEqualTimestampColumn(val
2001-01-01 01:02:03.0, col 6)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 01:02:03.000000000)
-> 6:timestamp) -> boolean, FilterTimestampScalarNotEqualTimestampColumn(val 2001-01-01
01:02:03.0, col 6)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 01:02:04.000000000)
-> 6:timestamp) -> boolean, FilterTimestampScalarLessEqualTimestampColumn(val 2001-01-01
01:02:03.0, col 6)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 01:02:03.000000000)
-> 6:timestamp) -> boolean, FilterTimestampScalarLessTimestampColumn(val 2001-01-01
01:02:03.0, col 6)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 01:02:04.000000000)
-> 6:timestamp) -> boolean, FilterTimestampScalarGreaterEqualTimestampColumn(val 2001-01-01
01:02:03.0, col 6)(children: DateColSubtractIntervalDayTimeScalar(col 1, val 0 01:02:03.000000000)
-> 6:timestamp) -> boolean, FilterTimestampScalarGreaterTimestampColumn(val 2001-01-01
01:02:03.0, col 6)(children: DateColSubtractIntervalDayTimeScalar(col 1, val 0 01:02:04.000000000)
-> 6:timestamp) -> boolean, FilterTimestampColEqualTimestampScalar(col 6, val 2001-01-01
01:02:03.0)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 01:02:03.000000000) ->
6:timestamp) -> boolean, FilterTimestampColNotEqualTimestampScalar(col 6, val 2001-01-01
01:02:03.0)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 01:02:04.000000000) ->
6:timestamp) -> boolean, FilterTimestampColGreaterEqualTimestampScalar(col 6, val 2001-01-01
01:02:03.0)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 01:02:03.000000000) ->
6:timestamp) -> boolean, FilterTimestampColGreaterTimestampScalar(col 6, val 2001-01-01
01:02:03.0)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 01:02:04.000000000) ->
6:timestamp) -> boolean, FilterTimestampColLessEqualTimestampScalar(col 6, val 2001-01-01
01:02:03.0)(children: DateColSubtractIntervalDayTimeScalar(col 1, val 0 01:02:03.000000000)
-> 6:timestamp) -> boolean, FilterTimestampColLessTimestampScalar(col 6, val 2001-01-01
01:02:03.0)(children: DateColSubtractIntervalDayTimeScalar(col 1, val 0 01:02:04.000000000)
-> 6:timestamp) -> boolean, FilterTimestampColEqualTimestampColumn(col 0, col 6)(children:
DateColAddIntervalDayTimeScalar(col 1, val 0 01:02:03.000000000) -> 6:timestamp) ->
boolean, FilterTimestampColNotEqualTimestampColumn(col 0, col 6)(children: DateColAddIntervalDayTimeScalar(col
1, val 0 01:02:04.000000000) -> 6:timestamp) -> boolean, FilterTimestampColLessEqualTimestampColumn(col
0, col 6)(children: DateColAddIntervalDayTimeScalar(col 1, val 0 01:02:03.000000000) ->
6:timestamp) -> boolean, FilterTimestampColLessTimestampColumn(col 0, col 6)(children:
DateColAddIntervalDayTimeScalar(col 1, val 0 01:02:04.000000000) -> 6:timestamp) ->
boolean, FilterTimestampColGreaterEqualTimestampColumn(col 0, col 6)(children: DateColSubtractIntervalDayTimeScalar(col
1, val 0 01:02:03.000000000) -> 6:timestamp) -> boolean, FilterTimestampColGreaterTimestampColumn(col
0, col 6)(children: DateColSubtractIntervalDayTimeScalar(col 1, val 0 01:02:04.000000000)
-> 6:timestamp) -> boolean) -> boolean
>                     predicate: ((2001-01-01 01:02:03.0 = (dt + 0 01:02:03.000000000))
and (2001-01-01 01:02:03.0 <> (dt + 0 01:02:04.000000000)) and (2001-01-01 01:02:03.0
<= (dt + 0 01:02:03.000000000)) and (2001-01-01 01:02:03.0 < (dt + 0 01:02:04.000000000))
and (2001-01-01 01:02:03.0 >= (dt - 0 01:02:03.000000000)) and (2001-01-01 01:02:03.0 >
(dt - 0 01:02:04.000000000)) and ((dt + 0 01:02:03.000000000) = 2001-01-01 01:02:03.0) and
((dt + 0 01:02:04.000000000) <> 2001-01-01 01:02:03.0) and ((dt + 0 01:02:03.000000000)
>= 2001-01-01 01:02:03.0) and ((dt + 0 01:02:04.000000000) > 2001-01-01 01:02:03.0)
and ((dt - 0 01:02:03.000000000) <= 2001-01-01 01:02:03.0) and ((dt - 0 01:02:04.000000000)
< 2001-01-01 01:02:03.0) and (ts = (dt + 0 01:02:03.000000000)) and (ts <> (dt +
0 01:02:04.000000000)) and (ts <= (dt + 0 01:02:03.000000000)) and (ts < (dt + 0 01:02:04.000000000))
and (ts >= (dt - 0 01:02:03.000000000)) and (ts > (dt - 0 01:02:04.000000000))) (type:
boolean)
>                     Statistics: Num rows: 1 Data size: 394 Basic stats: COMPLETE Column
stats: NONE
>                     Select Operator
>                       expressions: ts (type: timestamp)
>                       outputColumnNames: _col0
>                       Select Vectorization:
>                           className: VectorSelectOperator
>                           native: true
>                           projectedOutputColumns: [0]
>                       Statistics: Num rows: 1 Data size: 394 Basic stats: COMPLETE Column
stats: NONE
>                       Reduce Output Operator
>                         key expressions: _col0 (type: timestamp)
>                         sort order: +
>                         Reduce Sink Vectorization:
>                             className: VectorReduceSinkOperator
>                             native: false
>                             nativeConditionsMet: hive.vectorized.execution.reducesink.new.enabled
IS true, hive.execution.engine tez IN [tez, spark] IS true, Not ACID UPDATE or DELETE IS true,
No buckets IS true, No TopN IS true, No DISTINCT columns IS true, BinarySortableSerDe for
keys IS true, LazyBinarySerDe for values IS true
>                             nativeConditionsNotMet: Uniform Hash IS false
>                         Statistics: Num rows: 1 Data size: 394 Basic stats: COMPLETE
Column stats: NONE
>             Execution mode: vectorized, llap
>             LLAP IO: all inputs
>             Map Vectorization:
>                 enabled: true
>                 enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS
true
>                 groupByVectorOutput: true
>                 inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
>                 allNative: false
>                 usesVectorUDFAdaptor: false
>                 vectorized: true
>         Reducer 2 
> ... 
> {code}
> The standard @Explain annotation type is used.  A new 'vectorization' annotation marks each new class and method.
> It works for FORMATTED, like the other non-vectorization EXPLAIN variations.
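> As an illustration, a hedged sketch combining the new clause with FORMATTED (the relative ordering of the FORMATTED keyword shown here is an assumption, and the query is a hypothetical placeholder):
> {code}
> -- Hypothetical: FORMATTED (JSON) plan output that still carries the vectorization detail
> EXPLAIN VECTORIZATION DETAIL FORMATTED
> SELECT cint FROM alltypesorc;
> {code}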



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
