hive-issues mailing list archives

From "Matt McCline (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization
Date Tue, 04 Oct 2016 10:35:20 GMT

     [ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt McCline updated HIVE-11394:
--------------------------------
    Attachment: HIVE-11394.03.patch

> Enhance EXPLAIN display for vectorization
> -----------------------------------------
>
>                 Key: HIVE-11394
>                 URL: https://issues.apache.org/jira/browse/HIVE-11394
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>            Priority: Critical
>         Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, HIVE-11394.03.patch
>
>
> Add detail to the EXPLAIN output showing why a Map or Reduce task was not vectorized.
> Add new VECTORIZATION option that displays 3 levels.  Here are some examples:
> (At the beginning)
> {code}
> PLAN VECTORIZATION:
>   enabled: true
>   enabledConditionsMet: [hive.vectorized.execution.enabled IS true]
> {code}
> For Map and Reduce nodes:
> {code}
>             Map Vectorization:
>                 enabled: true
>                 enabledConditionsMet: hive.vectorized.use.vectorized.input.format IS true
>                 groupByVectorOutput: false
>                 inputFileFormats: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
>                 allNative: false
>                 usesVectorUDFAdaptor: false
>                 vectorized: true
> {code}
> {code}
>             Reduce Vectorization:
>                 enabled: true
>                 enableConditionsMet: hive.vectorized.execution.reduce.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true
>                 notVectorizedReason: Aggregation Function UDF avg parameter expression for GROUPBY operator: Data type struct<count:bigint,sum:decimal(38,18),input:decimal(38,18)> of Column[VALUE._col3] not supported
>                 vectorized: false
> {code}
> And, for each vectorized operator:
> {code}
>                     Select Vectorization:
>                         className: VectorSelectOperator
>                         native: true
>                         nativeConditionsMet: Supported IS true
>                         selectExpressions: IdentityExpression[6:decimal(38,18)]
>                         vectorized: true
> {code}
> {code}
>                       Map Join Vectorization:
>                           className: VectorMapJoinOperator
>                           native: false
>                           nativeConditionsMet: hive.vectorized.execution.mapjoin.native.enabled IS true, hive.execution.engine tez IN [tez, spark] IS true, One MapJoin Condition IS true, No nullsafe IS true, Supports Key Types IS true, When Fast Hash Table, then requires no Hybrid Hash Join IS true, Small table vectorizes IS true
>                           nativeConditionsNotMet: Not empty key IS false
>                           vectorized: true
> {code}
> The standard @Explain Annotation Type is used.  A new 'vectorization' annotation marks each new class and method.
> The new output also works with EXPLAIN FORMATTED, as it does with the other non-vectorization variations.
> Consider adding options to just show Vectorization information:
> EXPLAIN VECTORIZATION [ONLY] [SUMMARY|DETAIL]
> where the current patch is equivalent to EXPLAIN VECTORIZATION DETAIL.
> SUMMARY would add PLAN VECTORIZATION and Map/Reduce Vectorization, but not operator detail.
> ONLY would suppress most non-vectorization elements.
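
The proposed options can be read as a filter over the three levels of output shown in the examples. The following is a minimal Python sketch of that reading, not Hive's implementation. The names parse_options and visible_sections are hypothetical. Two assumptions go beyond the description: when neither SUMMARY nor DETAIL is given, the sketch treats the statement as SUMMARY (the description does not state a default), and ONLY drops every non-vectorization element, although the description says only "most".

```python
from dataclasses import dataclass

# Section kinds, named after the three levels shown in the description.
PLAN = "plan"          # the PLAN VECTORIZATION block at the beginning
TASK = "task"          # Map Vectorization / Reduce Vectorization blocks
OPERATOR = "operator"  # per-operator blocks such as Select Vectorization
OTHER = "other"        # everything that is not vectorization information


@dataclass(frozen=True)
class ExplainOptions:
    only: bool    # True when ONLY was given
    detail: bool  # True for DETAIL, False for SUMMARY


def parse_options(statement_prefix: str) -> ExplainOptions:
    """Parse 'EXPLAIN VECTORIZATION [ONLY] [SUMMARY|DETAIL]' (hypothetical helper)."""
    words = statement_prefix.upper().split()
    if words[:2] != ["EXPLAIN", "VECTORIZATION"]:
        raise ValueError("expected EXPLAIN VECTORIZATION")
    rest = words[2:]
    only = bool(rest) and rest[0] == "ONLY"
    if only:
        rest = rest[1:]
    if len(rest) > 1 or (rest and rest[0] not in ("SUMMARY", "DETAIL")):
        raise ValueError(f"unexpected option(s): {rest}")
    # Assumption: with no level keyword, fall back to SUMMARY.
    detail = bool(rest) and rest[0] == "DETAIL"
    return ExplainOptions(only=only, detail=detail)


def visible_sections(opts: ExplainOptions) -> set:
    """Return the section kinds that would be displayed for the given options."""
    shown = {PLAN, TASK}      # SUMMARY: plan-level and Map/Reduce-level blocks
    if opts.detail:
        shown.add(OPERATOR)   # DETAIL adds the per-operator blocks
    if not opts.only:
        shown.add(OTHER)      # without ONLY, the rest of the plan is kept
    return shown
```

For example, visible_sections(parse_options("EXPLAIN VECTORIZATION ONLY SUMMARY")) yields just the plan-level and Map/Reduce-level blocks, while "EXPLAIN VECTORIZATION DETAIL" also keeps the operator blocks and the rest of the plan.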



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)