hive-issues mailing list archives

From "Gopal V (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-15573) Vectorization: ACID shuffle ReduceSink is not specialized
Date Mon, 23 Jan 2017 07:57:26 GMT

    [ https://issues.apache.org/jira/browse/HIVE-15573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15834024#comment-15834024 ]

Gopal V commented on HIVE-15573:
--------------------------------

bq. there are cases where VectorExtractRow has to set Hive writable objects so the Java hash code can be obtained.

The "easy case" is int/bigint, where hashCode() is an identity function.
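
For reference, a minimal sketch of what "identity" means here. It uses the java.lang hash functions as a stand-in, which is an assumption: Hive's writable classes are not used, and the class and method names are hypothetical. For int the hash is the value itself; for a 64-bit value the two halves are XOR-folded, so the result equals the value only when it fits in a non-negative int.

```java
// Illustration only: java.lang hashing, not Hive's writable classes.
final class PrimitiveHashSketch {
    // For a 32-bit int the hash is the value itself.
    static int intHash(int v) {
        return Integer.hashCode(v);
    }

    // For a 64-bit long the upper and lower halves are XOR-folded
    // into 32 bits: (int) (v ^ (v >>> 32)).
    static int longHash(long v) {
        return Long.hashCode(v);
    }

    public static void main(String[] args) {
        System.out.println(intHash(42));        // 42
        System.out.println(longHash(42L));      // 42
        System.out.println(longHash(1L << 32)); // 1, the upper half folds in
        System.out.println(longHash(-1L));      // 0, both halves cancel
    }
}
```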

Could you please cross-check this section? The row is extracted into partitionFieldValues, but the hash is computed from bucketFieldValues. Possible NPE?

{code}
+          partitionVectorExtractRow.extractRow(batch, batchIndex, partitionFieldValues);
+          hashCode = ObjectInspectorUtils.getBucketHashCode(bucketFieldValues, partitionObjectInspectors);
{code}
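
To make the concern concrete, here is a standalone sketch, not Hive's code. The `bucketHash` helper and its 31-based combine are hypothetical stand-ins for the real bucket hash, and the two arrays only borrow their names from the snippet above. It shows the two ways a mismatch between the filled array and the hashed array can surface: a hash that never depends on the row, or a NullPointerException if the hashed array was never allocated.

```java
import java.util.Objects;

final class BucketHashSketch {
    // Hypothetical stand-in for a bucket hash: combines per-field hashes
    // with a factor of 31. Not the Hive implementation.
    static int bucketHash(Object[] fields) {
        int hash = 0;
        for (Object field : fields) {                   // NPE here if fields is null
            hash = 31 * hash + Objects.hashCode(field); // a null element hashes to 0
        }
        return hash;
    }

    public static void main(String[] args) {
        Object[] partitionFieldValues = new Object[1];
        Object[] bucketFieldValues = new Object[1];     // allocated but never filled

        partitionFieldValues[0] = 12345L;               // stands in for extractRow(...)

        // Hashing the array that was filled depends on the row.
        System.out.println(bucketHash(partitionFieldValues));
        // Hashing the other array ignores the row and is always 0.
        System.out.println(bucketHash(bucketFieldValues));
    }
}
```

Whether the real code path fails loudly or silently produces a constant hash depends on how bucketFieldValues is initialized there, which this sketch does not model.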


> Vectorization: ACID shuffle ReduceSink is not specialized 
> ----------------------------------------------------------
>
>                 Key: HIVE-15573
>                 URL: https://issues.apache.org/jira/browse/HIVE-15573
>             Project: Hive
>          Issue Type: Improvement
>          Components: Transactions, Vectorization
>    Affects Versions: 2.2.0
>            Reporter: Gopal V
>            Assignee: Matt McCline
>             Fix For: 2.2.0
>
>         Attachments: HIVE-15573.01.patch, HIVE-15573.02.patch, screenshot-1.png
>
>
> The ACID shuffle disables murmur hash for the shuffle, because the bucketing requirements demand the writable hashCode.
> {code}
>     boolean useUniformHash = desc.getReducerTraits().contains(UNIFORM);
>     if (!useUniformHash) {
>       return false;
>     }
> {code}
> This check protects the fast ReduceSink ops from being used in ACID inserts.
> A specialized case for the following pattern will make ACID insert much faster.
> {code}
>                     Reduce Output Operator
>                       sort order: 
>                       Map-reduce partition columns: _col0 (type: bigint)
>                       value expressions:  ....
> {code}
> !screenshot-1.png!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
