hive-issues mailing list archives

From "Prasanth Jayachandran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-12025) refactor bucketId generating code
Date Wed, 07 Oct 2015 22:23:27 GMT

    [ https://issues.apache.org/jira/browse/HIVE-12025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947706#comment-14947706
] 

Prasanth Jayachandran commented on HIVE-12025:
----------------------------------------------

The changes this patch introduces in BucketIdResolverImpl are the correct way to compute the
bucket number. ReduceSinkOperator had a bug in its bucket number computation for negative
hash codes (multiplying by -1 vs. masking with Integer.MAX_VALUE). There might be some test
failures related to this change, but those are expected. Since these are util methods, it
would be good to have unit tests for them (if none exist).
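
To illustrate the negative hash code point, here is a minimal sketch. It is not the actual
Hive code: the class and method names are hypothetical, and it assumes "multiplying by -1"
means negating a negative hash code before taking the modulus. Negation fails for
Integer.MIN_VALUE, which overflows back to itself, and it also maps other negative hash
codes to different buckets than masking does, which would explain changed test output.

```java
// Hypothetical sketch; names are illustrative and not taken from the Hive source.
public class BucketNumberSketch {

    // Assumed shape of the buggy variant: flip the sign of negative hash codes.
    // Integer.MIN_VALUE * -1 overflows and stays negative, so the result can be < 0.
    static int bucketByNegation(int hashCode, int numBuckets) {
        int h = hashCode < 0 ? hashCode * -1 : hashCode;
        return h % numBuckets;
    }

    // Masking variant: clear the sign bit, so the operand of % is never negative.
    static int bucketByMask(int hashCode, int numBuckets) {
        return (hashCode & Integer.MAX_VALUE) % numBuckets;
    }

    public static void main(String[] args) {
        int numBuckets = 3;
        int[] samples = {7, -2, Integer.MIN_VALUE};
        for (int h : samples) {
            System.out.println(h + ": negation=" + bucketByNegation(h, numBuckets)
                + " mask=" + bucketByMask(h, numBuckets));
        }
    }
}
```

With 3 buckets, a hash code of -2 lands in bucket 2 under negation but bucket 0 under
masking, and Integer.MIN_VALUE yields -2 under negation, which is not a valid bucket.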

Other than that, lgtm +1. Pending tests.

> refactor bucketId generating code
> ---------------------------------
>
>                 Key: HIVE-12025
>                 URL: https://issues.apache.org/jira/browse/HIVE-12025
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 1.0.1
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>         Attachments: HIVE-12025.2.patch, HIVE-12025.patch
>
>
> HIVE-11983 adds ObjectInspectorUtils.getBucketHashCode() and getBucketNumber().
> There are several (at least) places in Hive that perform this computation:
> # ReduceSinkOperator.computeBucketNumber
> # ReduceSinkOperator.computeHashCode
> # BucketIdResolverImpl - only in 2.0.0 ASF line
> # FileSinkOperator.findWriterOffset
> # GenericUDFHash
> Should refactor it and make sure they all call methods from ObjectInspectorUtils.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
