hive-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Elliot West (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-12025) refactor bucketId generating code
Date Thu, 08 Oct 2015 11:23:26 GMT

    [ https://issues.apache.org/jira/browse/HIVE-12025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14948498#comment-14948498
] 

Elliot West commented on HIVE-12025:
------------------------------------

This is definitely an improvement. It has broken a test but I suspect this is because the
{{bucket_id}} calculation in {{BucketIdResolverImpl}} is currently incorrect and thus the
expectations are also incorrect. This in itself underlines why this refactoring is desirable.

May I suggest the following patch to fix the failing test:

{code}
diff --git hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/mutate/worker/TestBucketIdResolverImpl.java
hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/mutate/worker/TestBucketIdResolverImpl.java
index f81373e..5297c5d 100644
--- hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/mutate/worker/TestBucketIdResolverImpl.java
+++ hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/mutate/worker/TestBucketIdResolverImpl.java
@@ -23,7 +23,7 @@
   public void testAttachBucketIdToRecord() {
     MutableRecord record = new MutableRecord(1, "hello");
     capturingBucketIdResolver.attachBucketIdToRecord(record);
-    assertThat(record.rowId, is(new RecordIdentifier(-1L, 8, -1L)));
+    assertThat(record.rowId, is(new RecordIdentifier(-1L, 1, -1L)));
     assertThat(record.id, is(1));
     assertThat(record.msg.toString(), is("hello"));
   }
{code}

> refactor bucketId generating code
> ---------------------------------
>
>                 Key: HIVE-12025
>                 URL: https://issues.apache.org/jira/browse/HIVE-12025
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 1.0.1
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>         Attachments: HIVE-12025.2.patch, HIVE-12025.patch
>
>
> HIVE-11983 adds ObjectInspectorUtils.getBucketHashCode() and getBucketNumber().
> There are several (at least) places in Hive that perform this computation:
> # ReduceSinkOperator.computeBucketNumber
> # ReduceSinkOperator.computeHashCode
> # BucketIdResolverImpl - only in 2.0.0 ASF line
> # FileSinkOperator.findWriterOffset
> # GenericUDFHash
> Should refactor it and make sure they all call methods from ObjectInspectorUtils.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message