hive-dev mailing list archives

From "Vineet Garg (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HIVE-21330) Bucketing id varies b/w data loaded through streaming apis and regular query
Date Wed, 27 Feb 2019 04:08:00 GMT
Vineet Garg created HIVE-21330:
----------------------------------

             Summary: Bucketing id varies b/w data loaded through streaming apis and regular query
                 Key: HIVE-21330
                 URL: https://issues.apache.org/jira/browse/HIVE-21330
             Project: Hive
          Issue Type: Bug
            Reporter: Vineet Garg


The test at [https://github.com/apache/hive/blob/master/hcatalog/streaming/src/test/org/apache/hive/hcatalog/streaming/TestStreaming.java#L439] covers
this case. It currently passes, but for the wrong reason. The test checks for an empty result
set, and the result sets are empty because the prior INSERT fails to load data, not because the
bucketing scheme is different.
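
To illustrate how a check for an empty result set can pass for the wrong reason, here is a
minimal sketch in Python. It is a toy model, not the code in TestStreaming.java: the function
names and the row layout are hypothetical, and it assumes the query under test selects rows
whose bucket id disagrees with an expected value.

```python
# Toy model only: names and row layout are hypothetical, not taken from Hive.

def rows_in_wrong_bucket(rows, expected_bucket):
    """Return the rows whose bucket id differs from the expected one."""
    return [r for r in rows if r["bucket"] != expected_bucket]


def check_weak(rows, expected_bucket):
    # Only asserts that the mismatch query returns nothing.
    # With zero rows loaded, this is trivially true.
    return rows_in_wrong_bucket(rows, expected_bucket) == []


def check_guarded(rows, expected_bucket):
    # Additionally requires that the load actually produced rows,
    # so a failed INSERT cannot make the check pass.
    return len(rows) > 0 and check_weak(rows, expected_bucket)


failed_load = []                          # the INSERT wrote nothing
mismatched = [{"key": 1, "bucket": 2}]    # loaded, bucket id disagrees
matched = [{"key": 1, "bucket": 1}]       # loaded, bucket id agrees

print(check_weak(failed_load, 1), check_guarded(failed_load, 1))
print(check_weak(mismatched, 1), check_guarded(mismatched, 1))
print(check_weak(matched, 1), check_guarded(matched, 1))
```

In this sketch the unguarded check accepts the failed load, while the guarded one rejects it
and still separates matching from mismatching bucket ids.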

The INSERT error is fixed in https://github.com/apache/hive/pull/552. With that patch applied,
the test fails because the underlying bucketing ids generated are different.

These tests run on MR instead of Tez, which could explain the different bucketing ids.
I don't know what the repercussions of differing bucketing ids are, or why they are
expected to be the same, but since there is a test for this logic, the case is worth
investigating.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
