hive-dev mailing list archives

From Gopal V <go...@hortonworks.com>
Subject Re: Review Request 61087: HIVE-16965 SMB join may produce incorrect results
Date Mon, 24 Jul 2017 18:57:29 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/61087/#review181246
-----------------------------------------------------------




ql/src/java/org/apache/hadoop/hive/ql/exec/tez/tools/KeyValueInputMerger.java
Lines 66 (patched)
<https://reviews.apache.org/r/61087/#comment256769>

    Use an IdentityHashMap here - don't trust the hashCode() for KeyValueReader to be safe.
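
A minimal sketch of the concern above, not the patched code: `FakeReader`, its position-based `hashCode()`, and the path string are hypothetical stand-ins, chosen only to show how a `HashMap` can lose an entry when a key's hash changes after insertion, while an `IdentityHashMap` keys on object identity and is unaffected.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

public class IdentityMapSketch {
    // Hypothetical reader whose hashCode()/equals() depend on mutable state.
    // This is an assumption for illustration, not the real KeyValueReader.
    static final class FakeReader {
        int position = 0;

        void next() {
            position++;
        }

        @Override
        public int hashCode() {
            return position;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof FakeReader && ((FakeReader) o).position == position;
        }
    }

    // Registers a reader, advances it, then checks whether the map still finds it.
    static boolean foundAfterAdvance(Map<FakeReader, String> readerToPath) {
        FakeReader reader = new FakeReader();
        readerToPath.put(reader, "/hypothetical/table/part=1");
        reader.next(); // hashCode() changes after the key was stored
        return readerToPath.containsKey(reader);
    }

    public static void main(String[] args) {
        System.out.println("HashMap finds reader:         "
                + foundAfterAdvance(new HashMap<>()));
        System.out.println("IdentityHashMap finds reader: "
                + foundAfterAdvance(new IdentityHashMap<>()));
    }
}
```

With an identity-based map the lookup depends only on which object is passed in, so it does not matter how (or whether) the reader implementation overrides `hashCode()` and `equals()`.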



ql/src/java/org/apache/hadoop/hive/ql/exec/tez/tools/KeyValueInputMerger.java
Line 109 (original), 130 (patched)
<https://reviews.apache.org/r/61087/#comment256770>

    Clear the prev and IOContext refs - interrupts do leave leaky state behind sometimes
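
One way to read this comment, sketched under assumptions: `FakeMerger`, `FakeContext`, and `readAll` are hypothetical names rather than the Hive classes, and the `finally` block is just one possible place to drop the references so that an interrupt cannot leave a stale reader or input path behind.

```java
import java.util.Arrays;

public class CleanupSketch {
    // Hypothetical stand-in for the context object that holds the input path.
    static final class FakeContext {
        String inputPath;
    }

    // Hypothetical stand-in for a merger that remembers the previous input.
    static final class FakeMerger {
        Object prev;
        final FakeContext context;

        FakeMerger(FakeContext context) {
            this.context = context;
        }

        void readAll(Iterable<String> paths) throws InterruptedException {
            try {
                for (String path : paths) {
                    prev = path;
                    context.inputPath = path;
                    // Thread.interrupted() also clears the interrupt flag.
                    if (Thread.interrupted()) {
                        throw new InterruptedException("interrupted while reading " + path);
                    }
                }
            } finally {
                // Runs on normal completion and on interrupt alike.
                prev = null;
                context.inputPath = null;
            }
        }
    }

    // Returns true if an interrupted read leaves no references behind.
    static boolean cleanAfterInterrupt() {
        FakeContext context = new FakeContext();
        FakeMerger merger = new FakeMerger(context);
        Thread.currentThread().interrupt(); // simulate an interrupt arriving mid-read
        try {
            merger.readAll(Arrays.asList("part=1", "part=2"));
            return false; // the interrupt should have surfaced
        } catch (InterruptedException expected) {
            return merger.prev == null && context.inputPath == null;
        }
    }

    public static void main(String[] args) {
        System.out.println("state cleared after interrupt: " + cleanAfterInterrupt());
    }
}
```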


- Gopal V


On July 24, 2017, 6:47 p.m., Deepak Jaiswal wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/61087/
> -----------------------------------------------------------
> 
> (Updated July 24, 2017, 6:47 p.m.)
> 
> 
> Review request for hive, Gopal V, Jason Dere, and Sergey Shelukhin.
> 
> 
> Bugs: HIVE-16965
>     https://issues.apache.org/jira/browse/HIVE-16965
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> In a JOIN with multiple inputs (partitions), the inputs are usually read sequentially.
> In an SMB join, however, the inputs are read based on key ordering. This invalidates the
> current IOContext assumption that the input path, once set, won't change unless the input changes.
> This resulted in incorrect partition information in the results, since it is derived from
> the input path in IOContext.
> The new logic updates the input path whenever the input changes.
> 
> 
> Diffs
> -----
> 
>   itests/src/test/resources/testconfiguration.properties f66e19be3e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MapRecordSource.java add7d08c40 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/tools/KeyValueInputMerger.java 698fa7f69e 
>   ql/src/test/queries/clientpositive/smb_join1.q PRE-CREATION 
>   ql/src/test/results/clientpositive/llap/smb_join1.q.out PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/61087/diff/1/
> 
> 
> Testing
> -------
> 
> Added a new test.
> 
> 
> Thanks,
> 
> Deepak Jaiswal
> 
>

