hive-dev mailing list archives

From Deepak Jaiswal
Subject Re: Review Request 61087: HIVE-16965 SMB join may produce incorrect results
Date Tue, 25 Jul 2017 20:01:36 GMT

This is an automatically generated e-mail. To reply, visit:

(Updated July 25, 2017, 8:01 p.m.)

Review request for hive, Gopal V, Jason Dere, and Sergey Shelukhin.


Use llap_smb.q as the main test instead of smb_join1.q.
Remove the assert, based on failing tests: irrespective of the number of splits, the path is the same.

Bugs: HIVE-16965

Repository: hive-git


Usually, in a JOIN with multiple inputs (partitions), the inputs are read sequentially. In the
case of an SMB join, however, the inputs are read based on key ordering. This invalidates the
current IOContext assumption that the input path, once set, won't change unless the input changes.
As a result, the partition information in the results was incorrect, because it is derived from
the input path in IOContext.
The new logic updates the input path whenever the input changes.
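
The behaviour described above can be illustrated with a minimal, self-contained sketch. It is not the patch itself: PathContext, SmbPathSketch, Input and mergeByKey are hypothetical names made up for illustration, and PathContext only stands in for the idea of an IOContext that holds the current input path. Two sorted inputs are merged by key, and each emitted row is tagged with whatever path the context holds at that moment.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical stand-in for an IOContext-like holder; NOT Hive's actual class.
final class PathContext {
    private String inputPath;
    void setInputPath(String p) { inputPath = p; }
    String getInputPath() { return inputPath; }
}

final class SmbPathSketch {
    // One sorted input (for example, one partition): a path plus ascending keys.
    static final class Input {
        final String path;
        final int[] keys;
        int pos = 0;
        Input(String path, int... keys) { this.path = path; this.keys = keys; }
        boolean hasNext() { return pos < keys.length; }
        int peek() { return keys[pos]; }
    }

    // Merges the inputs in key order and tags each key with the path currently
    // stored in the context. With updatePathOnSwitch == false the path is set
    // once and never refreshed (the stale behaviour); with true it is refreshed
    // every time the merge switches to a different input.
    static List<String> mergeByKey(List<Input> inputs, boolean updatePathOnSwitch) {
        PathContext ctx = new PathContext();
        PriorityQueue<Input> heap =
            new PriorityQueue<>(Comparator.comparingInt(Input::peek));
        for (Input in : inputs) {
            if (in.hasNext()) heap.add(in);
        }
        List<String> out = new ArrayList<>();
        Input last = null;
        while (!heap.isEmpty()) {
            // Remove the input before advancing it, so the heap never holds
            // an element whose ordering key has changed underneath it.
            Input in = heap.poll();
            if (ctx.getInputPath() == null || (updatePathOnSwitch && in != last)) {
                ctx.setInputPath(in.path);
            }
            int key = in.keys[in.pos++];
            out.add(key + "@" + ctx.getInputPath());
            last = in;
            if (in.hasNext()) heap.add(in);
        }
        return out;
    }
}
```

With inputs "p=1" holding keys 1, 4 and "p=2" holding keys 2, 3, calling mergeByKey(inputs, false) tags every key with "p=1", while mergeByKey(inputs, true) tags keys 2 and 3 with "p=2". The sketch assumes distinct keys across inputs so that the merge order is unambiguous.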

Diffs (updated)

  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ add7d08c40 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/tools/ 698fa7f69e

  ql/src/test/results/clientpositive/llap/llap_smb.q.out 87b33db805 




Added a new test.


Deepak Jaiswal
