hive-dev mailing list archives

From Deepak Jaiswal
Subject Re: Review Request 64688: HIVE-18218
Date Sat, 10 Feb 2018 00:41:44 GMT

(Updated Feb. 10, 2018, 12:41 a.m.)

Review request for hive, Ashutosh Chauhan and Jason Dere.


Added the explain plans of the query with and without SMB. The one with SMB does a shuffle join.

Repository: hive-git


Bucket-based join: handle buckets with no splits.

The current logic in CustomPartitionVertex assumes that there is a split for each bucket, whereas
in Tez an empty bucket can have no splits.
The change also falls back to a reduce-side join if the small table has more buckets than the big table.
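
For illustration only, here is a minimal sketch of the two ideas described above: give every bucket an entry even when it has no splits, and fall back to a reduce-side join when the small table has more buckets than the big table. The class, method names, and the use of plain strings for splits are hypothetical and are not taken from the patch or from CustomPartitionVertex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; not the code in this review request.
public class BucketSplitSketch {

    /**
     * Returns one entry per bucket id in [0, numBuckets).
     * Buckets that produced no splits get an empty list instead of being absent.
     */
    public static Map<Integer, List<String>> splitsPerBucket(
            Map<Integer, List<String>> observed, int numBuckets) {
        Map<Integer, List<String>> result = new TreeMap<>();
        for (int bucket = 0; bucket < numBuckets; bucket++) {
            List<String> copy = new ArrayList<>();
            List<String> splits = observed.get(bucket);
            if (splits != null) {
                copy.addAll(splits);
            }
            result.put(bucket, copy);
        }
        return result;
    }

    /** Assumed fallback rule, as stated in the description above. */
    public static boolean mustFallBackToReduceSideJoin(int smallTableBuckets, int bigTableBuckets) {
        return smallTableBuckets > bigTableBuckets;
    }
}
```

With this shape, downstream code can iterate over all bucket ids without special-casing the empty ones.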

Disallow loading files into bucketed tables if the file name does not follow a format like 000000_0 or 000001_0_copy_1.
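
As a rough sketch of such a file name check: the regular expression below is an assumption inferred only from the two example names given here (000000_0 and 000001_0_copy_1). It is not the pattern used by the patch, and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical illustration; the real check in Hive may differ.
public class BucketFileNameSketch {

    // Assumed shape: digits, underscore, digits, optionally "_copy_" plus digits.
    private static final Pattern BUCKET_FILE_NAME =
            Pattern.compile("^\\d+_\\d+(_copy_\\d+)?$");

    /** Returns true if the file name matches the assumed bucket file format. */
    public static boolean looksLikeBucketFile(String fileName) {
        return fileName != null && BUCKET_FILE_NAME.matcher(fileName).matches();
    }
}
```

A load into a bucketed table could reject any file for which looksLikeBucketFile returns false.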

Diffs (updated)

  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ 26afe90faa 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ ef5e7edcd6

  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/ 9885038588 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/ dc698c8de8 
  ql/src/java/org/apache/hadoop/hive/ql/parse/ 54f5bab6de 
  ql/src/test/queries/clientpositive/auto_sortmerge_join_16.q 8216b538c2 
  ql/src/test/results/clientpositive/llap/auto_sortmerge_join_16.q.out 91408df129 
  ql/src/test/results/clientpositive/spark/auto_sortmerge_join_16.q.out_spark 91408df129 





Deepak Jaiswal
