hive-user mailing list archives

From Bruce Bian
Subject Condition for doing a sort merge bucket map join
Date Tue, 22 May 2012 15:07:38 GMT
Hi,
I've got 7 large tables (each ~10 GB in size) to join into one table, all on
the same 2 join keys. I've read some documents on the sort merge bucket map
join, but have failed to trigger it.
I've bucketed all 7 tables into 20 buckets and sorted them by one of the
join keys.
set hive.optimize.bucketmapjoin = true;
set hive.optimize.bucketmapjoin.sortedmerge = true;
I set the above parameters while doing the join.
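As a rough sketch of the setup described above: the table names `t1` and `t2` and the columns `key1`, `key2`, and `val` are hypothetical placeholders, since the real schema is not shown here. Bucketing on the same column as the sort key and using the `MAPJOIN` hint are also assumptions, and only two of the seven tables appear.

```sql
-- Hypothetical table definition; each of the 7 tables is assumed to be
-- declared the same way: 20 buckets, sorted by one of the two join keys.
-- Assumption: the bucketing column is the same as the sort column.
CREATE TABLE t1 (key1 STRING, key2 STRING, val STRING)
CLUSTERED BY (key1) SORTED BY (key1) INTO 20 BUCKETS;

-- The two parameters from this message.
set hive.optimize.bucketmapjoin = true;
set hive.optimize.bucketmapjoin.sortedmerge = true;

-- Join on both keys; the MAPJOIN hint is an assumption about how the
-- query is written, not something stated in this message.
SELECT /*+ MAPJOIN(t2) */ t1.key1, t1.key2, t1.val, t2.val
FROM t1 JOIN t2
  ON t1.key1 = t2.key1 AND t1.key2 = t2.key2;
```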
What else am I missing? Do I have to bucket on both of the join keys (I'm
currently trying this)? And does each bucket file have to be smaller than
one HDFS block?
Thanks a lot.
