hive-user mailing list archives

From murali parimi <muralikrishna.par...@icloud.com>
Subject Re: Partitioned table and Bucket Map Join
Date Thu, 29 Jan 2015 13:51:31 GMT
I faced the same situation: two tables with 3 billion records on each side, both partitioned
and sorted on the same key. I set the following parameters in the Hive query, assuming the join
would happen in the map phase.

set hive.input.format=org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
set hive.optimize.bucketmapjoin=true;
set hive.optimize.bucketmapjoin.sortedmerge=true;
set hive.enforce.sorting=true;
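
As a rough sketch, settings like these generally assume that both tables are declared as
bucketed and sorted on the join key, not only partitioned. The table names, columns,
partition column, and bucket count below are hypothetical and only illustrate the shape of
such a layout, along with one way to check which join the planner picked:

```sql
-- Hypothetical tables: names, columns, and the bucket count are illustrative only.
-- Both sides are bucketed AND sorted on the join key, with matching bucket counts.
CREATE TABLE orders_big (
  cust_id BIGINT,
  amount  DOUBLE
)
PARTITIONED BY (dt STRING)
CLUSTERED BY (cust_id) SORTED BY (cust_id ASC) INTO 64 BUCKETS
STORED AS ORC;

CREATE TABLE customers_small (
  cust_id BIGINT,
  name    STRING
)
PARTITIONED BY (dt STRING)
CLUSTERED BY (cust_id) SORTED BY (cust_id ASC) INTO 64 BUCKETS
STORED AS ORC;

-- Set while loading data so the files on disk really are bucketed and sorted.
set hive.enforce.bucketing=true;
set hive.enforce.sorting=true;

-- Lets the planner convert an eligible join to a sort-merge bucket join.
set hive.auto.convert.sortmerge.join=true;

-- Inspect the plan to see which join operator was actually chosen.
EXPLAIN
SELECT o.cust_id, c.name, o.amount
FROM orders_big o
JOIN customers_small c ON o.cust_id = c.cust_id
WHERE o.dt = '2015-01-29' AND c.dt = '2015-01-29';
```

If the EXPLAIN output does not mention a Sorted Merge Bucket Map Join Operator, the
bucketing and sorting metadata of the two tables is probably one of the first things to check.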

I am using Hive 0.13 and the storage format is ORC. One of the tables is small in size,
but I haven't checked whether it can fit in the cache, as we have huge memory. But the map-side
join didn't happen. What could be the reason?


> On Jan 29, 2015, at 7:38 AM, matshyeq <matshyeq@gmail.com> wrote:
> 
> I do have two tables partitioned on the same criteria.
> Could I still take advantage of Bucket Map Join or better, Sort Merge Bucket Map Join?
> How?
> 
> ~Maciek
