hive-user mailing list archives

From Harsha HN <99harsha.h....@gmail.com>
Subject Re: Question on MAPJOIN Vs JOIN performance
Date Wed, 22 Apr 2015 07:13:56 GMT
Hi,

Thanks for your reply. I will go through the link.
By the way, my Hive version is 0.12.

Thanks,
Harsha

On Fri, Apr 17, 2015 at 4:16 AM, Lefty Leverenz <leftyleverenz@gmail.com>
wrote:

> Harsha, that document is from 2010.  What version of Hive are you using?
>
> Here's some up-to-date information in the Hive wiki:  Join Optimization
> <https://cwiki.apache.org/confluence/display/Hive/LanguageManual+JoinOptimization>.
>
> -- Lefty
>
> On Thu, Apr 16, 2015 at 2:38 AM, Harsha HN <99harsha.h.n99@gmail.com>
> wrote:
>
>> Hi All,
>>
>> I went through the Facebook engineering page mentioned below:
>> https://www.facebook.com/notes/facebook-engineering/join-optimization-in-apache-hive/470667928919
>>
>> I set the following for auto conversion of joins:
>> set hive.auto.convert.join=true;
>> set hive.mapjoin.smalltable.filesize=1000000000;    (1GB)
>>
>> I observed that some queries ran 2X faster with MAP JOIN than with
>> Common Join, while in other instances MAP JOIN was 3X slower than
>> Common Join.
>>
>> Any thoughts on what might be slowing down MAP JOIN in some cases?
>>
>> I have a 40-node cluster, so I have plenty of RAM available.
>>
>> Thanks,
>> Harsha
>>
>
>
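
For anyone comparing the two join strategies with the settings quoted above, one way to see which one Hive picks for a given query is to inspect the plan and then time the same query with conversion switched off. The following is only a minimal sketch: the table and column names (fact_events, dim_users, user_id, country) are hypothetical and do not come from this thread, and the exact plan output may differ between Hive versions.

```sql
-- Settings quoted in the thread; the file size threshold is in bytes.
set hive.auto.convert.join=true;
set hive.mapjoin.smalltable.filesize=1000000000;

-- Hypothetical tables: fact_events (large) and dim_users (small).
-- EXPLAIN prints the plan without running the query. A "Map Join Operator"
-- stage in the output typically indicates the join was planned as a map join.
EXPLAIN
SELECT e.user_id, u.country, COUNT(*) AS events
FROM fact_events e
JOIN dim_users u ON e.user_id = u.user_id
GROUP BY e.user_id, u.country;

-- To time the common (reduce-side) join for the same query,
-- turn automatic conversion off and run the query again.
set hive.auto.convert.join=false;
```

Running the query once under each setting, with the plan checked beforehand, should help show whether a slow run was in fact executed as a map join.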
