hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From ash...@gmail.com
Subject Re: (HIVE-1642) Convert join queries to map-join based on size of table/row
Date Mon, 27 Dec 2010 07:42:48 GMT
Thanks for the reply. I want to get clarification on this feature.

If one of the two joining tables table t1 is smaller than 25M and is
sharded, how does this feature work?

Suppose there are joins on multiple tables such as t1, t2 and t3. If t1 and
t2 are smaller than 25M  and co-located with joining key, you can stream t3
and map join them. What if t1 and t2 are not co-located with join keys? What
if t1 and t2 together are bigger than 25M?

Thanks in advance!
On Thu, Dec 23, 2010 at 7:16 PM, Liyin Tang <liyintang@gmail.com> wrote:

> Hi,
> How large is t1 and t2 ?
> if both of t1 and t2 is larger than 25M (a default threshold), the query
> processor will do the common join.
>
> Thanks
> Liyin
>
> On 23 December 2010 18:50, <ashu99@gmail.com> wrote:
>
> > Hi,
> >
> > I set hive.auto.convert.join=true and run the following query:
> >
> > select t1.foo, count(t2.bar) from invites t1 join invites t2 on
> > (t1.foo=t2.foo) group by t1.foo;
> >
> > I did not see it ran as map side join. Did I miss something? Is there any
> > precondition for this feature to work?
> >
> > Thanks.
> >
>
>
>
> --
> Best Regards
> -Liyin
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message