hive-user mailing list archives

From hadoop n00b <new2h...@gmail.com>
Subject Re: Mapjoin Usage Question
Date Thu, 20 Jan 2011 07:23:01 GMT
Thanks Leo,

Does the smaller table go into the mapjoin hint? Actually, when I ran a test
query with the bigger table in the hint, it performed better.

On Thu, Jan 20, 2011 at 12:40 PM, Leo Alekseyev <dnquark@gmail.com> wrote:

> You can only specify one table, and make sure to include its name,
> i.e. /*+ mapjoin(t2)*/. For more info see
> http://wiki.apache.org/hadoop/Hive/JoinOptimization and
> http://www.slideshare.net/aiolos127/join-optimization-in-hive.
>
> Also, you are using a relatively old version of Hive, but I'll let
> more experienced people on this list decide whether that's a problem
> :)
>
> On Thu, Jan 20, 2011 at 2:00 AM, hadoop n00b <new2hive@gmail.com> wrote:
> > Hi,
> >
> > How do I use the mapjoin hint in a query?
> >
> > Say, I have two tables t1 and t2 where t2 is the smaller table. Do I specify
> > t2 in the mapjoin hint?
> >
> > select /*+ mapjoin(b)*/ * from t1 a join t2 b on (a.id = b.id)
> >
> > If I am joining two smaller tables, can I specify two clauses in the
> > mapjoin? /*+mapjoin(b,c)*/?
> >
> > I am unable to find much documentation on this. I am using CDH2 with Hive
> > 0.4.1
> >
> > Thanks!
>
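
A minimal sketch of the hinted join discussed in this thread, assuming t1 is the
large table (aliased a) and t2 is the small table (aliased b). The column list is
illustrative only, and the hint names the alias rather than the table name on the
assumption that Hive resolves the hint against the alias when one is given.

```sql
-- Assumed setup: t1 is the large table, t2 is the small table.
-- The table named in the hint is the one loaded into memory on each mapper,
-- so the join can run map-side without a reduce phase.
SELECT /*+ MAPJOIN(b) */ a.id, b.id
FROM t1 a
JOIN t2 b ON (a.id = b.id);
```

Since the hinted table is the one held in memory, it is normally the smaller of
the two. A faster run with the bigger table in the hint is worth checking
against the JoinOptimization page linked above.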
