hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hive/LanguageManual/Joins" by NamitJain
Date Wed, 31 Mar 2010 21:37:45 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hive/LanguageManual/Joins" page has been changed by NamitJain.
http://wiki.apache.org/hadoop/Hive/LanguageManual/Joins?action=diff&rev1=18&rev2=19

--------------------------------------------------

    FROM a join b on a.key = b.key
  }}}
  can be done on the mapper only. Instead of fetching B completely for each mapper of A, only
the required buckets are fetched. For the query above, the mapper processing bucket 1 for
A will only fetch bucket 1 of B.
- It is not the default behavior, and is governed by the following parameter. '''set hive.optimize.bucketmapjoin
= true'''
+ It is not the default behavior, and is governed by the following parameter 
+ {{{
+   set hive.optimize.bucketmapjoin = true
+ }}}
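  A minimal sketch of how the parameter above might be used end to end. The tables a and b are hypothetical; their column types and the bucket count of 4 are assumptions made for illustration, and the sketch assumes the data was actually written into those buckets on the join column.
  {{{
    -- Hypothetical tables, both bucketed on the join column (assumed schema)
    CREATE TABLE a (key INT, value STRING) CLUSTERED BY (key) INTO 4 BUCKETS;
    CREATE TABLE b (key INT, value STRING) CLUSTERED BY (key) INTO 4 BUCKETS;

    -- Bucketed map join is off by default, so enable it for the session
    set hive.optimize.bucketmapjoin = true;

    -- With the hint, each mapper for a bucket of a should only need the matching bucket of b
    SELECT /*+ MAPJOIN(b) */ a.key, a.value
    FROM a JOIN b ON a.key = b.key;
  }}}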
   * If the tables being joined are sorted and bucketized, and the number of buckets is the same,
a sort-merge join can be performed. The corresponding buckets are joined with each other at
the mapper. If both A and B have 4 buckets,
  {{{
    SELECT /*+ MAPJOIN(b) */ a.key, a.value
