hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hive/LanguageManual/Joins" by NamitJain
Date Wed, 31 Mar 2010 21:14:14 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hive/LanguageManual/Joins" page has been changed by NamitJain.


     SELECT a.key, a.val
     FROM a LEFT SEMI JOIN b on (a.key = b.key)
+  * If the tables being joined are bucketized on the join columns, and the number of buckets
in one table is a multiple of the number of buckets in the other, the buckets can be joined
with each other. If table A has 8 buckets and table B has 4 buckets, the following join
+ {{{
+   SELECT /*+ MAPJOIN(b) */ a.key, a.value
+   FROM a join b on a.key = b.key
+ }}}
+ will be done on the mapper only. Instead of fetching B completely for each mapper of A,
only the required buckets are fetched. For the query above, the mapper processing bucket 1
for A will fetch only bucket 1 of B.
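
A minimal sketch of the setup the bullet above relies on, assuming both tables are bucketed on the join column and that the Hive build in use provides the hive.optimize.bucketmapjoin setting. The column types and table definitions are illustrative assumptions chosen to match the 8-bucket and 4-bucket example; they are not part of the wiki change itself.

{{{
  -- Assumed schemas: both tables are bucketed on the join column `key`,
  -- and 8 is a multiple of 4, so buckets of a line up with buckets of b.
  CREATE TABLE a (key INT, value STRING)
  CLUSTERED BY (key) INTO 8 BUCKETS;

  CREATE TABLE b (key INT, value STRING)
  CLUSTERED BY (key) INTO 4 BUCKETS;

  -- Assumed setting: request the bucketed map join where the build supports it.
  set hive.optimize.bucketmapjoin = true;

  -- Same query as in the wiki text; each mapper over a bucket of a
  -- loads only the matching bucket of b instead of the whole table.
  SELECT /*+ MAPJOIN(b) */ a.key, a.value
  FROM a JOIN b ON a.key = b.key;
}}}

The optimization only helps if the data files were actually written according to the declared bucketing; tables that were declared as bucketed but loaded without it would not give correct bucket-to-bucket matches.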
