hadoop-common-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hive/LanguageManual/Joins" by AdamKramer
Date Wed, 26 Aug 2009 21:41:07 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by AdamKramer:
http://wiki.apache.org/hadoop/Hive/LanguageManual/Joins

------------------------------------------------------------------------------
    ON (a.key=b.key AND b.ds='2009-07-07' AND a.ds='2009-07-07')
  }}}
    ...the result is that the output of the join is pre-filtered, and you won't get post-filtering
trouble for rows that have a valid a.key but no matching b.key. The same logic applies to
RIGHT and FULL joins.
+  * Joins are NOT commutative! Joins are left-associative regardless of whether they are
LEFT or RIGHT joins.
+ {{{
+   SELECT a.val1, a.val2, b.val, c.val
+   FROM a
+   JOIN b ON (a.key = b.key)
+   LEFT OUTER JOIN c ON (a.key = c.key)
+ }}}
+   ...first joins a on b, throwing away everything in a or b that does not have a corresponding
key in the other table. The reduced table is then joined on c. This provides unintuitive results
if there is a key that exists in both a and c, but not b: The whole row (including a.val1,
a.val2, and a.key) is dropped in the "a JOIN b" step, so when the result of that is joined
with c, any row with a c.key that had a corresponding a.key or b.key (but not both) will show
up as NULL, NULL, NULL, c.val.
  

Mime
View raw message