hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hive/LanguageManual/Select" by RaghothamMurthy
Date Thu, 22 Jan 2009 20:03:51 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by RaghothamMurthy:
http://wiki.apache.org/hadoop/Hive/LanguageManual/Select

------------------------------------------------------------------------------
  SELECT col1, COUNT(col2), sum(col3) FROM t1 GROUP BY col1
  }}}
  
-  * [wiki:Self:Hive/LanguageManual/ClusterBy Cluster By]
+  * Cluster By. This construct is used mainly with the [wiki:Self:Hive/LanguageManual/Transform MAP and REDUCE] clauses. But, it is sometimes useful in SELECT statements if there is a need to partition and sort the output of a query for subsequent queries.
  {{{
  SELECT col1, col2 FROM t1 CLUSTER BY col1
  }}}
  
-  * [wiki:Self:Hive/LanguageManual/ClusterBy Distribute By and Sort By]
+  * Distribute By and Sort By. These constructs are mainly used with the [wiki:Self:Hive/LanguageManual/Transform MAP and REDUCE] clauses. But, they can be used to distribute and sort the output of a query. Sort By also supports ASC and DESC for ascending and descending order of sorting, but defaults to ASC if nothing is specified.
  {{{
  SELECT col1, col2 FROM t1 DISTRIBUTE BY col1
  
- SELECT col1, col2 FROM t1 DISTRIBUTE BY col1 SORT BY col1, col2
+ SELECT col1, col2 FROM t1 DISTRIBUTE BY col1 SORT BY col1 ASC, col2 DESC
  }}}
  
   * Order By - Hive currently does not support ORDER BY. A similar effect can be gotten by using SORT BY and setting number of reducers to 1. The following query does ORDER BY col1. Note however that this query can take a long time if the size of t1 is large since there is only one reducer.
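
The diff context ends before the query that the Order By item refers to. A minimal sketch of what such a query could look like, assuming the reducer count is controlled through the Hadoop job property mapred.reduce.tasks (the property name and the column list are assumptions, not taken from the page):

{{{
-- Assumption: mapred.reduce.tasks sets the number of reducers for the job.
set mapred.reduce.tasks=1;
SELECT col1, col2 FROM t1 SORT BY col1
}}}

SORT BY orders rows only within each reducer's output, so the result is totally ordered on col1 only when a single reducer handles every row. That is also why the query slows down as t1 grows.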
