hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Trivial Update of "Hive/HiveQL/Transform" by ZhengShao
Date Wed, 21 Jan 2009 23:49:34 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by ZhengShao:
http://wiki.apache.org/hadoop/Hive/HiveQL/Transform

------------------------------------------------------------------------------
  
  Note that columns will be converted to strings and delimited by TAB before being fed to
the user script, and the standard output of the user script will be treated as TAB-separated
string columns. User scripts can write debug information to standard error, which will be
shown on the task detail page in Hadoop.
  
+ In the syntax, both ''MAP'' and ''REDUCE'' can also be written as ''SELECT TRANSFORM''.
 There is actually no difference between these three.
+ Hive runs the reduce script in the reduce task (instead of the map task) because of the
''clusterBy''/''distributeBy''/''sortBy'' clause in the inner query.
  
  {{{
  clusterBy: CLUSTER BY colName (, colName)*
@@ -26, +28 @@

    REDUCE expression (, expression)*
      USING 'my_reduce_script'
      ( AS colName (, colName)* )?
- 
  }}}
  
- Both ''MAP'' and ''REDUCE'' can be also written as ''SELECT TRANSFORM''.  There are actually
no difference between these three.
- Hive runs the reduce script in the reduce task because of the ''clusterBy''/''distributeBy''/''sortBy''
clause.
  
  ''clusterBy'' is a short-cut for both ''distributeBy'' and ''sortBy''.
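
As an illustration of the script contract described above, here is a minimal sketch of what a script passed via USING (such as 'my_reduce_script') might look like in Python. It is not taken from the wiki page: the choice of the first column as the clustering key and the row-counting logic are assumptions made for the example. It reads TAB-separated string columns from standard input, relies on rows with equal keys arriving next to each other (which is what a ''clusterBy'' clause in the inner query is meant to provide), writes TAB-separated rows to standard output, and sends a debug message to standard error.

```python
import sys
from itertools import groupby


def reduce_rows(lines):
    """Yield one TAB-separated output row (key, row count) per run of equal keys."""
    # Each input row is one line of TAB-separated string columns.
    rows = (line.rstrip("\n").split("\t") for line in lines)
    # groupby only merges adjacent rows, so the input is assumed to arrive
    # clustered (distributed and sorted) by the first column.
    for key, group in groupby(rows, key=lambda cols: cols[0]):
        count = sum(1 for _ in group)
        yield f"{key}\t{count}"


def main():
    for out in reduce_rows(sys.stdin):
        # Standard output is read back as TAB-separated string columns.
        print(out)
    # Standard error is free for debug output.
    print("my_reduce_script: finished", file=sys.stderr)


if __name__ == "__main__":
    main()
```

If the input were not clustered on the first column, the same key could show up in several separate runs and the script would emit more than one row for it, which is why the clause in the inner query matters.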
  
