hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hive/LanguageManual/UDF" by Ning Zhang
Date Fri, 13 Aug 2010 08:15:38 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hive/LanguageManual/UDF" page has been changed by Ning Zhang.
http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF?action=diff&rev1=48&rev2=49

--------------------------------------------------

  ||array<struct `{'x','y'}`>|| histogram_numeric(col, b) || Computes a histogram of
a numeric column in the group using b non-uniformly spaced bins. The output is an array of
size b of double-valued (x,y) coordinates that represent the bin centers and heights ||
  
  == Built-in Table-Generating Functions (UDTF) ==
- <<Anchor(UDTF)>> Normal user-defined functions, such as concat(), take in a
single input row and output a single output row. In contrast, table-generating functions transform
a single input row to multiple output rows. Currently, the only table-generating function
is explode(), which takes in an array as an input and outputs the elements of the array as
separate rows. UDTF's can be used in the SELECT expression list and as a part of LATERAL VIEW.
+ <<Anchor(UDTF)>> Normal user-defined functions, such as concat(), take in a
single input row and output a single output row. In contrast, table-generating functions transform
a single input row to multiple output rows. 
+ 
+ === explode ===
+ 
+ explode() takes an array as input and outputs the elements of the array as separate
rows. UDTFs can be used in the SELECT expression list and as part of a LATERAL VIEW.
  
  An example use of explode() in the SELECT expression list is as follows:
  
@@ -291, +295 @@

  ||<10%>Return Type''' ''' ||<10%>Name(Signature)''' ''' ||Description''' '''
||
  || myType ||explode(array<myType> a) <<Anchor(explode)>> ||For each element
in a, explode() generates a row containing that element ||
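
The row-generating behaviour described above can be modelled outside Hive. The following is a minimal Python sketch, not Hive's implementation: the functions `explode` and `lateral_view_explode`, the sample table `page_ads`, and its column names are all hypothetical stand-ins chosen for illustration.

```python
from typing import Any, Dict, Iterator, List


def explode(values: List[Any]) -> Iterator[Any]:
    """Hypothetical stand-in for the UDTF: emit one output row per array element."""
    for value in values:
        yield value


def lateral_view_explode(
    rows: List[Dict[str, Any]], array_col: str, alias: str
) -> Iterator[Dict[str, Any]]:
    """Pair each input row with every element of its array column.

    Models the shape of a LATERAL VIEW over explode(): the original columns
    are kept and the exploded element is added under `alias`.
    """
    for row in rows:
        for element in explode(row[array_col]):
            out = dict(row)       # copy the source row's columns
            out[alias] = element  # add the generated column
            yield out


# Illustrative input: two rows, each carrying an array column.
page_ads = [
    {"pageid": "front_page", "adid_list": [1, 2, 3]},
    {"pageid": "contact_page", "adid_list": [3, 4]},
]

result = list(lateral_view_explode(page_ads, "adid_list", "adid"))
for r in result:
    print(r["pageid"], r["adid"])
```

Two input rows with three and two array elements produce five output rows, one per element.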
  
+ === json_tuple ===
+ A new json_tuple() UDTF is introduced in Hive 0.7. It takes a JSON string and a set of names
(keys) and returns a tuple of values in one function call.
+ If you are using get_json_object() and want to replace it with json_tuple(), the only change
is that your query will use json_tuple() in a lateral view rather than multiple get_json_object()
calls in the select clause. 
+ 
+ For example, 
+ {{{
+ select a.timestamp, get_json_object(a.appevents, '$.eventid'), get_json_object(a.appevents,
'$.eventname') from log a;
+ }}}
+ should be changed to 
+ {{{
+ select a.timestamp, b.*
+ from log a lateral view json_tuple(a.appevents, 'eventid', 'eventname') b as f1, f2;
+ }}}
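
To make the equivalence of the two queries concrete, here is a rough Python model of both styles. It is a sketch under stated assumptions, not Hive code: `get_json_object` and `json_tuple` below are hypothetical Python stand-ins that handle only top-level keys, the sample `log` row is invented, and returning `None` for a missing key is an assumption of this sketch.

```python
import json
from typing import Any, Optional, Tuple


def get_json_object(json_text: str, path: str) -> Optional[Any]:
    """Stand-in for one lookup; this sketch supports only '$.key' paths."""
    if not path.startswith("$."):
        raise ValueError("sketch supports only top-level '$.key' paths")
    return json.loads(json_text).get(path[2:])


def json_tuple(json_text: str, *keys: str) -> Tuple[Optional[Any], ...]:
    """Stand-in for the UDTF: one call returns the values for all keys.

    Assumption: a key that is absent yields None.
    """
    parsed = json.loads(json_text)
    return tuple(parsed.get(key) for key in keys)


# Illustrative row shaped like the `log` table in the queries above.
log = [
    {"timestamp": 1281687338,
     "appevents": '{"eventid": "e7", "eventname": "click"}'},
]

# First query: one get_json_object() call per extracted field.
old_style = [
    (r["timestamp"],
     get_json_object(r["appevents"], "$.eventid"),
     get_json_object(r["appevents"], "$.eventname"))
    for r in log
]

# Second query: a single json_tuple() call supplies both fields (f1, f2).
new_style = [
    (r["timestamp"],) + json_tuple(r["appevents"], "eventid", "eventname")
    for r in log
]

print(old_style == new_style)
```

Both styles yield the same rows; the difference is that the second names all keys in a single call instead of repeating the lookup per field.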
+ 
