hadoop-common-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "Hive/LanguageManual/UDF" by PhiloVivero
Date Wed, 20 Apr 2011 17:42:59 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hive/LanguageManual/UDF" page has been changed by PhiloVivero.
The comment on this change is: Fixed reformulation, generalised it, specified a case whose
reformulation is unknown..
http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF?action=diff&rev1=63&rev2=64

--------------------------------------------------

  
  == GROUPing and SORTing on f(column) ==
  
- If you would like to GROUP BY or SORT BY a column on which you've applied a function, like
this:
+ A typical OLAP pattern is that you have a timestamp column and you want to group by daily
or other less granular date windows than by second. So you might want to select concat(year(dt),month(dt))
and then group on that concat(). But if you attempt to GROUP BY or SORT BY a column on which
you've applied a function, like this:
  
  {{{
  select f(col) as fc, count(*) from table_name group by fc
@@ -341, +341 @@

  Because you are not able to GROUP BY or SORT BY a column on which a function has been applied.
However, you can reformulate this query with subqueries:
  
  {{{
- select sq.fc,count(*) from (select f(col) as fc) sq group by sq.fc
+ select sq.fc,count(*) from (select f(col) as fc from table_name) sq group by sq.fc
  }}}
  
+ You will have to specify all the columns you want along with the f(col) in both the subquery
and the outside (which is obvious on retrospect). The general formula for the f(col) reformulation
is:
+ 
+ {{{
+ select sq.fc,col1,col2,...,colN,count(*) from
+   (select f(col) as fc,col1,col2,...,colN from table_name) sq
+  group by sq.fc,col1,col2,...,colN
+ }}}
+ 
+ Contact Tim Ellis (tellis) at RiotGames dot com if you would like to discuss this in further
detail.
+ 

Mime
View raw message