From: Apache Wiki
To: Apache Wiki
Reply-To: common-dev@hadoop.apache.org
Date: Wed, 20 Apr 2011 17:42:59 -0000
Message-ID: <20110420174259.37291.92166@eos.apache.org>
Subject: [Hadoop Wiki] Update of "Hive/LanguageManual/UDF" by PhiloVivero

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hive/LanguageManual/UDF" page has been changed by PhiloVivero.
The comment on this change is: Fixed reformulation, generalised it, specified a case whose reformulation is unknown.
http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF?action=diff&rev1=63&rev2=64

--------------------------------------------------

  == GROUPing and SORTing on f(column) ==
  
- If you would like to GROUP BY or SORT BY a column on which you've applied a function, like this:
+ A typical OLAP pattern is that you have a timestamp column and you want to group by daily or other less granular date windows than by second. So you might want to select concat(year(dt),month(dt)) and then group on that concat(). But if you attempt to GROUP BY or SORT BY a column on which you've applied a function, like this:
  
  {{{
  select f(col) as fc, count(*) from table_name group by fc
@@ -341, +341 @@

  Because you are not able to GROUP BY or SORT BY a column on which a function has been applied. However, you can reformulate this query with subqueries:
  
  {{{
- select sq.fc,count(*) from (select f(col) as fc) sq group by sq.fc
+ select sq.fc,count(*) from (select f(col) as fc from table_name) sq group by sq.fc
  }}}
  
+ You will have to specify all the columns you want along with the f(col) in both the subquery and the outside (which is obvious on retrospect). The general formula for the f(col) reformulation is:
+ 
+ {{{
+ select sq.fc,col1,col2,...,colN,count(*) from
+ (select f(col) as fc,col1,col2,...,colN from table_name) sq
+ group by sq.fc,col1,col2,...,colN
+ }}}
+ 
+ Contact Tim Ellis (tellis) at RiotGames dot com if you would like to discuss this in further detail.
+ 
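
The subquery reformulation in the diff above can be tried out with a small sketch. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in engine rather than Hive, the table name `events` and its columns are hypothetical, and SQLite's `strftime('%Y%m', dt)` plays the role of the page's `concat(year(dt),month(dt))`. SQLite would also accept grouping on the expression directly, so the sketch shows the shape of the rewritten query, not the Hive limitation itself.

```python
import sqlite3

# Hypothetical table and data; SQLite stands in for Hive in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (dt TEXT, user_id INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        ("2011-03-30 23:59:59", 1),
        ("2011-04-01 08:15:00", 1),
        ("2011-04-20 17:42:59", 2),
        ("2011-04-20 17:43:21", 2),
    ],
)

# The function is applied inside the subquery and given an alias (ym);
# the outer query then groups on the plain aliased column sq.ym.
query = """
    SELECT sq.ym, COUNT(*)
    FROM (SELECT strftime('%Y%m', dt) AS ym FROM events) sq
    GROUP BY sq.ym
    ORDER BY sq.ym
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('201103', 1), ('201104', 3)]
```

To carry extra columns through, as in the general formula, they would be listed in the inner select, the outer select, and the outer group by.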