Return-Path: X-Original-To: apmail-hive-user-archive@www.apache.org Delivered-To: apmail-hive-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 6246AF3CD for ; Tue, 2 Apr 2013 17:23:06 +0000 (UTC) Received: (qmail 80264 invoked by uid 500); 2 Apr 2013 17:23:05 -0000 Delivered-To: apmail-hive-user-archive@hive.apache.org Received: (qmail 79911 invoked by uid 500); 2 Apr 2013 17:23:04 -0000 Mailing-List: contact user-help@hive.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@hive.apache.org Delivered-To: mailing list user@hive.apache.org Received: (qmail 79898 invoked by uid 99); 2 Apr 2013 17:23:04 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 02 Apr 2013 17:23:04 +0000 X-ASF-Spam-Status: No, hits=1.7 required=5.0 tests=FREEMAIL_ENVFROM_END_DIGIT,HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of nitinpawar432@gmail.com designates 209.85.217.171 as permitted sender) Received: from [209.85.217.171] (HELO mail-lb0-f171.google.com) (209.85.217.171) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 02 Apr 2013 17:22:58 +0000 Received: by mail-lb0-f171.google.com with SMTP id v10so726792lbd.30 for ; Tue, 02 Apr 2013 10:22:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:content-type; bh=G3Gf2GpWLhVd3Tm98muKFsAShciSEBOh/s9MAHIR9Ns=; b=KsYx5+d2fu5stb6wWUTWYnagZx2B+UMbMoNEDl+R1AaS5GfGgizkyAsr5tT4rOvDXW o6N8iL22mDShFihevBoS7Rr3YJag+RPHoO5sCx7xVMbyP+k2Yt7e0onk9PBKUrtQe99n 4xiBZRQPonlsFr60kysSBVwVtRfNdiZm67tTj+oU1A5XQJnXjWkBOoCpYh8q0lpp0fQj ZWpGQp2e9qBrVaMITpyilOUm9m57dPCc1blqY88IQUMnRf4ecbJMq5afWV6ZS/w6p5XJ tVzVgygja1mpnW6qsVd8XVXHvyyBn/2jl4iR9/u9aPZSl9UZYBJDnlgXYZgFu6zX9DIM FU1A== MIME-Version: 1.0 X-Received: by 10.112.147.67 with SMTP id ti3mr5402234lbb.124.1364923358013; Tue, 02 Apr 2013 10:22:38 -0700 (PDT) Received: by 10.114.15.8 with HTTP; Tue, 2 Apr 2013 10:22:37 -0700 (PDT) In-Reply-To: <7E752D08-1141-47DD-A32B-ADB242AD6B44@keithwiley.com> References: <10A7E036-778F-410E-84B1-4759B5E21D7D@keithwiley.com> <844940D9-C255-41C7-BC72-86BCF8A48F9E@keithwiley.com> <7E752D08-1141-47DD-A32B-ADB242AD6B44@keithwiley.com> Date: Tue, 2 Apr 2013 22:52:37 +0530 Message-ID: Subject: Re: Need rank() From: Nitin Pawar To: user@hive.apache.org Content-Type: multipart/alternative; boundary=047d7b3441c4e670ac04d963fb2d X-Virus-Checked: Checked by ClamAV on apache.org --047d7b3441c4e670ac04d963fb2d Content-Type: text/plain; charset=ISO-8859-1 I am not sure about this but you defined a rank function and then aliased the column as rank itself. Can this cause the issue? >From the error it looks like it could not figure out which column is rank in the where clause "WHERE rank > 1" from Edward's blog the correct query looks similar but his function is named as p_rank SELECT category,country,product,sales,rank FROM ( SELECT category,country,product,sales, p_rank(category, country) rank FROM ( SELECT category,country,product, sales FROM p_rank_demo DISTRIBUTE BY category,country SORT BY category,country,sales desc) t1) t2 WHERE rank <= 3 On Tue, Apr 2, 2013 at 10:45 PM, Keith Wiley wrote: > I did all that, I just didn't paste it into the email. That isn't the > problem. Sorry for the confusion. > > hive> add jar Rank.jar; > hive> create temporary function rank as 'com.example.hive.udf.Rank'; > > BTW, small typo, the condition at the end uses less-then, not > greater-then...obviously...but that isn't the problem either. > > On Apr 2, 2013, at 10:06 , Nitin Pawar wrote: > > > I dont see you added rank in the classpath and defined rank function > > > > can you try add jar and define the function and try again > > > > > > > > On Tue, Apr 2, 2013 at 10:33 PM, Keith Wiley > wrote: > > On Apr 1, 2013, at 16:12 , Alexander Pivovarov wrote: > > > > > > http://ragrawal.wordpress.com/2011/11/18/extract-top-n-records-in-each-group-in-hadoophive/ > > > > Is there any possibility there is a bug in Ritesh Agrawal's query > statement from that article? I created a test table with the exact column > names from the example in the article and used a minimally altered version > of the command (I removed the where clause to simplify things a bit) and > got an error which suggests there is something slightly wrong with the > command (or perhaps the table has to be configured a special way). Here's > what I get when I almost perfectly duplicate that example: > > > > hive> describe test; > > OK > > user string > > category string > > value int > > Time taken: 0.082 seconds > > ================================================== > > hive> select * from test; > > OK > > user1 cat1 1 > > user1 cat1 2 > > user1 cat1 3 > > user1 cat2 10 > > user1 cat2 20 > > user1 cat2 30 > > user2 cat1 11 > > user2 cat1 21 > > user2 cat1 31 > > user2 cat2 5 > > user2 cat2 6 > > user2 cat2 7 > > Time taken: 0.202 seconds > > ================================================== > > hive> SELECT user, category, value > > > FROM ( > > > SELECT user, category, rank(user) as rank, value > > > FROM test > > > DISTRIBUTE BY user > > > SORT BY user, value desc > > > ) a > > > WHERE rank > 1 > > > ORDER BY user, rank; > > FAILED: Error in semantic analysis: Line 9:15 Invalid table alias or > column reference 'rank': (possible column names are: _col0, _col1, _col2) > > hive> > > > > > ________________________________________________________________________________ > > Keith Wiley kwiley@keithwiley.com keithwiley.com > music.keithwiley.com > > > > "What I primarily learned in grad school is how much I *don't* know. > > Consequently, I left grad school with a higher ignorance to knowledge > ratio than > > when I entered." > > -- Keith Wiley > > > ________________________________________________________________________________ > > > > > > > > > > -- > > Nitin Pawar > > > > ________________________________________________________________________________ > Keith Wiley kwiley@keithwiley.com keithwiley.com > music.keithwiley.com > > "It's a fine line between meticulous and obsessive-compulsive and a > slippery > rope between obsessive-compulsive and debilitatingly slow." > -- Keith Wiley > > ________________________________________________________________________________ > > -- Nitin Pawar --047d7b3441c4e670ac04d963fb2d Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable
I am not sure about this=A0
but you defined a rank fun= ction and then aliased the column as rank itself. Can this cause the issue?= =A0

From the error it looks like it could no= t figure out which column is rank in the where clause "WHERE rank > 1"=A0

from Edward's blog the correct query lo= oks similar =A0but his function is named as p_rank=A0

<= /div>
SELECT
= =A0category,country,product,sales,rank
= FROM (
= =A0SELECT
= =A0=A0 category,country,product,sales,
= =A0=A0 p_rank(category, country) rank
= =A0FROM (
= =A0=A0 SELECT
= =A0=A0=A0=A0 category,country,product,
= =A0=A0=A0=A0 sales
= =A0=A0 FROM p_rank_demo
= =A0=A0 DISTRIBUTE BY
= =A0=A0=A0=A0 category,country
= =A0=A0 SORT BY
= =A0=A0=A0=A0 category,country,sales desc) t1) t2
= WHERE rank <=3D 3

<= br>
On Tue, Apr 2, 2013 at 10:45 PM, Keith Wiley <kwiley@keithwiley.com> wrote:
I did all that, I just didn't paste it into the email. =A0That isn'= t the problem. =A0Sorry for the confusion.

hive> add jar Rank.jar;
hive> create temporary function rank as 'com.example.hive.udf.Rank&#= 39;;

BTW, small typo, the condition at the end uses less-then, not greater-then.= ..obviously...but that isn't the problem either.

On Apr 2, 2013, at 10:06 , Nitin Pawar wrote:

> I dont see you added rank in the classpath and defined rank function >
> can you try add jar and define the function and try again
>
>
>
> On Tue, Apr 2, 2013 at 10:33 PM, Keith Wiley <kwiley@keithwiley.com> wrote:
> On Apr 1, 2013, at 16:12 , Alexander Pivovarov wrote:
>
> > http://ragrawal.wo= rdpress.com/2011/11/18/extract-top-n-records-in-each-group-in-hadoophive/
>
> Is there any possibility there is a bug in Ritesh Agrawal's query = statement from that article? =A0I created a test table with the exact colum= n names from the example in the article and used a minimally altered versio= n of the command (I removed the where clause to simplify things a bit) and = got an error which suggests there is something slightly wrong with the comm= and (or perhaps the table has to be configured a special way). =A0Here'= s what I get when I almost perfectly duplicate that example:
>
> hive> describe test;
> OK
> user =A0 =A0string
> category =A0 =A0 =A0 =A0string
> value =A0 int
> Time taken: 0.082 seconds
> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D
> hive> select * from test;
> OK
> user1 =A0 cat1 =A0 =A01
> user1 =A0 cat1 =A0 =A02
> user1 =A0 cat1 =A0 =A03
> user1 =A0 cat2 =A0 =A010
> user1 =A0 cat2 =A0 =A020
> user1 =A0 cat2 =A0 =A030
> user2 =A0 cat1 =A0 =A011
> user2 =A0 cat1 =A0 =A021
> user2 =A0 cat1 =A0 =A031
> user2 =A0 cat2 =A0 =A05
> user2 =A0 cat2 =A0 =A06
> user2 =A0 cat2 =A0 =A07
> Time taken: 0.202 seconds
> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D
> hive> SELECT user, category, value
> =A0 =A0 > FROM (
> =A0 =A0 > SELECT user, category, rank(user) as rank, value
> =A0 =A0 > FROM test
> =A0 =A0 > DISTRIBUTE BY user
> =A0 =A0 > SORT BY user, value desc
> =A0 =A0 > ) a
> =A0 =A0 > WHERE rank > 1
> =A0 =A0 > ORDER BY user, rank;
> FAILED: Error in semantic analysis: Line 9:15 Invalid table alias or c= olumn reference 'rank': (possible column names are: _col0, _col1, _= col2)
> hive>
>
> ______________________________________________________________________= __________
> Keith Wiley =A0 =A0
kwiley@ke= ithwiley.com =A0 =A0 keithwiley.com =A0 =A0music.keithwiley.com
>
> "What I primarily learned in grad school is how much I *don't= * know.
> Consequently, I left grad school with a higher ignorance to knowledge = ratio than
> when I entered."
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 = =A0 =A0 =A0 =A0 =A0-- =A0Keith Wiley
> ______________________________________________________________________= __________
>
>
>
>
> --
> Nitin Pawar


___________________________________________________________________________= _____
Keith Wiley =A0 =A0 kwiley@keithwi= ley.com =A0 =A0 kei= thwiley.com =A0 =A0music.keithwiley.com

"It's a fine line between meticulous and obsessive-com= pulsive and a slippery
rope between obsessive-compulsive and debilitatingly slow."
=A0 =A0 =A0 =A0 =A0 =A0 =A0 = =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0-- =A0Keith Wile= y
___________________________________________________________________________= _____




-- Nitin Pawar
--047d7b3441c4e670ac04d963fb2d--