hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Keith Wiley <kwi...@keithwiley.com>
Subject Re: Need rank()
Date Tue, 02 Apr 2013 19:44:02 GMT
I suppose.  Like I said, I was just following the example on that website and it didn't seem
to have any comments on it remarking on errors in the article.  Thus my confusion about the
whole thing.  It is presented as being accurate and is old enough that any errors would have
been caught and labeled in the comments long ago.  I also thought it was a little odd to name
the function the same thing as the alias assigned to the result.  I'll try disambiguating
that.

On Apr 2, 2013, at 10:22 , Nitin Pawar wrote:

> I am not sure about this 
> but you defined a rank function and then aliased the column as rank itself. Can this
cause the issue? 
> 
> From the error it looks like it could not figure out which column is rank in the where
clause "WHERE rank > 1" 
> 
> from Edward's blog the correct query looks similar  but his function is named as p_rank

> 
> SELECT
>  category,country,product,sales,rank
> FROM (
>  SELECT
>    category,country,product,sales,
>    p_rank(category, country) rank
>  FROM (
>    SELECT
>      category,country,product,
>      sales
>    FROM p_rank_demo
>    DISTRIBUTE BY
>      category,country
>    SORT BY
>      category,country,sales desc) t1) t2
> WHERE rank <= 3
> 
> 
> On Tue, Apr 2, 2013 at 10:45 PM, Keith Wiley <kwiley@keithwiley.com> wrote:
> I did all that, I just didn't paste it into the email.  That isn't the problem.  Sorry
for the confusion.
> 
> hive> add jar Rank.jar;
> hive> create temporary function rank as 'com.example.hive.udf.Rank';
> 
> BTW, small typo, the condition at the end uses less-then, not greater-then...obviously...but
that isn't the problem either.
> 
> On Apr 2, 2013, at 10:06 , Nitin Pawar wrote:
> 
> > I dont see you added rank in the classpath and defined rank function
> >
> > can you try add jar and define the function and try again
> >
> >
> >
> > On Tue, Apr 2, 2013 at 10:33 PM, Keith Wiley <kwiley@keithwiley.com> wrote:
> > On Apr 1, 2013, at 16:12 , Alexander Pivovarov wrote:
> >
> > > http://ragrawal.wordpress.com/2011/11/18/extract-top-n-records-in-each-group-in-hadoophive/
> >
> > Is there any possibility there is a bug in Ritesh Agrawal's query statement from
that article?  I created a test table with the exact column names from the example in the
article and used a minimally altered version of the command (I removed the where clause to
simplify things a bit) and got an error which suggests there is something slightly wrong with
the command (or perhaps the table has to be configured a special way).  Here's what I get
when I almost perfectly duplicate that example:
> >
> > hive> describe test;
> > OK
> > user    string
> > category        string
> > value   int
> > Time taken: 0.082 seconds
> > ==================================================
> > hive> select * from test;
> > OK
> > user1   cat1    1
> > user1   cat1    2
> > user1   cat1    3
> > user1   cat2    10
> > user1   cat2    20
> > user1   cat2    30
> > user2   cat1    11
> > user2   cat1    21
> > user2   cat1    31
> > user2   cat2    5
> > user2   cat2    6
> > user2   cat2    7
> > Time taken: 0.202 seconds
> > ==================================================
> > hive> SELECT user, category, value
> >     > FROM (
> >     > SELECT user, category, rank(user) as rank, value
> >     > FROM test
> >     > DISTRIBUTE BY user
> >     > SORT BY user, value desc
> >     > ) a
> >     > WHERE rank > 1
> >     > ORDER BY user, rank;
> > FAILED: Error in semantic analysis: Line 9:15 Invalid table alias or column reference
'rank': (possible column names are: _col0, _col1, _col2)
> > hive>
> >
> > ________________________________________________________________________________
> > Keith Wiley     kwiley@keithwiley.com     keithwiley.com    music.keithwiley.com
> >
> > "What I primarily learned in grad school is how much I *don't* know.
> > Consequently, I left grad school with a higher ignorance to knowledge ratio than
> > when I entered."
> >                                            --  Keith Wiley
> > ________________________________________________________________________________
> >
> >
> >
> >
> > --
> > Nitin Pawar
> 
> 
> ________________________________________________________________________________
> Keith Wiley     kwiley@keithwiley.com     keithwiley.com    music.keithwiley.com
> 
> "It's a fine line between meticulous and obsessive-compulsive and a slippery
> rope between obsessive-compulsive and debilitatingly slow."
>                                            --  Keith Wiley
> ________________________________________________________________________________
> 
> 
> 
> 
> -- 
> Nitin Pawar


________________________________________________________________________________
Keith Wiley     kwiley@keithwiley.com     keithwiley.com    music.keithwiley.com

"You can scratch an itch, but you can't itch a scratch. Furthermore, an itch can
itch but a scratch can't scratch. Finally, a scratch can itch, but an itch can't
scratch. All together this implies: He scratched the itch from the scratch that
itched but would never itch the scratch from the itch that scratched."
                                           --  Keith Wiley
________________________________________________________________________________


Mime
View raw message