hive-user mailing list archives

From Saurabh S <saurab...@live.com>
Subject Hive equivalent of row_number()
Date Thu, 12 Apr 2012 20:43:58 GMT

I have a table with three columns, A, B, and Score, where A and B are items and Score
is some kind of affinity between A and B. There are N distinct items in each of A and B,
so the table has N^2 rows in total.

Is there a way to fetch "top 5 items in B" for each item in A? So, for each distinct item
in A, I want to look up 5 items in B which have the highest value in Score.

If this were DB2, I would probably use a windowing function such as row_number().
Is there an equivalent in Hive?
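
To make the row_number() idea concrete, here is a minimal sketch of the windowing approach. It is not Hive syntax: the query is run through Python's built-in sqlite3 module purely so the example is self-contained, and it assumes the bundled SQLite is version 3.25 or later (needed for window functions). The table name `affinity`, the lowercase column names, the helper `top_k_per_a`, and the tie-break on `b` are all placeholders chosen for illustration.

```python
import sqlite3

# Hypothetical schema: the post only names the columns A, B, and Score.
# ROW_NUMBER() restarts at 1 for each distinct value of a, counting rows
# from the highest score downward; ties are broken by b (an assumption).
TOP_K_QUERY = """
SELECT a, b, score
FROM (
    SELECT a, b, score,
           ROW_NUMBER() OVER (PARTITION BY a ORDER BY score DESC, b) AS rn
    FROM affinity
) AS ranked
WHERE rn <= :k
ORDER BY a, rn
"""


def top_k_per_a(rows, k=5):
    """Return the k highest-scoring (a, b, score) rows for each distinct a."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE affinity (a TEXT, b TEXT, score REAL)")
        conn.executemany("INSERT INTO affinity VALUES (?, ?, ?)", rows)
        return conn.execute(TOP_K_QUERY, {"k": k}).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    sample = [
        ("a1", "b1", 0.9), ("a1", "b2", 0.5), ("a1", "b3", 0.7),
        ("a2", "b1", 0.1), ("a2", "b2", 0.8), ("a2", "b3", 0.3),
    ]
    # Top 2 instead of top 5, because the sample has only 3 items per a.
    for row in top_k_per_a(sample, k=2):
        print(row)
```

The inner query numbers the rows within each A partition, and the outer filter keeps only the first k of them, which is the "top 5 items in B for each item in A" result when k is 5.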
 		 	   		  