hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "r7raul1984@163.com" <r7raul1...@163.com>
Subject Re: Re: How to query data by page in Hive?
Date Wed, 25 Feb 2015 00:48:20 GMT
Thank you ~



r7raul1984@163.com
 
From: Devopam Mittra
Date: 2015-02-24 20:15
To: user@hive.apache.org
Subject: Re: Re: How to query data by page in Hive?
Apologies for the extended delay in providing response. I totally forgot about this one and
got stuck in other things.

Please find below a sample I created based on the inputs received.

CREATE TABLE u_data( 
userid int,
movieid int,
rating int,
unixtime string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t' ;

file contents of 'temp_load'
1    1020    8    10223232
1    1045    7    1232131
1    1234    8    1231231
1    1244    4    1212113
2    1020    9    1231213
2    1244    1    123121121
3    1020    6    123211
5    1020    4    12312321
6    1027    4    1231231

LOAD DATA LOCAL INPATH '/home/exilant/Desktop/temp_load' OVERWRITE INTO TABLE u_data;

WITH movietable AS (SELECT a.movieid, 
                           a.rating,
                           ROW_NUMBER() OVER (ORDER BY movieid) AS rownum
                    FROM ( SELECT movieid,
                                  AVG(rating) AS rating
                           FROM u_data
                           GROUP BY movieid) A)
SELECT movieid,rating
FROM movietable 
WHERE rownum BETWEEN 3 AND 5;

So, now I can achieve pagination by suitably choosing the values for rownum that I can calculate
based on which page I am and how many records per page are there.

e.g. 12 records per page will mean 3rd page will start from 
12*3+1 and end at 12*4     ( BETWEEN pagesize*pagenum+1 AND pagesize*(pagenum+1))

Hope it helps

regards
Devopam


On Fri, Feb 6, 2015 at 6:18 AM, r7raul1984@163.com <r7raul1984@163.com> wrote:
Table structure :
 CREATE TABLE `u_data`( 
`userid` int, 
`movieid` int, 
`rating` int, 
`unixtime` string) 
ROW FORMAT DELIMITED 
FIELDS TERMINATED BY '\t' 
STORED AS INPUTFORMAT 
'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat' 
LOCATION 
'hdfs://localhost:8020/user/hive/warehouse/u_data' 
TBLPROPERTIES ( 
'COLUMN_STATS_ACCURATE'='true', 
'numFiles'='1', 
'numRows'='0', 
'rawDataSize'='0', 
'totalSize'='1979173', 
'transient_lastDdlTime'='1421076916')

columns :
   movieid
                           


r7raul1984@163.com
 
From: Devopam Mittra
Date: 2015-02-05 18:48
To: user@hive.apache.org
Subject: Re: Re: How to query data by page in Hive?
Please provide a valid table structure and the columns you wish to pick and I shall email
you the query directly 


regards
Devopam

On Thu, Feb 5, 2015 at 3:20 PM, r7raul1984@163.com <r7raul1984@163.com> wrote:
Thank you Devopam! Could you show me a  example? 



r7raul1984@163.com
 
From: Devopam Mittra
Date: 2015-02-05 18:05
To: user@hive.apache.org
Subject: Re: How to query data by page in Hive?
You may want to use a ROW_NUMBER OR RANK / DENSE RANK in the inner query and then select only
a subset of it in the outer query to control pagination. Based on your need, you may want
to order the records as well ..

Alternatively you may want to use CTE(https://cwiki.apache.org/confluence/display/Hive/Common+Table+Expression)
for selecting the data in one go and then use row number to select as in previous case.

regards
Devopam

On Thu, Feb 5, 2015 at 1:31 PM, r7raul1984@163.com <r7raul1984@163.com> wrote:
Hello,
         How to query data by page in Hive?

hive> select * from u_data a limit 1,2; 
FAILED: ParseException line 1:31 missing EOF at ',' near '1' 



r7raul1984@163.com



-- 
Devopam Mittra
Life and Relations are not binary



-- 
Devopam Mittra
Life and Relations are not binary



-- 
Devopam Mittra
Life and Relations are not binary
Mime
View raw message