hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Lu, Wei" <...@microstrategy.com>
Subject RE: how to split query result into several smaller tables without creating temp table??
Date Wed, 07 Mar 2012 02:13:38 GMT
Sorry, query 1 should be:
create table tmp__imp as select requestbegintime, count(*) from impressions2 where requestbegintime<'1239572996000'
group by requestbegintime;

-----Original Message-----
From: Lu, Wei [mailto:wlu@microstrategy.com] 
Sent: Wednesday, March 07, 2012 9:08 AM
To: user@hive.apache.org
Subject: RE: how to split query result into several smaller tables without creating temp table??

Hi, Mark

Query 1 is:
1) create table tmp__imp as select requestbegintime, count(*) from impressions2 where requestbegintime<'1239572996000';


from tmp__imp 
insert OVERWRITE LOCAL DIRECTORY '/disk2/is1' select * where requestbegintime<'1239572956000'

insert OVERWRITE LOCAL DIRECTORY '/disk2/is2' select * where requestbegintime>='1239572956000';

Do you have any idea about the failure of query 2 and 3? Is the failure caused by some issues
of Hive?


Regards,
Wei



-----Original Message-----
From: Mark Grover [mailto:mgrover@oanda.com] 
Sent: Wednesday, March 07, 2012 1:11 AM
To: user@hive.apache.org
Subject: Re: how to split query result into several smaller tables without creating temp table??

Hi Wei,
In query 1, it's invalid to requestbegintime in the select list if it's not in the group by
clause. There doesn't seem to be a group by clause there. Is that the right query?

Mark

Mark Grover, Business Intelligence Analyst
OANDA Corporation 

www: oanda.com www: fxtrade.com 

"Best Trading Platform" - World Finance's Forex Awards 2009. 
"The One to Watch" - Treasury Today's Adam Smith Awards 2009. 


----- Original Message -----
From: "Wei Lu" <wlu@microstrategy.com>
To: user@hive.apache.org
Sent: Tuesday, March 6, 2012 4:40:00 AM
Subject: how to split query result into several smaller tables without creating temp table??




Hi, 



I tried to do aggregation based on Table impressions2, and then need to save the results to
two different local files (or tables). 

I tried three methods, only the first one succeeded: 



1) create a new table and store aggregation results to it, and then use multi-insert to split
the results to local disk: 

create table tmp__imp as select requestbegintime, count(*) from impressions2 where 



requestbegintime<'1239572996000'; 

from tmp__imp 

insert OVERWRITE LOCAL DIRECTORY '/disk2/is1' select * where requestbegintime<'1239572956000'


insert OVERWRITE LOCAL DIRECTORY '/disk2/is2' select * where requestbegintime>='1239572956000';




The method works, and #rows of each local file: 

40 Rows loaded to /disk2/is2 

31057 Rows loaded to /disk2/is1 



2) use one statement without create new table: 

from (select requestbegintime, count(*) as ct from impressions2 where requestbegintime<'1239572996000'
GROUP BY requestbegintime)mt 

insert OVERWRITE LOCAL DIRECTORY '/disk2/is1' select mt.requestbegintime,mt.ct where mt.requestbegintime<'1239572956000'


insert OVERWRITE LOCAL DIRECTORY '/disk2/is2' select mt.requestbegintime,mt.ct where mt.requestbegintime>='1239572956000';




But all aggregated results are copies to both files: 

31097 Rows loaded to /disk2/is2 

31097 Rows loaded to /disk2/is1 



3) create a view and store aggregation results to it, and then use multi-insert to split the
results to local disk: 

create view view_imp(rt, ct) as select requestbegintime, count(*) as ct from impressions2
where requestbegintime<'1239572996000' GROUP BY requestbegintime; 

from view_imp 

insert OVERWRITE LOCAL DIRECTORY '/disk2/is1' select * where rt<'1239572956000' 

insert OVERWRITE LOCAL DIRECTORY '/disk2/is2' select * where rt>='1239572956000'; 



But, again, all aggregated results are copies to both files: 

31097 Rows loaded to /disk2/is2 

31097 Rows loaded to /disk2/is1 



Why does 2) and 3) fail? How can I split the results without creating new table? 





Regards, 

Wei 
Mime
View raw message