hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mark Grover <mgro...@oanda.com>
Subject Re: how to avoid scan the same table multi times?
Date Fri, 13 Jan 2012 15:41:53 GMT
Xiaobin,
I would assume that's what Martin meant. You can always add dashes to make it more readable:-)

Mark Grover, Business Intelligence Analyst
OANDA Corporation 

www: oanda.com www: fxtrade.com 
e: mgrover@oanda.com 

"Best Trading Platform" - World Finance's Forex Awards 2009. 
"The One to Watch" - Treasury Today's Adam Smith Awards 2009. 


----- Original Message -----
From: "Xiaobin She" <xiaobinshe@gmail.com>
To: user@hive.apache.org, "佘晓彬" <xiaobinshe@gmail.com>
Sent: Thursday, January 12, 2012 11:01:19 PM
Subject: Re: how to avoid scan the same table multi times?

to Martin, Mark and Edward, 

thank you for your advices, I will try it out. 

And to Martin, by "appropriate data format", do you mean something like "2012011202" ? 

thanks! 

xiaobin 


在 2012年1月12日 下午10:20,Martin Kuhn < martin.kuhn@affinitas.de > 写道:



Hi there, 


> Select count(*), count(distinct u), type from t group by type where plat=1 and dt=”2012-1-12-02”


> Select count(*), count(distinct u), type from t where (type =2 or type =6) and dt=”2012-1-12-02”
group by type; 


> Is there a better way to do these queries? 

You could try something like this: 

SELECT 
type 
, count(*) 
, count(DISTINCT u) 
, count(CASE WHEN plat=1 THEN u ELSE NULL) 
, count(DISTINCT CASE WHEN plat=1 THEN u ELSE NULL) 
, count(CASE WHEN (type=2 OR type=6) THEN u ELSE NULL) 
, count(DISTINCT CASE WHEN (type=2 OR type=6) THEN u ELSE NULL) 
FROM 
t 
WHERE 
dt in ("2012-1-12-02", "2012-1-12-03") 
GROUP BY 
type 
ORDER BY 
type 
; 

Good luck :) 
Martin Kuhn 


P.S. You'ge got a strange date format there. For sorting purposes it would be more appropriate
to use something like " 2012-01-12-02 ". 



Mime
View raw message