hive-user mailing list archives

From shan s <mysub...@gmail.com>
Subject Multi-GroupBy-Insert optimization
Date Fri, 01 Jun 2012 11:55:16 GMT
I am using Multi-GroupBy-Insert. I was expecting a single map-reduce job
that would combine the group-bys.
However, Hive is scheduling n jobs, where n is the number of group-bys.
Could you please explain this behaviour?

FROM X
INSERT OVERWRITE LOCAL DIRECTORY 'output/y1'
SELECT a, b, c, count(*)
GROUP BY a, b, c
INSERT OVERWRITE LOCAL DIRECTORY 'output/y2'
SELECT a, count(*)
GROUP BY a
INSERT OVERWRITE LOCAL DIRECTORY 'output/y3'
SELECT b, count(*)
GROUP BY b
…..
…..
……
