hive-user mailing list archives

From shan s <mysub...@gmail.com>
Subject Re: Multi-GroupBy-Insert optimization
Date Mon, 04 Jun 2012 12:31:09 GMT
Anyone?
Thanks..

On Fri, Jun 1, 2012 at 5:25 PM, shan s <mysub987@gmail.com> wrote:

> I am using Multi-GroupBy-Insert. I was expecting a single map-reduce job
> that would combine the group-bys.
> However, it schedules n jobs, where n is the number of group-bys.
> Could you please explain this behaviour?
>
> FROM X
> INSERT OVERWRITE LOCAL DIRECTORY 'output/y1'
> SELECT a, b, c, count(*)
> GROUP BY a, b, c
> INSERT OVERWRITE LOCAL DIRECTORY 'output/y2'
> SELECT a, count(*)
> GROUP BY a
> INSERT OVERWRITE LOCAL DIRECTORY 'output/y3'
> SELECT b, count(*)
> GROUP BY b
> …..
> …..
> ……
>
