hive-dev mailing list archives

From "Navis (JIRA)" <>
Subject [jira] [Assigned] (HIVE-1772) optimize join followed by a groupby
Date Sun, 06 May 2012 23:54:48 GMT


Navis reassigned HIVE-1772:

    Assignee:     (was: Navis)

@Radhika Malik
I expected YSMART (HIVE-2206) to be merged shortly, so I abandoned this. But you can continue
if you want.
> optimize join followed by a groupby
> -----------------------------------
>                 Key: HIVE-1772
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Namit Jain
>         Attachments: HIVE-1772.1.patch
> explain SELECT x.key, count(1) FROM src1 x JOIN src y ON (x.key = y.key) group by x.key;
>   Stage-1 is a root stage
>   Stage-2 depends on stages: Stage-1
>   Stage-0 is a root stage
> The above query issues 2 map-reduce jobs. 
> The first MR job performs the join, whereas the second MR performs the group by.
> Since the data is already sorted, the group by can be performed in the reducer of the join itself.
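
The idea in the description can be illustrated with a small simulation. This is only a sketch of the general technique, not Hive's implementation or the attached patch: the function name join_then_count and the 0/1 source tags are made up for illustration. It assumes the join's shuffle delivers records sorted by the join key, so a single reduce pass can count the joined rows per key without a second job.

```python
from itertools import groupby
from operator import itemgetter


def join_then_count(left_keys, right_keys):
    """Simulate one reduce phase that joins on key and emits count(1) per key.

    Hypothetical helper for illustration only; not part of Hive.
    """
    # Shuffle step: tag each record with its source table (0 = x, 1 = y)
    # and sort by key, as the join's shuffle would.
    tagged = sorted([(k, 0) for k in left_keys] + [(k, 1) for k in right_keys])

    result = {}
    # Reduce step: all records for one key arrive together because of the sort.
    for key, group in groupby(tagged, key=itemgetter(0)):
        n_left = n_right = 0
        for _, tag in group:
            if tag == 0:
                n_left += 1
            else:
                n_right += 1
        # An inner join yields n_left * n_right rows for this key, so the
        # group-by count is available without a second map-reduce job.
        if n_left and n_right:
            result[key] = n_left * n_right
    return result


counts = join_then_count(["a", "a", "b", "c"], ["a", "b", "b", "d"])
print(counts)
```

Keys present on only one side ("c" and "d" above) produce no join rows and so no group in the output, matching inner-join semantics.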


