hive-dev mailing list archives

From "Yin Huai (JIRA)"
Subject [jira] [Commented] (HIVE-2340) optimize orderby followed by a groupby
Date Thu, 17 Jan 2013 21:34:13 GMT


Yin Huai commented on HIVE-2340:

The current implementation of the YSmart patch covers scenarios in which a join or aggregation
operator shares the same partition keys with all of its parents (join or aggregation operators).

For example, a single MR job will be generated if all operators in the following plan share
the same partition keys.
       /            \              
GBY----              \
GBY--- -------------/

It also requires that the bottom join or aggregation operators that will be processed in
the same MR job take input tables rather than intermediate tables. In the future, it should be
extended to cover scenarios that involve intermediate tables, scenarios in which correlated
operators share common (but not exactly the same) partition keys, and scenarios in which a join
or aggregation operator shares common keys with only some of its parents.
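
As a rough illustration of the two conditions above, here is a minimal Python sketch. It is
not the patch's code: the Op class, its fields, and fits_single_job are hypothetical names
made up for this example, and the plan is modeled only as join/aggregation operators with
their partition keys.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Op:
    """Hypothetical plan node: a join ("JOIN") or aggregation ("GBY") operator."""
    kind: str
    partition_keys: Tuple[str, ...]
    # Upstream join/aggregation operators that feed this operator.
    parents: List["Op"] = field(default_factory=list)
    # Only meaningful for bottom operators (those without parents).
    reads_input_tables: bool = True


def fits_single_job(op: Op) -> bool:
    """Return True if op and everything upstream of it meet both conditions."""
    if not op.parents:
        # Bottom operator: must read input tables, not intermediate tables.
        return op.reads_input_tables
    for parent in op.parents:
        # Keys must be exactly the same (same columns, same order here),
        # not merely overlapping.
        if parent.partition_keys != op.partition_keys:
            return False
        if not fits_single_job(parent):
            return False
    return True


# Two aggregations feeding a join, all partitioned on the same key.
same_keys = Op("JOIN", ("k",), parents=[Op("GBY", ("k",)), Op("GBY", ("k",))])

# One aggregation uses a superset of the join key: common keys, but not identical.
common_keys = Op("JOIN", ("k",), parents=[Op("GBY", ("k", "v")), Op("GBY", ("k",))])

# A bottom aggregation that reads an intermediate table.
intermediate = Op(
    "JOIN",
    ("k",),
    parents=[Op("GBY", ("k",), reads_input_tables=False), Op("GBY", ("k",))],
)
```

In this sketch only same_keys passes the check; common_keys and intermediate correspond to
the cases listed as future extensions.
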
> optimize orderby followed by a groupby
> --------------------------------------
>                 Key: HIVE-2340
>                 URL:
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Query Processor
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>              Labels: perfomance
>         Attachments: ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.1.patch, ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.2.patch,
ASF.LICENSE.NOT.GRANTED--HIVE-2340.D1209.5.patch, HIVE-2340.1.patch.txt
> Before implementing an optimizer for JOIN-GBY, try to implement an RS-GBY optimizer (cluster-by
> following group-by).
