hive-dev mailing list archives

From "Zheng Shao (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-267) Multi-GroupBy inserts with the same distinct expression should share the first map-reduce job
Date Mon, 02 Feb 2009 08:25:59 GMT

    [ https://issues.apache.org/jira/browse/HIVE-267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12669525#action_12669525 ]

Zheng Shao commented on HIVE-267:
---------------------------------

Some preliminary thinking:

When there is a single shared global distinct expression:
1. When generating the first reducer for the GroupBy plans, put the distinct expression into
the key.
2. Let the query optimizer merge the reduceSinkOperators that have the same source tables and
the same key and value expressions. Note that there might be filter operators, so the merged
operator needs to keep the union of all rows that pass any of the filters.
3. The reducer outputs need to be separated into several different sets, each for one GroupBy.
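
A minimal Python sketch of the three steps above, as one possible reading of the idea and
not Hive code. The rows, the two GroupBys, and every name here (ROWS, GROUP_BYS,
shared_first_job, second_job) are hypothetical. The first job keys on the shared distinct
expression and keeps the union of the filters; the reducer then emits one output set per
GroupBy, and a later per-GroupBy step sums the records.

```python
from collections import defaultdict

# Hypothetical input rows: (country, device, user_id).
ROWS = [
    ("us", "web", "u1"),
    ("us", "app", "u1"),
    ("us", "web", "u2"),
    ("de", "web", "u2"),
    ("de", "app", "u3"),
]

# Two hypothetical GroupBys that share the distinct expression user_id.
GROUP_BYS = [
    # COUNT(DISTINCT user_id) per country, web rows only.
    {"filter": lambda r: r[1] == "web", "group_key": lambda r: r[0]},
    # COUNT(DISTINCT user_id) per device, no filter.
    {"filter": lambda r: True, "group_key": lambda r: r[1]},
]


def distinct_expr(row):
    """The shared distinct expression (assumed to be user_id here)."""
    return row[2]


def shared_first_job(rows, group_bys):
    """Shared job: shuffle on the distinct expression, emit one set per GroupBy."""
    # Map side: keep the union of rows passing any filter, keyed on the
    # distinct expression so equal values meet in the same reducer call.
    shuffled = defaultdict(list)
    for row in rows:
        if any(g["filter"](row) for g in group_bys):
            shuffled[distinct_expr(row)].append(row)

    # Reduce side: for each distinct value, emit (group_key, 1) once per
    # group key seen, into the output set that belongs to that GroupBy.
    outputs = [[] for _ in group_bys]
    for rows_for_value in shuffled.values():
        for i, g in enumerate(group_bys):
            seen = {g["group_key"](r) for r in rows_for_value if g["filter"](r)}
            outputs[i].extend((key, 1) for key in seen)
    return outputs


def second_job(records):
    """Per-GroupBy follow-up step: sum the partial counts per group key."""
    totals = defaultdict(int)
    for key, count in records:
        totals[key] += count
    return dict(totals)


results = [second_job(out) for out in shared_first_job(ROWS, GROUP_BYS)]
```

Each filter is re-applied on the reduce side because the map side only guarantees that a
row passed at least one of them.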

When there is no distinct expression:
1. Run a map-only job and do map-side aggregation for all group-bys.
2. Split the results into different sets.
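
A rough Python sketch of the no-distinct case, again with hypothetical data and names
(ROWS, AGGREGATIONS, map_side_aggregate). It models a single mapper that keeps one hash
table per GroupBy and sums a value column; how partial results from several mappers would
be combined is not covered in this comment and is left out.

```python
from collections import defaultdict

# Hypothetical input rows: (country, device, amount).
ROWS = [
    ("us", "web", 3),
    ("us", "app", 5),
    ("de", "web", 2),
]

# Hypothetical GroupBys: a name for the output set and a grouping-key function.
AGGREGATIONS = [
    ("by_country", lambda r: r[0]),
    ("by_device", lambda r: r[1]),
]


def map_side_aggregate(rows, aggregations):
    """One pass over the rows, updating one hash table per GroupBy."""
    tables = {name: defaultdict(int) for name, _ in aggregations}
    for row in rows:
        for name, key_of in aggregations:
            tables[name][key_of(row)] += row[2]  # SUM(amount), as an example
    # The per-name dict is the "split": each entry is one result set.
    return {name: dict(table) for name, table in tables.items()}


result_sets = map_side_aggregate(ROWS, AGGREGATIONS)
```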


Both cases require an infrastructure change: the ability to split the output of
mappers/reducers into several sets.
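
One way to picture that splitting, as an illustrative assumption only: each output record
carries a small integer tag naming its destination set, and a router appends it to the
matching sink. The function name split_by_tag and the tagging scheme are hypothetical and
are not taken from Hive.

```python
def split_by_tag(tagged_records, num_outputs):
    """Route (tag, record) pairs into num_outputs separate output sets."""
    outputs = [[] for _ in range(num_outputs)]
    for tag, record in tagged_records:
        outputs[tag].append(record)  # tag is the index of the destination set
    return outputs


sets = split_by_tag([(0, "a"), (1, "b"), (0, "c")], 2)
```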


> Multi-GroupBy inserts with the same distinct expression should share the first map-reduce job
> ---------------------------------------------------------------------------------------------
>
>                 Key: HIVE-267
>                 URL: https://issues.apache.org/jira/browse/HIVE-267
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Query Processor
>    Affects Versions: 0.2.0
>            Reporter: Zheng Shao
>
> Currently, multi-GroupBy inserts are executed so that each GroupBy runs separately.
> We should be able to optimize the plan.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

