hive-dev mailing list archives

From "He Yongqiang (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-2566) reduce the number map-reduce jobs for union all
Date Mon, 14 Nov 2011 22:31:52 GMT

    [ https://issues.apache.org/jira/browse/HIVE-2566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13150013#comment-13150013 ]

He Yongqiang commented on HIVE-2566:
------------------------------------

looks good, running tests.
                
> reduce the number map-reduce jobs for union all
> -----------------------------------------------
>
>                 Key: HIVE-2566
>                 URL: https://issues.apache.org/jira/browse/HIVE-2566
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Namit Jain
>            Assignee: Namit Jain
>         Attachments: HIVE-2566.D405.1.patch, HIVE-2566.D405.2.patch
>
>
> A query like:
> select s.key, s.value from (
>   select key, value from src2  where key < 10
>   union all 
>   select key, value from src3  where key < 10
>   union all 
>   select key, value from src4  where key < 10
>   union all 
>   select key, count(1) as value from src5 group by key
> )s;
> should run the last sub-query 
> 'select key, count(1) as value from src5 group by key'
> as a map-reduce job.
> The union should then be a map-only job that reads from the first 3 map-only subqueries
> and from the output of the last map-reduce job.
> The current plan is very inefficient.
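
The job layout requested in the description above can be sketched as a small model. This is only an illustration of the proposal, not Hive's planner code: the names Branch, needs_reduce and proposed_plan are hypothetical, and the sketch assumes that a branch needs a reduce phase exactly when it aggregates (here, the GROUP BY branch).

```python
from dataclasses import dataclass


@dataclass
class Branch:
    """One sub-query of the UNION ALL (hypothetical model, not a Hive class)."""
    sql: str
    needs_reduce: bool  # assumption: True when the branch aggregates (GROUP BY)


def proposed_plan(branches):
    """Return the jobs for the layout proposed in the issue description."""
    jobs = []
    # Each aggregating branch runs as its own map-reduce job.
    for branch in branches:
        if branch.needs_reduce:
            jobs.append(("map-reduce", [branch.sql]))
    # A single map-only job then performs the union: it evaluates the
    # map-only branches directly and reads the output of the jobs above.
    jobs.append(("map-only union", [branch.sql for branch in branches]))
    return jobs


branches = [
    Branch("select key, value from src2 where key < 10", needs_reduce=False),
    Branch("select key, value from src3 where key < 10", needs_reduce=False),
    Branch("select key, value from src4 where key < 10", needs_reduce=False),
    Branch("select key, count(1) as value from src5 group by key", needs_reduce=True),
]

plan = proposed_plan(branches)
for kind, inputs in plan:
    print(kind, len(inputs))
```

Under these assumptions the example query yields two jobs: one map-reduce job for the src5 aggregation and one map-only job for the union. The sketch says nothing about how many jobs the current plan produces.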

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
