hive-dev mailing list archives

From "Sreekanth Ramakrishnan (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-1695) MapJoin followed by ReduceSink should be done as single MapReduce Job
Date Thu, 02 Dec 2010 09:58:11 GMT

    [ https://issues.apache.org/jira/browse/HIVE-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12966044#action_12966044 ]

Sreekanth Ramakrishnan commented on HIVE-1695:
----------------------------------------------

My initial thought was to change the processing of a MapJoin followed by a select operator.
But a comment in the MapJoinFactory class states that, while walking down the node
stack, we only process the current element or its parent and leave the processing of children
to the child processor code.
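
To make that convention concrete, here is a minimal, self-contained sketch of stack-based rule
dispatch. It is not Hive's actual code: the class RuleDispatchSketch, the methods matches,
dispatch and walk, the "TS" rule, and the matching logic are all hypothetical and exist only
for illustration. Only the pattern name MapJoin%RS is taken from this discussion. Each
processor inspects just the current node and its parent on the stack, and children are
handled later, when the walker reaches them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch; none of these names are Hive's actual classes.
class RuleDispatchSketch {

    // Rule patterns are operator names joined by '%', matched against the
    // tail of the current node stack (e.g. "MapJoin%RS").
    static final Map<String, Function<List<String>, String>> RULES = new LinkedHashMap<>();
    static {
        RULES.put("TS", stack -> "TS: start a new map task");
        RULES.put("MapJoin%RS", stack -> {
            // The processor looks only at the current node and its parent.
            String current = stack.get(stack.size() - 1);
            String parent = stack.get(stack.size() - 2);
            return current + ": keep in the map task of parent " + parent;
        });
    }

    // True if the stack path equals the pattern or ends with "%pattern".
    static boolean matches(String path, String pattern) {
        return path.equals(pattern) || path.endsWith("%" + pattern);
    }

    // Applies the longest matching rule to the current stack, or a default.
    static String dispatch(List<String> stack) {
        String path = String.join("%", stack);
        String best = null;
        for (String pattern : RULES.keySet()) {
            if (matches(path, pattern) && (best == null || pattern.length() > best.length())) {
                best = pattern;
            }
        }
        if (best == null) {
            return stack.get(stack.size() - 1) + ": default processing";
        }
        return RULES.get(best).apply(stack);
    }

    // Walks a linear operator chain, dispatching once per node as it is pushed.
    static List<String> walk(List<String> chain) {
        List<String> stack = new ArrayList<>();
        List<String> actions = new ArrayList<>();
        for (String op : chain) {
            stack.add(op);
            actions.add(dispatch(stack));
        }
        return actions;
    }

    public static void main(String[] args) {
        for (String action : walk(Arrays.asList("TS", "MapJoin", "RS"))) {
            System.out.println(action);
        }
    }
}
```

In this sketch the decision about an RS that follows a MapJoin is made in the rule that fires
at the RS node, looking back at its parent, rather than in a rule at the MapJoin looking
forward at its children.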

Is it OK to remove MapJoin%Sel and change the processing in MapJoin%RS to make this clearer?

I have handled the group-by case and will attach what the plan looks like in the next comment.

> MapJoin followed by ReduceSink should be done as single MapReduce Job
> ---------------------------------------------------------------------
>
>                 Key: HIVE-1695
>                 URL: https://issues.apache.org/jira/browse/HIVE-1695
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Amareshwari Sriramadasu
>
> Currently MapJoin followed by ReduceSink runs as two MapReduce jobs: one map-only job
> followed by a Map-Reduce job. It can be combined into a single MapReduce job.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

