hadoop-hive-dev mailing list archives

From "He Yongqiang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-1194) sorted merge join
Date Tue, 02 Mar 2010 00:20:05 GMT

    [ https://issues.apache.org/jira/browse/HIVE-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839932#action_12839932 ]

He Yongqiang commented on HIVE-1194:
------------------------------------

Yes, we can do that. There are two problems to resolve:
(1) Serializing and deserializing the mapping. We generate the mapping at compile time, and the
operator instance there is different from the one at runtime.
(2) The fetchOperators need to be accessible in SMBMapJoinOperator, so they have to be passed
from the exec-mapper to SMBMapJoinOperator.

I just made a small change:
I added a new method, initializeLocalWork(), to Operator. In the exec-mapper, the map operator's
initializeLocalWork() is called, which in turn triggers initializeLocalWork() on all of its children.
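
To illustrate the cascade described above, here is a minimal sketch of how a call on the root operator could propagate down the operator tree. SketchOperator and its fields are hypothetical stand-ins, not Hive's actual Operator class, and the real initializeLocalWork() in the patch may take different arguments.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified operator node; NOT Hive's real Operator class.
class SketchOperator {
    private final String name;
    private final List<SketchOperator> children = new ArrayList<>();
    private boolean localWorkInitialized = false;

    SketchOperator(String name) {
        this.name = name;
    }

    // Attach a child operator and return this node so calls can be chained.
    SketchOperator addChild(SketchOperator child) {
        children.add(child);
        return this;
    }

    // Initialize this operator's local work, then cascade to every child.
    // The 'visited' list is an assumption of this sketch, used only to
    // record the order in which operators were reached.
    void initializeLocalWork(List<String> visited) {
        localWorkInitialized = true;
        visited.add(name);
        for (SketchOperator child : children) {
            child.initializeLocalWork(visited);
        }
    }

    boolean isLocalWorkInitialized() {
        return localWorkInitialized;
    }
}
```

With this shape, the mapper only needs to call initializeLocalWork() once on the root; an operator deeper in the tree, such as the join operator, gets its turn without the mapper holding a direct reference to it.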

> sorted merge join
> -----------------
>
>                 Key: HIVE-1194
>                 URL: https://issues.apache.org/jira/browse/HIVE-1194
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: He Yongqiang
>             Fix For: 0.6.0
>
>         Attachments: hive-1194-2010-02-28.patch
>
>
> If the input tables are sorted on the join key, and a mapjoin is being performed, it
> is useful to exploit the sorted properties of the table.
> This can lead to substantial cpu savings - this needs to work across bucketed map joins
> also.
> Since, sorted properties of a table are not enforced currently, a new parameter can be
> added to specify to use the sort-merge join.
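
For readers unfamiliar with the technique named in the issue, the following is a generic sketch of a merge join over two inputs that are already sorted on the join key. It is not the code from the attached patch; the class name SortMergeJoinSketch and the int[] row layout (key at index 0, value at index 1) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Generic sort-merge join sketch; NOT the implementation from the HIVE-1194 patch.
class SortMergeJoinSketch {
    // Both inputs must be sorted ascending by key. Each row is {key, value}.
    // Each output row is {key, leftValue, rightValue}.
    static List<int[]> join(List<int[]> left, List<int[]> right) {
        List<int[]> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < left.size() && j < right.size()) {
            int leftKey = left.get(i)[0];
            int rightKey = right.get(j)[0];
            if (leftKey < rightKey) {
                i++; // left is behind, advance it
            } else if (leftKey > rightKey) {
                j++; // right is behind, advance it
            } else {
                // Find the end of the run of equal keys on the right side.
                int runEnd = j;
                while (runEnd < right.size() && right.get(runEnd)[0] == leftKey) {
                    runEnd++;
                }
                // Pair every left row in the run with every right row in the run.
                while (i < left.size() && left.get(i)[0] == leftKey) {
                    for (int k = j; k < runEnd; k++) {
                        out.add(new int[] {leftKey, left.get(i)[1], right.get(k)[1]});
                    }
                    i++;
                }
                j = runEnd;
            }
        }
        return out;
    }
}
```

Because both cursors only move forward, each input is scanned once and no hash table of the smaller side has to be built, which is where the CPU savings mentioned in the issue would come from.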

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

