hadoop-hive-dev mailing list archives

From "Namit Jain (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-1194) sorted merge join
Date Tue, 02 Mar 2010 00:34:06 GMT

    [ https://issues.apache.org/jira/browse/HIVE-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839935#action_12839935 ]

Namit Jain commented on HIVE-1194:
----------------------------------

There is an operator id which is unique, so the problem of different operator instances
can be solved.

Each operator will access its own local work. Currently, only map join operators will need
it. MapJoinOperator will get the complete small table at the beginning, whereas
SMBJoinOperator reads it row by row.
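
To illustrate the difference between the two access patterns, here is a minimal
sketch in plain Java. It is not Hive code: JoinSketch, Row, loadAllThenProbe and
sortMergeJoin are hypothetical names made up for this example, and the sort-merge
variant assumes both inputs are already sorted on the join key. The first method
loads the whole small table up front and probes it for each big-table row; the
second advances through the small table one row at a time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class JoinSketch {
    /** Hypothetical (key, value) row; a stand-in for a table row, not a Hive type. */
    public static final class Row {
        public final int key;
        public final String value;
        public Row(int key, String value) { this.key = key; this.value = value; }
    }

    /** Load-everything style: read the complete small table first, then probe it. */
    public static List<String> loadAllThenProbe(List<Row> big, List<Row> small) {
        Map<Integer, List<String>> table = new HashMap<>();
        for (Row s : small) {
            table.computeIfAbsent(s.key, k -> new ArrayList<>()).add(s.value);
        }
        List<String> out = new ArrayList<>();
        for (Row b : big) {
            for (String v : table.getOrDefault(b.key, Collections.emptyList())) {
                out.add(b.key + ":" + b.value + "," + v);
            }
        }
        return out;
    }

    /** Sort-merge style: both inputs sorted on key; the small side is read row by row. */
    public static List<String> sortMergeJoin(List<Row> big, List<Row> small) {
        List<String> out = new ArrayList<>();
        Iterator<Row> it = small.iterator();
        Row next = it.hasNext() ? it.next() : null; // one-row lookahead on the small side
        List<String> group = new ArrayList<>();     // small-side values for the current key
        boolean hasGroup = false;
        int groupKey = 0;
        for (Row b : big) {
            if (!hasGroup || groupKey != b.key) {
                group.clear();
                groupKey = b.key;
                hasGroup = true;
                // Skip small rows whose key is below the current big-table key.
                while (next != null && next.key < b.key) {
                    next = it.hasNext() ? it.next() : null;
                }
                // Buffer only the small rows that share the current key.
                while (next != null && next.key == b.key) {
                    group.add(next.value);
                    next = it.hasNext() ? it.next() : null;
                }
            }
            for (String v : group) {
                out.add(b.key + ":" + b.value + "," + v);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row> big = Arrays.asList(new Row(1, "a"), new Row(2, "b"), new Row(2, "c"), new Row(4, "d"));
        List<Row> small = Arrays.asList(new Row(2, "x"), new Row(3, "y"), new Row(4, "z"));
        System.out.println(loadAllThenProbe(big, small));
        System.out.println(sortMergeJoin(big, small));
    }
}
```

In this sketch the sort-merge variant never holds more than one key's worth of
small-table rows in memory and does no hashing, which is where the savings on
sorted inputs would come from.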

ExecMapper does nothing.

> sorted merge join
> -----------------
>
>                 Key: HIVE-1194
>                 URL: https://issues.apache.org/jira/browse/HIVE-1194
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: He Yongqiang
>             Fix For: 0.6.0
>
>         Attachments: hive-1194-2010-02-28.patch
>
>
> If the input tables are sorted on the join key and a mapjoin is being performed, it is
> useful to exploit the sorted property of the tables.
> This can lead to substantial CPU savings. It needs to work across bucketed map joins
> as well.
> Since the sorted property of a table is not currently enforced, a new parameter can be
> added to request the sort-merge join.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

