hadoop-hive-dev mailing list archives

From "Ning Zhang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-964) handle skewed keys for a join in a separate job
Date Wed, 30 Dec 2009 20:06:29 GMT

    [ https://issues.apache.org/jira/browse/HIVE-964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12795421#action_12795421 ]

Ning Zhang commented on HIVE-964:
---------------------------------

Some comments so far: 

1) GenMRSkewJoinProcessor.skewJoinEnabled() checks whether the join is any form of outer
join. I talked with Namit, and this check seems unnecessary. The reason map join doesn't work
with outer joins is that the small table contains all keys (not only those that have a
match). In this case, we know that the keys will match if they are non-empty, so we can
handle outer joins as well.

2) Can you add some unit tests for outer joins if the above is changed?
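
For illustration only, here is a minimal sketch of the general idea in plain Python. It is not Hive code: hash_join, skew_join, and the threshold parameter are hypothetical names, and the skew test (a simple row count per key) is an assumption. It shows that when the keys are split into disjoint skewed and non-skewed sets and each set is joined separately, the combined output matches a single left outer join, which is the kind of property a unit test for outer joins could check.

```python
from collections import Counter, defaultdict


def hash_join(left, right, left_outer=False):
    """Join (key, value) rows by building an in-memory hash table on `right`."""
    table = defaultdict(list)
    for key, value in right:
        table[key].append(value)
    out = []
    for key, left_value in left:
        matches = table.get(key)  # .get() does not create missing entries
        if matches:
            out.extend((key, left_value, rv) for rv in matches)
        elif left_outer:
            # Left outer join: keep the unmatched left row, padded with None.
            out.append((key, left_value, None))
    return out


def skew_join(left, right, threshold, left_outer=False):
    """Join non-skewed keys first, then skewed keys in a follow-up step.

    A key counts as skewed when it appears more than `threshold` times
    in `left` (an assumed, simplified skew criterion).
    """
    counts = Counter(key for key, _ in left)
    skewed = {key for key, n in counts.items() if n > threshold}

    # Step 1: ordinary join over rows whose keys are not skewed.
    main = hash_join(
        [row for row in left if row[0] not in skewed],
        [row for row in right if row[0] not in skewed],
        left_outer,
    )
    # Step 2: follow-up join over the skewed keys that were set aside.
    followup = hash_join(
        [row for row in left if row[0] in skewed],
        [row for row in right if row[0] in skewed],
        left_outer,
    )
    return main + followup, skewed
```

Because every key falls into exactly one of the two steps, an unmatched left row is emitted once, by whichever step owns its key, so the outer-join padding is not lost or duplicated in this sketch.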

> handle skewed keys for a join in a separate job
> -----------------------------------------------
>
>                 Key: HIVE-964
>                 URL: https://issues.apache.org/jira/browse/HIVE-964
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: He Yongqiang
>         Attachments: hive-964-2009-12-17.txt, hive-964-2009-12-28-2.patch, hive-964-2009-12-29-4.patch
>
>
> The skewed keys can be written to a temporary table or file, and a followup conditional
> task can be used to perform the join on those keys.
> As a first step, JDBM can be used for those keys.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

