hive-dev mailing list archives

From "Nadeem Moidu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-3086) Skewed Join Optimization
Date Tue, 26 Jun 2012 17:17:43 GMT

    [ https://issues.apache.org/jira/browse/HIVE-3086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401530#comment-13401530 ]

Nadeem Moidu commented on HIVE-3086:
------------------------------------

@Alex, I'm sorry, but your question is not very clear. Can you please give the exact schema,
query, and skewed keys that you have in mind? Here are some comments based on what I understood
from your question:
1. The bottleneck mentioned occurs only when the join key is skewed, so only that case is handled.
2. If a table is small, we have map-join to handle that.
3. We are not doing any pre-partitioning.
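
To illustrate point 1, here is a minimal sketch of why a skewed join key overloads a single
reducer. It is not Hive code: the modulo partitioner, the function names and the sample key
counts are all assumptions made up for the illustration. It only relies on the fact that all
rows sharing a join key are routed to the same reducer.

```python
from collections import Counter


def reducer_for(key, num_reducers):
    # Hypothetical partitioner: integer join keys are assigned by modulo.
    # Rows with the same key always land on the same reducer.
    return key % num_reducers


def reducer_loads(keys, num_reducers):
    # Count how many rows each reducer receives for the given join keys.
    loads = Counter()
    for key in keys:
        loads[reducer_for(key, num_reducers)] += 1
    return [loads[r] for r in range(num_reducers)]


# Made-up data: key 1 is skewed (1000 rows); keys 2..9 have one row each.
skewed_keys = [1] * 1000 + list(range(2, 10))
# Same number of rows, but every key is distinct.
uniform_keys = list(range(1008))

print(reducer_loads(skewed_keys, 4))   # one reducer gets almost all rows
print(reducer_loads(uniform_keys, 4))  # rows are spread evenly
```

With the skewed input, the reducer that owns key 1 processes nearly every row while the others
sit idle, which is the bottleneck described in the issue.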
                
> Skewed Join Optimization
> ------------------------
>
>                 Key: HIVE-3086
>                 URL: https://issues.apache.org/jira/browse/HIVE-3086
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Nadeem Moidu
>            Assignee: Nadeem Moidu
>
> During a join operation, if one of the columns has a skewed key, it can cause that particular
> reducer to become the bottleneck. The following feature will address it:
> https://cwiki.apache.org/confluence/display/Hive/Skewed+Join+Optimization


        
