hadoop-hive-dev mailing list archives

From "He Yongqiang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-1110) add counters to show that skew join triggered
Date Thu, 28 Jan 2010 19:39:35 GMT

    [ https://issues.apache.org/jira/browse/HIVE-1110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12806047#action_12806047 ]

He Yongqiang commented on HIVE-1110:
------------------------------------

By introducing a boolean vector that tracks which tables have already produced a skew key,
each reducer will be able to tell how many tables have skew keys. The counter in that reducer
then gives the minimum number of skew jobs that will be started. So if we take the largest
counter across all reducers, it will be the number of final jobs needed.
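
A minimal sketch of the boolean-vector idea, in Java. This is not code from hive-1110.patch:
the class SkewTableTracker and its methods onSkewKey, getSkewTableCount and maxCount are
hypothetical names used only for illustration, and a plain long field stands in for the real
reducer counter.

```java
import java.util.List;

/** Hypothetical per-reducer tracker; illustrative only, not part of the Hive code base. */
class SkewTableTracker {
    // One flag per join input table: true once that table has produced a skew key.
    private final boolean[] tableHasSkewKey;
    // Stand-in for the reducer's counter: number of tables with at least one skew key.
    private long skewTableCount = 0;

    SkewTableTracker(int numTables) {
        this.tableHasSkewKey = new boolean[numTables];
    }

    /** Record a skew key for a table; the table is counted only the first time. */
    void onSkewKey(int tableIndex) {
        if (!tableHasSkewKey[tableIndex]) {
            tableHasSkewKey[tableIndex] = true;
            skewTableCount++;
        }
    }

    long getSkewTableCount() {
        return skewTableCount;
    }

    /** Largest per-reducer count, read here as the number of final jobs needed. */
    static long maxCount(List<SkewTableTracker> reducers) {
        long max = 0;
        for (SkewTableTracker tracker : reducers) {
            max = Math.max(max, tracker.getSkewTableCount());
        }
        return max;
    }
}
```

Each reducer would own one tracker; after the job finishes, the client would compare the
per-reducer values and keep the largest one.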

>>just increment the counter every time you see a new key.
This may be better, because I have sometimes seen the counter be inaccurate: even though there
was a skew key and the counter got updated, it still reported zero. So it may be better to
increment the counter multiple times, which will hopefully let the reducer report a non-zero counter.

> add counters to show that skew join triggered
> ---------------------------------------------
>
>                 Key: HIVE-1110
>                 URL: https://issues.apache.org/jira/browse/HIVE-1110
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: He Yongqiang
>             Fix For: 0.6.0
>
>         Attachments: hive-1110.patch
>
>
> It would be very useful to debug, and quickly find out if the skew join was triggered.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.