hadoop-hive-dev mailing list archives

From "Namit Jain (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-964) handle skewed keys for a join in a separate job
Date Mon, 11 Jan 2010 05:55:54 GMT

    [ https://issues.apache.org/jira/browse/HIVE-964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12798578#action_12798578 ]

Namit Jain commented on HIVE-964:

Did not look at this in great detail, but some high-level comments:

1. Changes in ExecDriver are not needed.
2. Skew Join should be an optimization step. I remember that initially we had thought about it and said it might be easy to do it at the end, but it makes more sense to plug it into the optimization phase. It can be the last optimization step, and we can assume that map join conversions etc. have been done.
3. Conditional Task: needs some rework. Since execute is not getting called recursively, the same thing should happen for initialize. It would be great if we can fold it into execute though - not sure how.
4. The number of jobs etc. should be correct - conditional task is not a single job, but

> handle skewed keys for a join in a separate job
> -----------------------------------------------
>                 Key: HIVE-964
>                 URL: https://issues.apache.org/jira/browse/HIVE-964
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: He Yongqiang
>         Attachments: hive-964-2009-12-17.txt, hive-964-2009-12-28-2.patch, hive-964-2009-12-29-4.patch,
> The skewed keys can be written to a temporary table or file, and a followup conditional task can be used to perform the join on those keys.
> As a first step, JDBM can be used for those keys
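
A minimal sketch of the idea in the issue description above, for illustration only. It is not the code from the attached patches. The class and method names (SkewJoinSketch, skewedKeys, join) and the row-count threshold are hypothetical. An in-memory list stands in for the temporary table or file, and skew is detected by counting rows up front, which is a simplification. The main pass joins the ordinary keys and sets rows with skewed keys aside, and a follow-up pass runs only if any rows were set aside.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical illustration; none of these names come from Hive. */
final class SkewJoinSketch {

    /** Returns the keys that occur more than `threshold` times. Rows are {key, value}. */
    static Set<String> skewedKeys(List<String[]> rows, int threshold) {
        Map<String, Integer> counts = new HashMap<>();
        for (String[] row : rows) {
            counts.merge(row[0], 1, Integer::sum);
        }
        Set<String> skewed = new TreeSet<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > threshold) {
                skewed.add(e.getKey());
            }
        }
        return skewed;
    }

    /** Inner join of left and right on key; output rows are "key|leftValue|rightValue". */
    static List<String> join(List<String[]> left, List<String[]> right, int threshold) {
        Set<String> skewed = skewedKeys(left, threshold);

        // Index the right side by key so each left row can look up its matches.
        Map<String, List<String>> rightByKey = new HashMap<>();
        for (String[] row : right) {
            rightByKey.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }

        List<String> output = new ArrayList<>();
        // Stand-in for the temporary table or file that holds the skewed rows.
        List<String[]> deferred = new ArrayList<>();

        // Main pass: join ordinary keys, set skewed keys aside.
        for (String[] row : left) {
            if (skewed.contains(row[0])) {
                deferred.add(row);
            } else {
                emit(row, rightByKey, output);
            }
        }

        // Follow-up pass: conditional, it only runs when skewed rows exist.
        if (!deferred.isEmpty()) {
            for (String[] row : deferred) {
                emit(row, rightByKey, output);
            }
        }
        return output;
    }

    /** Appends one output row per right-side match of the given left row. */
    private static void emit(String[] leftRow, Map<String, List<String>> rightByKey,
                             List<String> output) {
        List<String> matches = rightByKey.getOrDefault(leftRow[0], Collections.<String>emptyList());
        for (String rightValue : matches) {
            output.add(leftRow[0] + "|" + leftRow[1] + "|" + rightValue);
        }
    }
}
```

With a threshold above every key's row count, nothing is deferred and the follow-up pass is skipped entirely, which is the property that makes the second step conditional.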

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
