hive-dev mailing list archives

From "Liyin Tang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-1642) Convert join queries to map-join based on size of table/row
Date Mon, 15 Nov 2010 18:16:14 GMT

    [ https://issues.apache.org/jira/browse/HIVE-1642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12932132#action_12932132 ]

Liyin Tang commented on HIVE-1642:
----------------------------------

There are 2 kinds of backup: 1) task level and 2) branch level. I think the way you mentioned
above is branch level: the conditional task maintains a tree, and if one branch fails, it
tries another branch.
I think both of them are fine right now. But branch level is more complicated to implement,
because the backup task may not be a single task but a tree of tasks. The design goal there is
to replace one branch of tasks with another branch.

I think the problem right now is that there are 2 tasks involved in MapJoin. Imagine that, 3 months
ago, there was no map join local task. It would have been very easy to implement this: once the mapjoin
task fails, we replace it with the backup task. That is task-level backup.
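
As a rough sketch of the task-level idea only: the classes below are hypothetical stand-ins,
not Hive's actual task classes, and the task names used in the usage note are placeholders.
Each task carries at most one backup task, and a failed task is replaced by that single task.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a task; not Hive's real Task class.
class SketchTask {
    final String name;
    final boolean succeeds;   // simulated outcome of running this task
    final SketchTask backup;  // task to substitute on failure, or null if none

    SketchTask(String name, boolean succeeds, SketchTask backup) {
        this.name = name;
        this.succeeds = succeeds;
        this.backup = backup;
    }
}

class TaskLevelBackup {
    // Runs a task; if it fails, replaces it with its backup task.
    // Returns the names of the tasks attempted, in order.
    static List<String> runWithBackup(SketchTask task) {
        List<String> attempted = new ArrayList<>();
        SketchTask current = task;
        while (current != null) {
            attempted.add(current.name);
            if (current.succeeds) {
                break;
            }
            current = current.backup; // task-level replacement: one task for one task
        }
        return attempted;
    }
}
```

For example, a failing "map-join" task whose backup is a succeeding "common-join" task would
be attempted in that order. Branch-level backup would need the backup to be a whole tree of
tasks, which this sketch does not model.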

The problem is that we split the map join task into 2 tasks. But we can still logically argue
that the local task is PART of the map reduce task. Actually, they do come from the same
task. That's why, if the current task is the local task, we look ahead one more task.

In the future, we may have more situations of this kind, where one task is split into multiple
tasks. Then we may need a loop here: if this task is split from another task, keep looking
ahead.
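
A minimal sketch of such a loop, under stated assumptions: `ChainTask`, its `splitFromNext`
flag and the `backupName` field are all hypothetical, invented here for illustration, and are
not part of Hive. The idea is that a split-off piece (such as the local task) does not carry
the backup itself, so we keep looking ahead until we reach the task it was split from.

```java
// Hypothetical stand-in for a task in a chain; not Hive's real Task class.
class ChainTask {
    final String name;
    final boolean splitFromNext; // assumption: true if this task was split off from the task that follows it
    final ChainTask next;        // the following task, or null
    final String backupName;     // name of the backup attached to this task, or null

    ChainTask(String name, boolean splitFromNext, ChainTask next, String backupName) {
        this.name = name;
        this.splitFromNext = splitFromNext;
        this.next = next;
        this.backupName = backupName;
    }
}

class BackupLookahead {
    // Starting from the given task, keeps looking ahead while the current task
    // is a split-off piece, then returns the backup attached to the task reached
    // (null if that task has no backup).
    static String findBackup(ChainTask task) {
        ChainTask current = task;
        while (current.splitFromNext && current.next != null) {
            current = current.next;
        }
        return current.backupName;
    }
}
```

With a single split (local task followed by the map reduce task) the loop runs once, which
matches looking ahead one more task; with deeper splits it simply keeps going.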

Any other thoughts?

> Convert join queries to map-join based on size of table/row
> -----------------------------------------------------------
>
>                 Key: HIVE-1642
>                 URL: https://issues.apache.org/jira/browse/HIVE-1642
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Liyin Tang
>             Fix For: 0.7.0
>
>         Attachments: hive_1642_1.patch
>
>
> Based on the number of rows and size of each table, Hive should automatically be able
> to convert a join into map-join.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

