hadoop-common-dev mailing list archives

From "Runping Qi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3293) When an input split spans cross block boundary, the split location should be the host having most of bytes on it.
Date Thu, 30 Oct 2008 16:40:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644025#action_12644025 ]

Runping Qi commented on HADOOP-3293:
------------------------------------

In the above case, I'd say the preferred hosts for the split should be, in order, A, D, B, E, C, F.
We should also aggregate the bytes over the racks of those hosts.
For example, suppose C, E and F share the same rack while the other nodes are on different racks.
Then host E (or F, or even C) will offer better rack locality than the other hosts.
In practice, rack locality is almost as good as node locality.
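
A rough sketch of the idea in Python, not the actual Hadoop code. The block sizes, replica placements and rack names below are made up for illustration (the real case is in the earlier comment); they are chosen only so that the host order comes out as A, D, B, E, C, F. Each block's bytes are counted once per rack, so two replicas of the same block on one rack are not double counted.

```python
from collections import defaultdict

def bytes_per_host(blocks):
    """Sum, for each host, the bytes of the split stored locally on it."""
    totals = defaultdict(int)
    for nbytes, hosts in blocks:
        for host in set(hosts):
            totals[host] += nbytes
    return dict(totals)

def bytes_per_rack(blocks, rack_of):
    """Sum, for each rack, the bytes of the split stored on any host in it.

    A block is counted once per rack even if several replicas live there.
    """
    totals = defaultdict(int)
    for nbytes, hosts in blocks:
        for rack in {rack_of[h] for h in hosts}:
            totals[rack] += nbytes
    return dict(totals)

def rank(totals):
    """Order keys by descending byte count, breaking ties by name."""
    return sorted(totals, key=lambda k: (-totals[k], k))

# Hypothetical split: (bytes of the split in this block, replica hosts).
blocks = [
    (40, ["A", "B", "C"]),
    (50, ["A", "D", "E"]),
    (20, ["D", "B", "F"]),
]

# Hypothetical topology: C, E and F share a rack, the others do not.
rack_of = {
    "A": "/rack-a", "B": "/rack-b", "D": "/rack-d",
    "C": "/rack-cef", "E": "/rack-cef", "F": "/rack-cef",
}

host_order = rank(bytes_per_host(blocks))
rack_order = rank(bytes_per_rack(blocks, rack_of))
```

With these numbers A holds the most bytes of any single host (90), but the rack shared by C, E and F covers all 110 bytes of the split, so a task scheduled on any of those three hosts can read the whole split rack-locally.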
 

> When an input split spans cross block boundary, the split location should be the host having most of bytes on it.
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-3293
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3293
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Runping Qi
>            Assignee: Jothi Padmanabhan
>


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

