hadoop-mapreduce-issues mailing list archives

From "Siddharth Seth (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-3815) Data Locality suffers if HDFS returns IPs in getFileBlockLocations
Date Tue, 07 Feb 2012 19:45:00 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13202685#comment-13202685 ]

Siddharth Seth commented on MAPREDUCE-3815:

Looked at this a little more.
This shows up when a split spans multiple blocks. {{getFileBlockLocations}} always
returns hostnames. When a split covers multiple blocks, mapred.FileInputFormat ends up using
{{BlockLocation.getTopologyPaths}} instead of {{getFileBlockLocations}}, and {{getTopologyPaths}}
returns an IP address.
Will open an MR / HDFS JIRA once I can find out how this API behaves in the 1.0 line. Does anyone
happen to know?

Meanwhile, I am changing the description and posting a patch to have the AM resolve IPs if they
show up.
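
Resolving IPs on the AM side, as described above, could look roughly like the following. This is a minimal sketch and not the attached patch: the class name HostResolver and both of its methods are hypothetical, it assumes plain IPv4 literals, and it relies on java.net.InetAddress for the reverse lookup.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.regex.Pattern;

// Hypothetical helper, not taken from MR3815.txt.
final class HostResolver {
    // Assumption: only dotted-quad IPv4 literals need to be detected.
    private static final Pattern IPV4 =
        Pattern.compile("^(\\d{1,3}\\.){3}\\d{1,3}$");

    private HostResolver() {}

    // True if the string has the shape of an IPv4 address.
    static boolean looksLikeIp(String s) {
        return s != null && IPV4.matcher(s).matches();
    }

    // Returns the input unchanged if it is not shaped like an IP.
    // Otherwise attempts a reverse lookup and falls back to the
    // original string when the lookup cannot be performed.
    static String resolveHost(String src) {
        if (!looksLikeIp(src)) {
            return src;
        }
        try {
            // getHostName() returns the textual IP if no name is found.
            return InetAddress.getByName(src).getHostName();
        } catch (UnknownHostException e) {
            return src;
        }
    }
}
```

Hostnames pass through untouched, so a call such as HostResolver.resolveHost(host) could be applied to every location of a split before the request reaches the RM. Whether the result matches the name the RM knows for that node depends on the cluster's DNS setup.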
> Data Locality suffers if HDFS returns IPs in getFileBlockLocations
> ------------------------------------------------------------------
>                 Key: MAPREDUCE-3815
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3815
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: mrv2
>    Affects Versions: 0.23.0
>            Reporter: Siddharth Seth
>            Assignee: Siddharth Seth
>            Priority: Critical
>         Attachments: MR3815.txt
> BlockLocation.getHosts() returns IP addresses occasionally. Data locality is affected - since the RM requires hostnames.
