hadoop-mapreduce-dev mailing list archives

From "Dmytro Molkov (JIRA)" <j...@apache.org>
Subject [jira] Created: (MAPREDUCE-1752) Implement getFileBlockLocations in HarFilesystem
Date Wed, 05 May 2010 02:02:02 GMT
Implement getFileBlockLocations in HarFilesystem
------------------------------------------------

                 Key: MAPREDUCE-1752
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1752
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
            Reporter: Dmytro Molkov


To run MapReduce efficiently on data that has been archived into a HAR, it would be great to
implement getFileBlockLocations for a given filename.
This way the JobTracker will have data-locality information and can schedule tasks
accordingly.
I believe the overhead introduced by lookups in the index files is likely to be smaller than
that of copying data over the wire.
I will upload a patch shortly, but would love to get some feedback on this. Any ideas
on how to test it are very welcome.
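
For discussion, here is a rough sketch of the offset translation such an implementation might need. It assumes a file inside the archive occupies one contiguous byte range of a part file, so the part file's block list can be clipped to that range and shifted into file-relative offsets. The class HarBlockLocationSketch, the Block type, and the translate method are hypothetical stand-ins, not Hadoop APIs and not the actual patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: maps a byte range of an archived file onto the
// blocks of the part file that stores it. Not the real HarFileSystem code.
final class HarBlockLocationSketch {

    /** Hypothetical stand-in for a block location: hosts plus a byte range. */
    public static final class Block {
        public final String[] hosts;
        public final long offset;
        public final long length;

        public Block(String[] hosts, long offset, long length) {
            this.hosts = hosts;
            this.offset = offset;
            this.length = length;
        }
    }

    /**
     * @param partBlocks      blocks of the part file, offsets relative to the part file
     * @param fileStartInPart where the archived file begins inside the part file
     *                        (assumed to come from the index lookup)
     * @param fileLength      length of the archived file
     * @param start           requested start offset, relative to the archived file
     * @param len             requested length
     * @return blocks overlapping the request, with offsets relative to the archived file
     */
    public static List<Block> translate(List<Block> partBlocks, long fileStartInPart,
                                        long fileLength, long start, long len) {
        if (start < 0 || len < 0) {
            throw new IllegalArgumentException("start and len must be non-negative");
        }
        List<Block> result = new ArrayList<>();
        if (start >= fileLength || len == 0) {
            return result; // nothing to report past the end of the file
        }
        // Clamp the request to the file, avoiding overflow in start + len.
        long end = (len >= fileLength - start) ? fileLength : start + len;

        // Shift the request into part-file coordinates.
        long partStart = fileStartInPart + start;
        long partEnd = fileStartInPart + end;

        for (Block b : partBlocks) {
            long lo = Math.max(b.offset, partStart);
            long hi = Math.min(b.offset + b.length, partEnd);
            if (lo < hi) {
                // Keep the hosts, but report the range relative to the archived file.
                result.add(new Block(b.hosts, lo - fileStartInPart, hi - lo));
            }
        }
        return result;
    }
}
```

One possible way to test the real change along the same lines would be to compare the hosts reported for an archived file against those reported for the matching range of the underlying part file.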

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

