hadoop-common-dev mailing list archives

From "Owen O'Malley (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-1296) Improve interface to FileSystem.getFileCacheHints
Date Wed, 25 Apr 2007 18:13:15 GMT
Improve interface to FileSystem.getFileCacheHints
-------------------------------------------------

                 Key: HADOOP-1296
                 URL: https://issues.apache.org/jira/browse/HADOOP-1296
             Project: Hadoop
          Issue Type: Improvement
          Components: fs
            Reporter: Owen O'Malley
         Assigned To: dhruba borthakur


The FileSystem interface provides a very limited interface for finding the location of the
data. The current method looks like:

String[][] getFileCacheHints(Path file, long start, long len) throws IOException

which returns a list of "block info" entries, where each entry is a list of host names. Because
the hints don't include the block boundaries, map/reduce has to call the name node once for
each split. I propose that we fix the naming a bit and make it:

public class BlockInfo implements Writable {
  public long getStart();
  public String[] getHosts();
}

BlockInfo[] getFileHints(Path file, long start, long len) throws IOException;

That way, map/reduce can query the entire file once and get all of the block locations in a single call.
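
As a rough illustration of why the block start offsets matter, here is a minimal, self-contained sketch of how a caller might map split offsets to hosts locally after one query. It does not use Hadoop itself: the BlockInfo class below is a hypothetical stand-in for the proposed type, and SplitHostsSketch and hostsForOffset are names invented for this example. It assumes the returned blocks are sorted by start offset.

```java
// Hypothetical sketch: none of these names come from the Hadoop code base.
final class SplitHostsSketch {

    // Stand-in for the proposed BlockInfo: where a block starts and which hosts hold it.
    static final class BlockInfo {
        final long start;
        final String[] hosts;

        BlockInfo(long start, String[] hosts) {
            this.start = start;
            this.hosts = hosts;
        }
    }

    // Returns the hosts of the block that contains the given file offset.
    // Assumes blocks is non-empty, sorted by start, and blocks[0].start <= offset.
    static String[] hostsForOffset(BlockInfo[] blocks, long offset) {
        int lo = 0;
        int hi = blocks.length - 1;
        // Binary search for the last block whose start is <= offset.
        while (lo < hi) {
            int mid = (lo + hi + 1) >>> 1;
            if (blocks[mid].start <= offset) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return blocks[lo].hosts;
    }
}
```

With the block list for the whole file in hand, each split's hosts can be looked up by calling hostsForOffset with the split's start offset, with no further round trip to the name node.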

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

