hadoop-common-dev mailing list archives

From "Ahad Rana (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1783) keyToPath in Jets3tFileSystemStore needs to return absolute path
Date Sat, 01 Sep 2007 00:19:18 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12524238 ]

Ahad Rana commented on HADOOP-1783:

Hi Tom,

I will try to produce some stack traces for you. Ultimately, though, if you look at the DistributedFileSystem
implementation of listPaths, it clearly creates fully qualified paths using the DfsPath(DFSFileInfo,FileSystem)
constructor. In the s3 implementation, listPaths, as I mentioned, returns
sub-paths without the scheme or the bucket name (authority). If the default file system
is not s3, the hadoop library then returns improper results because it tries to resolve the returned
sub-path against the default FileSystem (since the scheme is missing from the Path object).
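
As a rough illustration of the resolution problem, here is a sketch that uses java.net.URI as a stand-in for Hadoop's Path. The qualify helper, the namenode address, and the bucket name are all made up for the example; this is not the Hadoop code path itself, only the same resolution rule in miniature.

```java
import java.net.URI;

// Stand-in for Hadoop's path qualification, using only java.net.URI.
// Every name and address here is hypothetical.
final class QualifySketch {

    /** Fills in scheme and authority from defaultFs only when the path has no scheme. */
    static URI qualify(URI path, URI defaultFs) {
        if (path.getScheme() != null) {
            return path; // already fully qualified, left untouched
        }
        return defaultFs.resolve(path);
    }

    public static void main(String[] args) {
        // Hypothetical fs.default.name pointing at HDFS rather than S3.
        URI defaultFs = URI.create("hdfs://namenode:9000/");

        // What a listing without scheme and bucket looks like.
        URI schemeless = URI.create("/logs/part-00000");
        // What a fully qualified listing entry would look like.
        URI qualified = URI.create("s3://my-bucket/logs/part-00000");

        System.out.println(qualify(schemeless, defaultFs));
        System.out.println(qualify(qualified, defaultFs));
    }
}
```

The scheme-less entry ends up attributed to the default file system, while the qualified one keeps pointing at the bucket it came from.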

I am working on enabling map-reduce functionality for scenarios where at least one of the
file specifications in a map-reduce spec (map input or reduce output) points to the
s3 file system. The bug described above breaks the code in a couple of different places. When
I implement keyToPath in Jets3tFileSystemStore as follows, everything works.

private Path keyToPath(String key) {
    // Prefix the scheme and bucket so the returned Path is fully qualified.
    return new Path("s3://"+bucket.getName()+key);
}
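
The same idea can be sketched without the Hadoop classes, taking the bucket as a parameter. This version builds a java.net.URI instead of a Path; the keyToUri name and the bucket name are hypothetical, and it assumes that stored keys begin with "/", as the concatenation above implies.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Sketch only: mirrors the keyToPath change with java.net.URI instead of Hadoop's Path.
final class KeyToPathSketch {

    /** Builds a fully qualified s3 URI from a bucket name and a key that starts with "/". */
    static URI keyToUri(String bucketName, String key) {
        if (!key.startsWith("/")) {
            // Assumption: keys are stored as absolute paths.
            throw new IllegalArgumentException("key must start with '/': " + key);
        }
        try {
            // scheme, host (the bucket), path, fragment
            return new URI("s3", bucketName, key, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI uri = keyToUri("my-bucket", "/logs/part-00000");
        System.out.println(uri);                // full URI with scheme and bucket
        System.out.println(uri.getScheme());    // scheme component
        System.out.println(uri.getAuthority()); // bucket as the authority
    }
}
```

Using the multi-argument constructor rather than string concatenation also escapes characters in the key that are not legal in a URI path.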

Suffice it to say, there are other (performance-related) issues that I am also looking at
in order to enable satisfactory use of s3 as an input/output for a mapreduce job.
But this bug is by far the most critical one.

Sorry about the lack of stack traces. I just need to recreate a proper test environment to
get you these, and hopefully I will be able to submit something to you next week. 



> keyToPath in Jets3tFileSystemStore needs to return absolute path
> ----------------------------------------------------------------
>                 Key: HADOOP-1783
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1783
>             Project: Hadoop
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 0.1.0, 0.1.1, 0.2.0, 0.2.1, 0.3.0, 0.3.1, 0.3.2, 0.4.0, 0.5.0, 0.6.0,
>                      0.6.1, 0.6.2, 0.7.0, 0.7.1, 0.7.2, 0.8.0, 0.9.0, 0.9.1, 0.9.2, 0.10.0, 0.10.1, 0.11.0, 0.11.1,
>                      0.11.2, 0.12.0, 0.12.1, 0.12.2, 0.12.3, 0.13.0, 0.13.1, 0.14.0
>         Environment: hadoop 0.14.0 running under ec2 with s3 filesystem
>            Reporter: Ahad Rana
> The keyToPath method probably needs to:
> 1. take the bucket identifier as a parameter.
> 2. set the returned Path object's protocol plus authority (bucket). Currently, APIs such
> as listSubPaths return relative paths (for a directory listing). This in
> turn breaks map reduce operations if the default file system is set to be something other
> than S3 (via fs.default.name, for example).

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
