hadoop-common-issues mailing list archives

From "Ivan Mitic (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-8409) Address Hadoop path related issues on Windows
Date Fri, 01 Jun 2012 17:56:23 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-8409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287565#comment-13287565 ]

Ivan Mitic commented on HADOOP-8409:

bq. In the meantime, would you please elaborate on how uri fragments are related to supporting
Sure. Symlinks/fragments are used for [Distributed Cache|http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/filecache/DistributedCache.html].
For example, to add files to the Distributed Cache, you would do something like this in code:
DistributedCache.addCacheArchive(new URI("s3://bucket/path/to/archive.zip#directory"), job);
and this works fine. However, if you try to do something like this:
DistributedCache.addCacheArchive(new Path("s3://bucket/path/to/archive.zip#directory").toUri(),
this will fail because the Path object does not support fragments. In this case, the '#' sign
will be encoded into the URI path: {{s3://bucket/path/to/archive.zip%23directory}}
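
The difference can be reproduced with plain {{java.net.URI}}, without Hadoop on the classpath. The sketch below is only an approximation: it assumes that Path hands everything after the authority to the multi-argument URI constructor as the path component, and the class and method names ({{FragmentDemo}}, {{parsed}}, {{fromComponents}}) are made up for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class FragmentDemo {
    // Parse the full string as a URI: '#' is treated as the fragment delimiter.
    static URI parsed(String s) throws URISyntaxException {
        return new URI(s);
    }

    // Assumption: approximates what Path does by passing the text after the
    // authority as the path component. The multi-argument constructor quotes
    // characters that are illegal in a path, so '#' becomes %23.
    static URI fromComponents(String scheme, String authority, String path)
            throws URISyntaxException {
        return new URI(scheme, authority, path, null, null);
    }

    public static void main(String[] args) throws URISyntaxException {
        URI a = parsed("s3://bucket/path/to/archive.zip#directory");
        URI b = fromComponents("s3", "bucket", "/path/to/archive.zip#directory");

        System.out.println(a.getFragment()); // directory
        System.out.println(b.getFragment()); // null
        System.out.println(b);               // s3://bucket/path/to/archive.zip%23directory
    }
}
```

In the second case the fragment is gone, so a consumer that looks at {{getFragment()}} to name the symlink has nothing to work with.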

I ran into this while fixing TestGenericOptionsParser, which was failing on Windows.  However,
I found some [forum posts|https://forums.aws.amazon.com/message.jspa?messageID=152538] where
people actually used the incorrect pattern. Supporting fragments in Path might be a good change, orthogonal to what
we do for paths.  If you agree, maybe we should split fragment support into a new Jira?

bq. Please also look at the issues raised in HADOOP-8139 and the reasons why we did not support
windows paths on HDFS. 
Thanks Suresh, I am aware of the issues raised here.
> Address Hadoop path related issues on Windows
> ---------------------------------------------
>                 Key: HADOOP-8409
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8409
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs, test, util
>    Affects Versions: 1.0.0
>            Reporter: Ivan Mitic
>            Assignee: Ivan Mitic
>         Attachments: HADOOP-8409-branch-1-win.patch
>   Original Estimate: 168h
>  Remaining Estimate: 168h
> There are multiple places in prod and test code where Windows paths are not handled properly. From a high level this could be summarized with:
> 1. Windows paths are not necessarily valid DFS paths (while Unix paths are)
> 2. Windows paths are not necessarily valid URIs (while Unix paths are)
> #1 causes a number of tests to fail because they implicitly assume that local paths are valid DFS paths (by extracting the DFS test path from for example "test.build.data" property)
> #2 causes issues when URIs are directly created on path strings passed in by the user
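
Point #2 can be illustrated with {{java.net.URI}} alone. This is a minimal sketch, not code from the patch; the class and helper names ({{WindowsPathUriDemo}}, {{isValidUri}}, {{schemeOf}}) and the sample paths are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class WindowsPathUriDemo {
    // Returns true if the raw string parses as a URI.
    static boolean isValidUri(String s) {
        try {
            new URI(s);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    // Returns the scheme the URI parser sees, or null if there is none.
    static String schemeOf(String s) throws URISyntaxException {
        return new URI(s).getScheme();
    }

    public static void main(String[] args) throws URISyntaxException {
        // A Unix path is a valid (relative-reference) URI with no scheme.
        System.out.println(isValidUri("/tmp/data"));      // true
        // Backslashes are illegal URI characters, so parsing fails.
        System.out.println(isValidUri("C:\\tmp\\data"));  // false
        // With forward slashes it parses, but the drive letter is
        // misread as the URI scheme.
        System.out.println(schemeOf("C:/tmp/data"));      // C
    }
}
```

Either outcome is a problem for code that builds a URI directly from a user-supplied path string: the backslash form throws, and the forward-slash form silently yields a URI whose scheme is the drive letter.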


