hadoop-common-issues mailing list archives

From "Ivan Mitic (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-8409) Address Hadoop path related issues on Windows
Date Fri, 01 Jun 2012 17:56:23 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-8409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287565#comment-13287565 ]

Ivan Mitic commented on HADOOP-8409:
------------------------------------

bq. In the meantime, would you please elaborate on how uri fragments are related to supporting
symlinks?
Sure. Symlinks/fragments are used for [Distributed Cache|http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/filecache/DistributedCache.html].
For example, to add files to Distributed Cache, you would do something like this in code:
{code}
DistributedCache.addCacheArchive(new URI("s3://bucket/path/to/archive.zip#directory"), job);
{code}
and this works fine. However, if you try to do something like this:
{code}
DistributedCache.addCacheArchive(new Path("s3://bucket/path/to/archive.zip#directory").toUri(),
job);
{code}
this will fail because the Path object does not support fragments. In this case, the '#' sign
is encoded into the URI path: {{s3://bucket/path/to/archive.zip%23directory}}
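
The difference can be reproduced with plain {{java.net.URI}}, without Hadoop on the classpath. The sketch below assumes that a path-only wrapper such as Path hands the whole string to the multi-argument URI constructor as the path component; the class and method names ({{FragmentDemo}}, {{parsed}}, {{fromPathComponent}}) are hypothetical and only illustrate the quoting behaviour.

```java
import java.net.URI;
import java.net.URISyntaxException;

class FragmentDemo {
    // Parse the complete string: '#' is treated as the fragment delimiter.
    static URI parsed(String s) {
        return URI.create(s);
    }

    // Build the URI from components, passing the text purely as the path.
    // Assumption: this mirrors what a path-only wrapper does internally.
    // The multi-argument constructor quotes characters that are illegal
    // in a path, so '#' becomes %23 and no fragment is set.
    static URI fromPathComponent(String scheme, String authority, String path) {
        try {
            return new URI(scheme, authority, path, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI a = parsed("s3://bucket/path/to/archive.zip#directory");
        URI b = fromPathComponent("s3", "bucket", "/path/to/archive.zip#directory");
        System.out.println(a.getFragment()); // fragment survives
        System.out.println(b);               // '#' is percent-encoded in the path
        System.out.println(b.getFragment()); // no fragment
    }
}
```

In the first case the fragment is available to the caller; in the second it is folded into the path, which is why the symlink name never reaches Distributed Cache.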

I ran into this while fixing TestGenericOptionsParser, which was failing on Windows.  However,
I found some [forum posts|https://forums.aws.amazon.com/message.jspa?messageID=152538] where
people actually used the incorrect pattern. This might be a good change, orthogonal to what
we do for paths.  If you agree, maybe split fragment support into a new Jira?

bq. Please also look at the issues raised in HADOOP-8139 and the reasons why we did not support
windows paths on HDFS. 
Thanks Suresh, I am aware of the issues raised here.
                
> Address Hadoop path related issues on Windows
> ---------------------------------------------
>
>                 Key: HADOOP-8409
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8409
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs, test, util
>    Affects Versions: 1.0.0
>            Reporter: Ivan Mitic
>            Assignee: Ivan Mitic
>         Attachments: HADOOP-8409-branch-1-win.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> There are multiple places in prod and test code where Windows paths are not handled properly. From a high level this could be summarized with:
> 1. Windows paths are not necessarily valid DFS paths (while Unix paths are)
> 2. Windows paths are not necessarily valid URIs (while Unix paths are)
> #1 causes a number of tests to fail because they implicitly assume that local paths are valid DFS paths (by extracting the DFS test path from for example "test.build.data" property)
> #2 causes issues when URIs are directly created on path strings passed in by the user
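
Point #2 in the description above can be illustrated with {{java.net.URI}} alone. This is a minimal sketch, not code from the patch: the class {{WindowsPathUriDemo}}, the helper {{describe}}, and the sample paths are hypothetical. It shows how a raw path string behaves when it is parsed directly as a URI.

```java
import java.net.URI;
import java.net.URISyntaxException;

class WindowsPathUriDemo {
    // Report how java.net.URI interprets a raw path string.
    static String describe(String raw) {
        try {
            URI u = new URI(raw);
            return "scheme=" + u.getScheme() + " path=" + u.getPath();
        } catch (URISyntaxException e) {
            // Backslashes are not legal URI characters.
            return "invalid";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("/tmp/data"));     // Unix path: no scheme
        System.out.println(describe("C:/tmp/data"));   // drive letter read as scheme
        System.out.println(describe("C:\\tmp\\data")); // rejected outright
    }
}
```

A Unix path parses as a scheme-less URI, while a Windows path is either rejected because of the backslashes or silently misread, with the drive letter taken as the URI scheme.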

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
