hadoop-common-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-9912) globStatus of a symlink to a directory does not report symlink as a directory
Date Fri, 20 Sep 2013 14:21:51 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-9912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13773027#comment-13773027 ]

Jason Lowe commented on HADOOP-9912:
------------------------------------

bq. Another crazy thought I'd like to throw out - what if we just returned false for isDir
if we cannot resolve the symlink rather than throw an exception?

This sounds equivalent to the earlier proposal where "bad" symlinks are returned as the raw
symlink.  isDir() and isFile() both return false for symlinks, and old clients are not aware
of isFile() since it was added with symlink support.

An old client of listStatus will interpret the link as a file since isDir() is false, but
we don't know if that's the proper thing to do since we don't know the client's intent.  If
a directory walker cares only about directories at some point in the traversal, it could end
up silently skipping a "bad" symlink when it should have failed.  Examples: a symlink to a
directory in a remote filesystem that is temporarily unavailable, a symlink to a directory
in a permission-protected tree, a symlink intended to point to a directory whose target was
mistyped when the link was created, etc.

I'm not sure how common that case really is in practice.  Our recent proposal tries to err
on the side of caution so we don't accidentally drop data when we should have failed.  It
does mean some scenarios for old clients will fail when they could have succeeded despite
"bad" symlinks, but it seems better to report a failure that can be corrected (i.e., fix the
"bad" symlink and re-run the app) than to potentially skip desired inputs.
                
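To make the trade-off above concrete, here is a minimal sketch of the two behaviors. It does
not use the Hadoop API: Entry is a hypothetical stand-in for a status entry, and the sketch
assumes an entry that is still a symlink in the listing is one that could not be resolved,
so isDir() is false for it. dirsLenient and dirsStrict are illustrative names only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SymlinkWalkSketch {
    /** Hypothetical stand-in for a status entry; this is not Hadoop's FileStatus. */
    static final class Entry {
        final String path;
        final boolean dir;
        final boolean symlink;

        Entry(String path, boolean dir, boolean symlink) {
            this.path = path;
            this.dir = dir;
            this.symlink = symlink;
        }

        boolean isDir() { return dir; }
        boolean isSymlink() { return symlink; }
    }

    /** Old-client style walk: anything with isDir() == false is treated as a non-directory. */
    static List<String> dirsLenient(List<Entry> listing) {
        List<String> dirs = new ArrayList<>();
        for (Entry e : listing) {
            // An unresolved symlink has isDir() == false, so it is dropped here without any error.
            if (e.isDir()) {
                dirs.add(e.path);
            }
        }
        return dirs;
    }

    /** Cautious walk: an unresolved symlink aborts the traversal instead of being skipped. */
    static List<String> dirsStrict(List<Entry> listing) {
        List<String> dirs = new ArrayList<>();
        for (Entry e : listing) {
            if (e.isSymlink()) {
                throw new IllegalStateException("cannot resolve symlink: " + e.path);
            }
            if (e.isDir()) {
                dirs.add(e.path);
            }
        }
        return dirs;
    }

    public static void main(String[] args) {
        List<Entry> listing = Arrays.asList(
            new Entry("/data/a", true, false),
            new Entry("/data/link-to-dir", false, true)); // "bad" symlink, left unresolved

        System.out.println("lenient: " + dirsLenient(listing));
        try {
            dirsStrict(listing);
        } catch (IllegalStateException ex) {
            System.out.println("strict: " + ex.getMessage());
        }
    }
}
```

In this sketch the lenient walk returns only /data/a and gives no hint that an input was
dropped, while the strict walk fails and names the link that needs fixing.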
> globStatus of a symlink to a directory does not report symlink as a directory
> -----------------------------------------------------------------------------
>
>                 Key: HADOOP-9912
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9912
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.1.0-beta
>            Reporter: Jason Lowe
>            Priority: Blocker
>         Attachments: HADOOP-9912-testcase.patch, new-hdfs.txt, new-local.txt, old-hdfs.txt,
>                      old-local.txt
>
>
> globStatus for a path that is a symlink to a directory used to report the resulting FileStatus
> as a directory but recently this has changed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
