hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4044) Create symbolic links in HDFS
Date Tue, 23 Sep 2008 18:53:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12633842#action_12633842 ]

Doug Cutting commented on HADOOP-4044:
--------------------------------------

Some comments:
 - I don't like the 'vfs' package and naming.  Symbolic links should not be a distinguished
portion of the FileSystem API, but seamlessly integrated.  So I suggest that vfs/VfsStatus
be renamed to FSLinkable, vfs/VfsStatusBase to FSLink, vfs/VfsStatusBoolean to FSLinkBoolean,
vfs/VfsStatusFileStatus to LinkableFileStatus, etc.  If these are to be in a separate package,
it might be called 'fs/spi', since they are primarily needed only by implementors of the FileSystem
API, not by users.  The protected implementation methods should be called openImpl(), appendImpl(),
etc.
 - getLink() should return a Path, not a String.
 - getLink() should throw an exception when isLink() is false.
 - The check for link cycles is wrong.  If the loop starts after the first link traversed, it will
not be detected.  A common approach is simply to limit the number of links traversed to a constant.
Alternately, you can keep a 'fast' and a 'slow' pointer, advancing the fast pointer through the chain
twice as fast as the slow one.  If the two are ever equal, there is a loop.  This detects all loops
(see the sketch after this list).
 - I don't see the need for both getLink() and getRemainingPath().  Wouldn't it be simpler
to always have getLink() return a fully-qualified path?  Internally a FileSystem might support
relative paths, but why do we need to expose these?
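
For illustration, a minimal sketch of the fast/slow-pointer check over a chain of links is below.
It assumes a hypothetical readLink(Path) helper that returns a link's target, or null when the path
is not a link; the class and the helper are illustrative only, not part of the patch.
{code}
import java.io.IOException;
import org.apache.hadoop.fs.Path;

abstract class LinkCycleChecker {
  /** Hypothetical helper: returns the target of p when p is a link,
   *  or null when p is not a link. */
  abstract Path readLink(Path p) throws IOException;

  /** Fast/slow-pointer check: fast walks two links per step, slow walks
   *  one; if they ever meet, the chain of links contains a cycle. */
  boolean hasLinkCycle(Path start) throws IOException {
    Path slow = start;
    Path fast = start;
    while (true) {
      Path oneAhead = readLink(fast);
      if (oneAhead == null) {
        return false;               // chain ends at a non-link: no cycle
      }
      Path twoAhead = readLink(oneAhead);
      if (twoAhead == null) {
        return false;
      }
      fast = twoAhead;              // fast advances two links
      slow = readLink(slow);        // slow advances one link
      if (fast.equals(slow)) {
        return true;                // pointers met: cycle detected
      }
    }
  }
}
{code}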

Instead of repeating the link-resolving loop in every method, we might use a "closure", e.g.:
{code}
public FSInputStream open(Path p, final int bufferSize) throws IOException {
  return resolve(p, new FSLinkResolver<FSInputStream>() {
    public FSInputStream next(Path p) throws IOException { return openImpl(p, bufferSize); }
  });
}
{code}
where FSLinkResolver#resolve implements the loop-detection algorithm, calling #next to traverse
the list.
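
To make this concrete, a rough sketch of what FSLinkResolver and its resolve() could look like is
below, using the "limit the number of links traversed" option above.  The UnresolvedLinkException
type, its getTarget() accessor, and the MAX_LINK_DEPTH constant are assumptions for illustration,
not part of the patch.
{code}
import java.io.IOException;
import org.apache.hadoop.fs.Path;

/** Hypothetical exception thrown by the openImpl()-style methods when they
 *  hit a link that still needs resolving; carries the link's fully-qualified target. */
class UnresolvedLinkException extends IOException {
  private final Path target;
  UnresolvedLinkException(Path target) { this.target = target; }
  Path getTarget() { return target; }
}

abstract class FSLinkResolver<T> {
  private static final int MAX_LINK_DEPTH = 32;  // arbitrary example limit

  /** One attempt on a candidate path, e.g. openImpl(p, bufferSize);
   *  throws UnresolvedLinkException if the path still points at a link. */
  abstract T next(Path p) throws IOException;

  /** Re-tries next() along the chain of link targets, bounding the
   *  number of links traversed. */
  T resolve(Path p) throws IOException {
    for (int depth = 0; depth < MAX_LINK_DEPTH; depth++) {
      try {
        return next(p);              // succeeds once p is not a link
      } catch (UnresolvedLinkException e) {
        p = e.getTarget();           // follow the link and try again
      }
    }
    throw new IOException("Too many levels of symbolic links: " + p);
  }
}
{code}
With something like this, the FileSystem-side resolve(Path, FSLinkResolver) helper used in the
example above could simply delegate to resolver.resolve(p).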


> Create symbolic links in HDFS
> -----------------------------
>
>                 Key: HADOOP-4044
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4044
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: symLink1.patch, symLink1.patch, symLink4.patch, symLink5.patch
>
>
> HDFS should support symbolic links. A symbolic link is a special type of file that contains
> a reference to another file or directory in the form of an absolute or relative path and that
> affects pathname resolution. Programs which read or write to files named by a symbolic link
> will behave as if operating directly on the target file. However, archiving utilities can
> handle symbolic links specially and manipulate them directly.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

