hadoop-hdfs-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4019) FSShell should support creating symlinks
Date Tue, 21 Jan 2014 21:14:21 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13877861#comment-13877861 ]

Chris Nauroth commented on HDFS-4019:

Hi, Colin.  This patch mostly looks good to me.  I took it for a test run, and it worked as
advertised.  Could you please update FileSystemShell.apt.vm for the new command?

I did notice a discrepancy in behavior compared to the local file system ln implementations
I've used.  If link_path already exists and is a symlink pointing to a directory, then
the default behavior I've seen is to follow the symlink and create the new symlink underneath
that target directory.  See below for an example run on OS X that demonstrates it.

When I run through the same sequence on HDFS, the behavior is different.  Instead, the second
ln fails with a "file already exists" error.

The symlink-following behavior is typically disabled with the -n or -h flag (resulting in a
"file already exists" error), and you can additionally pass -f to force an overwrite of link_path
(resulting in successful replacement of the link with one pointing to the new target).  It seems
the current patch does the equivalent of ln -sn, without support for the -f flag.
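
To make the flag semantics concrete, here is a minimal sketch run against a local file system (not HDFS).  It assumes an ln that accepts -n, as the GNU and BSD/OS X implementations do; the directory and link names simply mirror the example session in this comment, and the temporary working directory is an assumption of the sketch.

```shell
#!/bin/sh
# Sketch: how -n and -f change "ln -s" when link_path is a symlink to a directory.
# Runs in a throwaway directory on the local file system.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir dir1 dir2
ln -s dir1 link1

# -n: do not follow link1. It already exists, so this is expected to fail
# with a "file exists" style error.
if ln -sn dir2 link1 2>/dev/null; then
  sn_result=replaced
else
  sn_result=failed
fi
after_sn=$(readlink link1)

# -f together with -n: replace link1 itself with a link to the new target.
ln -sfn dir2 link1
after_sfn=$(readlink link1)

echo "ln -sn:  $sn_result, link1 -> $after_sn"
echo "ln -sfn: link1 -> $after_sfn"
```

In neither case is anything created underneath dir1, which is the difference from the default (no -n) behavior.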

I can think of a few different ways we can handle this:
# Add more code in the shell to traverse {{linkPath}} if it exists and is a symlink.
# Don't code for this case, but force callers to specify -n, just like the patch currently
forces them to specify -s.
# Document the difference in behavior.
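
As a rough illustration of what the first option would need to decide, the following sketch computes where a default (no -n) ln -s would place the new link.  It is written against the local file system, not the HDFS shell code, and the function name effective_link_path is hypothetical.

```shell
#!/bin/sh
# effective_link_path TARGET LINK_PATH
# Prints the path at which a default "ln -s TARGET LINK_PATH" would create
# the link. Hypothetical helper, for illustration only.
effective_link_path() {
  target=$1
  link_path=$2
  # [ -d ] follows symlinks, so a symlink to a directory counts as a directory.
  if [ -d "$link_path" ]; then
    printf '%s/%s\n' "${link_path%/}" "$(basename "$target")"
  else
    printf '%s\n' "$link_path"
  fi
}

demo=$(mktemp -d)
mkdir "$demo/dir1" "$demo/dir2"
ln -s dir1 "$demo/link1"

# link1 is a symlink to a directory, so the new link lands underneath it.
resolved=$(effective_link_path dir2 "$demo/link1")
# newlink does not exist, so the link is created at that exact path.
unresolved=$(effective_link_path dir2 "$demo/newlink")

echo "$resolved"
echo "$unresolved"
```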

Additionally, I'd like to see support for a -f flag eventually.  I think we're going to need
different API support from the NameNode for this though, so that's probably going to be a
different patch.  ln -sfn can be a useful building block for atomically publishing changes
to a file.  For example, you could have readers access a file named "currentView", which is
really a symlink to the last published version of the file.  Then, a writer can stage an update
in a separate file and use ln -sfn to swap in the changes atomically.
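
The publishing pattern can be sketched on a local file system as follows.  The file names view.v1 and view.v2 are made up for the example; only "currentView" comes from the description above.  Note that some ln implementations carry out -f as an unlink followed by a new symlink call, so whether the swap is truly atomic depends on the implementation.

```shell
#!/bin/sh
# Sketch: publish updates by repointing a symlink that readers always open.
pub=$(mktemp -d)
cd "$pub" || exit 1

# Initial publication: readers open currentView, which points at view.v1.
echo "version 1" > view.v1
ln -s view.v1 currentView
first_read=$(cat currentView)

# The writer stages the update in a separate file, then swaps the link.
echo "version 2" > view.v2
ln -sfn view.v2 currentView
second_read=$(cat currentView)

echo "before swap: $first_read"
echo "after swap:  $second_read"
```

Readers never see a partially written file, because the staged file is complete before the link is repointed.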

Let me know your thoughts on this.  Thanks.

[chris@Chriss-MacBook-Pro:ttys006] lntest                                                
> mkdir dir1 dir2
[chris@Chriss-MacBook-Pro:ttys006] lntest                                                
> ln -s dir1 link1
[chris@Chriss-MacBook-Pro:ttys006] lntest                                                
> ln -s dir2 link1
[chris@Chriss-MacBook-Pro:ttys006] lntest                                                
> tree
├── dir1
│   └── dir2 -> dir2
├── dir2
└── link1 -> dir1

3 directories, 1 file

> FSShell should support creating symlinks
> ----------------------------------------
>                 Key: HDFS-4019
>                 URL: https://issues.apache.org/jira/browse/HDFS-4019
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: tools
>    Affects Versions: 2.0.3-alpha
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>            Priority: Minor
>         Attachments: HDFS-4019.001.patch, HDFS-4019.002.patch, HDFS-4019.003.patch
> FSShell should support creating symlinks.  This would allow users to create symlinks
> from the shell without having to write a Java program.
> One thing that makes this complicated is that FSShell currently uses FileSystem internally,
> and symlinks are currently only supported by the FileContext API.  So either FSShell would
> have to be ported to FileContext, or symlinks would have to be added to FileSystem.  Or perhaps
> we could open a FileContext only when symlinks were necessary, but that seems messy.

This message was sent by Atlassian JIRA
