hadoop-hdfs-issues mailing list archives

From "Andy Isaacson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3370) HDFS hardlink
Date Tue, 12 Jun 2012 22:15:43 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293965#comment-13293965 ]

Andy Isaacson commented on HDFS-3370:

> When users run "cp" in the linux file system against hard linked files, it will copy the bytes,

{{cp -a}} preserves hard links; {{cp -r}} breaks them (duplicates the bytes).
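
The distinction can be observed from the inode number and link count. The following is only a minimal sketch, assuming a local POSIX filesystem that supports hard links; the file names and the helper {{link_vs_copy}} are hypothetical and exist only for this illustration. It uses Python's {{os.link}} and {{shutil.copyfile}} rather than {{cp}} itself, and it says nothing about how HDFS would implement either operation.

```python
import os
import shutil
import tempfile


def link_vs_copy(workdir):
    """Create a file, hard link it, and byte-copy it; return the three stat results."""
    # File names are hypothetical, chosen only for this illustration.
    original = os.path.join(workdir, "original.txt")
    linked = os.path.join(workdir, "linked.txt")
    copied = os.path.join(workdir, "copied.txt")

    with open(original, "w") as f:
        f.write("shared data\n")

    os.link(original, linked)          # second directory entry for the same inode
    shutil.copyfile(original, copied)  # new inode holding a duplicate of the bytes

    return os.stat(original), os.stat(linked), os.stat(copied)


with tempfile.TemporaryDirectory() as d:
    st_orig, st_link, st_copy = link_vs_copy(d)
    # Same inode: the link shares the data with the original.
    print("link shares inode:", st_orig.st_ino == st_link.st_ino)
    # Different inode: the copy duplicated the bytes.
    print("copy shares inode:", st_orig.st_ino == st_copy.st_ino)
    print("link count of original:", st_orig.st_nlink)
```

A copy that preserves hard links keeps the first property across the copied set; one that breaks them ends up in the second situation.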

> I think we shall keep the same semantics here as well.

I don't think it's a good idea to pretend that we can or should preserve *every* corner case
of the semantics of POSIX hard links. The Unix hard link was originally a historical accident
of the inode/dentry structure of the filesystem, preserved because it's useful and has been
heavily relied upon by users of the Unix API. The implementation in something like ZFS or
btrfs is pretty far away from the original simplicity.

Since we don't have API compatibility with Unix and our underlying structure is deeply different,
it's a good idea to borrow the good ideas but take a practical view of where it makes sense
to diverge.

> HDFS hardlink
> -------------
>                 Key: HDFS-3370
>                 URL: https://issues.apache.org/jira/browse/HDFS-3370
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Hairong Kuang
>            Assignee: Liyin Tang
>         Attachments: HDFS-HardLink.pdf
> We'd like to add a new feature, hardlink, to HDFS that allows hardlinked files to share
> data without copying. Currently we will support hardlinking only closed files, but it could
> be extended to unclosed files as well.
> Among many potential use cases of the feature, the following two are primarily used in
> 1. This provides a lightweight way for applications like HBase to create a snapshot;
> 2. This also allows an application like Hive to move a table to a different directory
> without breaking currently running Hive queries.
