hadoop-hdfs-issues mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3370) HDFS hardlink
Date Wed, 13 Jun 2012 07:35:44 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13294229#comment-13294229 ]

Konstantin Shvachko commented on HDFS-3370:

> I would recommend finding a different approach to implementing snapshots than adding this feature.

I agree with Srivas: hard links seem easy in a single-NameNode architecture, but they are very
hard to support when the namespace is distributed. If the links to a file belong to different
nodes, you cannot just lock the entire namespace and do atomic cross-node linking / unlinking.
I also agree with Srivas that hard links in traditional file systems cause more problems than
the value they add.
Looking at the design document, I see that you create a sort of internal symlink, called INodeHardLinkFile,
pointing to HardLinkFileInfo, which represents the actual file. This can be modeled with symlinks
at the application (HBase) level without making any changes to HDFS.
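
To make the structure under discussion concrete, here is a minimal in-memory sketch of link entries that share one underlying file record. It is a toy model, not code from the attached design document or from HDFS: the class LinkTable, its methods, and the fields of SharedFile are all hypothetical, and SharedFile only stands in for the role HardLinkFileInfo plays in the design. The synchronized methods mimic the single namespace lock that makes link / unlink atomic on one NameNode; this is exactly the step that has no cheap equivalent once the links live on different nodes.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (hypothetical names): several paths referencing one shared file record. */
class LinkTable {
    /** Stands in for the shared record; the fields are assumptions for illustration. */
    private static final class SharedFile {
        final String data;
        int linkCount;
        SharedFile(String data) { this.data = data; this.linkCount = 1; }
    }

    // path -> shared record; several paths may map to the same record
    private final Map<String, SharedFile> links = new HashMap<>();

    /** Creates a new file with a single link to it. */
    public synchronized void create(String path, String data) {
        if (links.containsKey(path)) {
            throw new IllegalArgumentException("path exists: " + path);
        }
        links.put(path, new SharedFile(data));
    }

    /** Adds a second name for an existing file; no data is copied. */
    public synchronized void link(String existing, String newPath) {
        SharedFile f = links.get(existing);
        if (f == null) {
            throw new IllegalArgumentException("no such path: " + existing);
        }
        if (links.containsKey(newPath)) {
            throw new IllegalArgumentException("path exists: " + newPath);
        }
        f.linkCount++;
        links.put(newPath, f);
    }

    /** Removes one name; returns true when the last link is gone and the data can be freed. */
    public synchronized boolean unlink(String path) {
        SharedFile f = links.remove(path);
        if (f == null) {
            throw new IllegalArgumentException("no such path: " + path);
        }
        f.linkCount--;
        return f.linkCount == 0;
    }

    /** Reads the shared data through any of its names. */
    public synchronized String read(String path) {
        SharedFile f = links.get(path);
        if (f == null) {
            throw new IllegalArgumentException("no such path: " + path);
        }
        return f.data;
    }

    /** Number of names currently pointing at the file behind this path. */
    public synchronized int linkCount(String path) {
        SharedFile f = links.get(path);
        if (f == null) {
            throw new IllegalArgumentException("no such path: " + path);
        }
        return f.linkCount;
    }
}
```

In this sketch, a snapshot-style use would call create() once, then link() to add a second name under a snapshot directory; unlink() on the original name leaves the data readable through the snapshot name until that one is removed as well.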

I strongly discourage bringing this feature into HDFS.
Otherwise, please provide use cases which *cannot* be solved without it.
> HDFS hardlink
> -------------
>                 Key: HDFS-3370
>                 URL: https://issues.apache.org/jira/browse/HDFS-3370
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Hairong Kuang
>            Assignee: Liyin Tang
>         Attachments: HDFS-HardLink.pdf
> We'd like to add a new feature, hardlink, to HDFS that allows hardlinked files to share
data without copying. Currently we will support hardlinking only closed files, but it could
be extended to unclosed files as well.
> Among many potential use cases of the feature, the following two are primarily used in
> 1. This provides a lightweight way for applications like hbase to create a snapshot;
> 2. This also allows an application like Hive to move a table to a different directory
without breaking current running hive queries.
