hadoop-hdfs-issues mailing list archives

From "Liyin Tang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3370) HDFS hardlink
Date Tue, 08 May 2012 18:55:52 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13270715#comment-13270715
] 

Liyin Tang commented on HDFS-3370:
----------------------------------

@Daryn Sharp: very good comments :)
1) Quota is the trickiest part of hard links.

For nsquota usage, the count is increased when a hardlink is created and decreased when a
hardlink is removed.

For dsquota usage, creating or removing a link only changes the usage of directories that
are not a common ancestor of that link and another file linked to the same data.
For example, "ln /root/dir1/file1 /root/dir1/file2": there is no need to increase the ds
quota usage of /root/dir1 when creating the link file, file2.
Likewise, "rm /root/dir1/file1": there is no need to decrease the ds quota usage of
/root/dir1 when removing the original source file, file1.

The bottom line is that there is no case where we need to increase any dsquota during a file
removal. If the directory is a common ancestor directory, no dsquota needs to be updated;
otherwise the dsquota was already updated at the time the hard link was created.
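
To make the accounting rule above concrete, here is a minimal sketch in Python. It is not
the HDFS implementation and not taken from the design doc: the QuotaModel class, its method
names, the file id "blk" and the size of 100 are all hypothetical, paths are assumed to be
absolute, and "common ancestor" is read as "a directory that is already an ancestor of
another link to the same data".

```python
from collections import defaultdict
from posixpath import dirname


def ancestors(path):
    """Return every ancestor directory of an absolute path."""
    result = set()
    current = dirname(path)
    while True:
        result.add(current)
        if current in ("/", ""):
            return result
        current = dirname(current)


class QuotaModel:
    """Hypothetical in-memory model of nsquota/dsquota usage with hardlinks."""

    def __init__(self):
        self.links = defaultdict(set)  # file id -> paths linked to that data
        self.size = {}                 # file id -> disk space consumed
        self.ns = defaultdict(int)     # directory -> namespace usage
        self.ds = defaultdict(int)     # directory -> disk space usage

    def _covered(self, file_id, exclude):
        """Directories that are ancestors of some other link to this data."""
        covered = set()
        for path in self.links[file_id]:
            if path != exclude:
                covered |= ancestors(path)
        return covered

    def add_link(self, file_id, path, size=None):
        if size is not None:
            self.size[file_id] = size
        covered = self._covered(file_id, exclude=path)
        self.links[file_id].add(path)
        for d in ancestors(path):
            self.ns[d] += 1                      # every link counts for nsquota
            if d not in covered:                 # charge dsquota once per directory
                self.ds[d] += self.size[file_id]

    def remove_link(self, file_id, path):
        self.links[file_id].discard(path)
        covered = self._covered(file_id, exclude=path)
        for d in ancestors(path):
            self.ns[d] -= 1
            if d not in covered:                 # release only if no link remains below d
                self.ds[d] -= self.size[file_id]


model = QuotaModel()
model.add_link("blk", "/root/dir1/file1", size=100)
model.add_link("blk", "/root/dir1/file2")      # same directory: ds usage unchanged
model.remove_link("blk", "/root/dir1/file1")   # file2 remains: ds usage unchanged
print(model.ns["/root/dir1"], model.ds["/root/dir1"])  # prints "1 100"
```

Note that remove_link only ever subtracts from ds, which matches the point that no dsquota
increase is needed during removal.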


2) You are right that each blockInfo of the linked files needs to be updated when the original
file is deleted. I shall update the design doc to explain this part explicitly and in detail.

3) Currently, at least for V1, we shall support hardlinking only for closed files and won't
support append operations against linked files, but this could be extended in the
future.

4) Very good point that hardlinked files shall respect the max replication factor. From my
understanding, setReplication only updates the in-memory metadata, and the name node
increases the actual replication in the background.


                
> HDFS hardlink
> -------------
>
>                 Key: HDFS-3370
>                 URL: https://issues.apache.org/jira/browse/HDFS-3370
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Hairong Kuang
>            Assignee: Liyin Tang
>         Attachments: HDFS-HardLinks.pdf
>
>
> We'd like to add a new feature, hardlink, to HDFS that allows hardlinked files to share
> data without copying. Currently we will support hardlinking only closed files, but it could
> be extended to unclosed files as well.
> Among many potential use cases of the feature, the following two are primarily used at
> Facebook:
> 1. This provides a lightweight way for applications like HBase to create a snapshot;
> 2. This also allows an application like Hive to move a table to a different directory
> without breaking currently running Hive queries.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
