hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3370) HDFS hardlink
Date Tue, 12 Jun 2012 09:46:43 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293480#comment-13293480
] 

Lars Hofhansl commented on HDFS-3370:
-------------------------------------

Thanks Liyin. Sounds good.

One thought that occurred to me since: We need to think about copy semantics. For example
how will distcp handle this? It shouldn't create a new copy of a file for each hardlink that
points to it, but rather just copy it at most once and create hardlinks for each following
reference. But then what about multiple distcp commands that happen to cover hardlinks to
the same file? I suppose in the case we cannot be expected to avoid multiple copies of the
same file (but at most one copy for each invocation of distcp, and only if the distcp happens
to cover a different hardlink).

                
> HDFS hardlink
> -------------
>
>                 Key: HDFS-3370
>                 URL: https://issues.apache.org/jira/browse/HDFS-3370
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Hairong Kuang
>            Assignee: Liyin Tang
>         Attachments: HDFS-HardLink.pdf
>
>
> We'd like to add a new feature hardlink to HDFS that allows harlinked files to share
data without copying. Currently we will support hardlinking only closed files, but it could
be extended to unclosed files as well.
> Among many potential use cases of the feature, the following two are primarily used in
facebook:
> 1. This provides a lightweight way for applications like hbase to create a snapshot;
> 2. This also allows an application like Hive to move a table to a different directory
without breaking current running hive queries.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message