hadoop-hdfs-issues mailing list archives

From "Jesse Yates (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3370) HDFS hardlink
Date Mon, 18 Jun 2012 22:18:42 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13396311#comment-13396311 ]

Jesse Yates commented on HDFS-3370:
-----------------------------------

bq. Hardlinks are of similar nature. They are hard to support if the namespace is distributed.


FWIW, Ceph also punts on distributed hardlinks and just puts them on a single node "because
they are not commonly used and not likely to be hot or large" (paraphrasing). Conceptually,
you could do it with 2PC across nodes, which should be fine as long as the namespace isn't
sharded too highly, e.g. across 1000s or more of nodes hosting hardlink information (again,
there shouldn't be too many hardlinks).
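
To make the 2PC idea concrete, here is a minimal single-threaded sketch. It assumes the inode (link count) lives on one namespace shard and the new path on another. The names `Shard`, `prepareAddLink`, `prepareAddPath`, and `link` are hypothetical and are not part of HDFS or Ceph. It leaves out everything that makes real 2PC hard: durable logging, locking of prepared state, timeouts, and coordinator recovery.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of creating a hardlink with two-phase commit across two shards. */
class HardlinkTwoPhaseCommit {

    /** Stand-in for a namespace shard; it owns some inodes and some paths. */
    static class Shard {
        final Map<String, Integer> linkCounts = new HashMap<>(); // fileId -> number of links
        final Map<String, String> paths = new HashMap<>();       // path -> fileId
        // txId -> changes staged in phase 1 and applied only on commit
        private final Map<String, List<Runnable>> staged = new HashMap<>();

        /** Phase 1 on the shard owning the inode: vote yes only if the file exists. */
        boolean prepareAddLink(String txId, String fileId) {
            if (!linkCounts.containsKey(fileId)) {
                return false;
            }
            staged.computeIfAbsent(txId, k -> new ArrayList<>())
                  .add(() -> linkCounts.merge(fileId, 1, Integer::sum));
            return true;
        }

        /** Phase 1 on the shard owning the new path: vote yes only if the path is free. */
        boolean prepareAddPath(String txId, String path, String fileId) {
            if (paths.containsKey(path)) {
                return false;
            }
            staged.computeIfAbsent(txId, k -> new ArrayList<>())
                  .add(() -> paths.put(path, fileId));
            return true;
        }

        /** Phase 2, commit: apply everything staged for this transaction. */
        void commit(String txId) {
            List<Runnable> ops = staged.remove(txId);
            if (ops != null) {
                ops.forEach(Runnable::run);
            }
        }

        /** Phase 2, abort: drop everything staged for this transaction. */
        void abort(String txId) {
            staged.remove(txId);
        }
    }

    /** Coordinator: commits on both shards only if both voted yes, otherwise aborts both. */
    static boolean link(String txId, Shard inodeShard, String fileId,
                        Shard pathShard, String newPath) {
        boolean allYes = inodeShard.prepareAddLink(txId, fileId)
                && pathShard.prepareAddPath(txId, newPath, fileId);
        if (allYes) {
            inodeShard.commit(txId);
            pathShard.commit(txId);
        } else {
            inodeShard.abort(txId);
            pathShard.abort(txId);
        }
        return allYes;
    }
}
```

In this sketch each link creation costs a prepare and a commit round per participating shard, so the overhead should stay small if hardlinks are rare and each one touches only a couple of shards.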
 

From an HBase perspective, the hardlink count _could_ become large (roughly equal to the number of HFiles),
but that isn't going to be near the total number of files currently in HDFS. Maybe punt
on the issue until it becomes a problem, keeping it flexible behind an interface?
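
As a rough illustration of what "behind an interface" could look like, here is a sketch. `HardlinkStore` and `SingleNodeHardlinkStore` are hypothetical names, not existing HDFS classes. The single-node implementation mirrors the Ceph-style approach of keeping all hardlink metadata in one place, and a distributed implementation could be swapped in later without changing callers.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical abstraction over where hardlink metadata is kept. */
interface HardlinkStore {
    /** Records that path is a new link to fileId; fails if the path is already taken. */
    void addLink(String path, String fileId);

    /** Returns the file id the path links to, or null if the path is unknown. */
    String resolve(String path);

    /** Returns how many links currently point at fileId. */
    int linkCount(String fileId);
}

/** Simplest implementation: all hardlink metadata in one in-memory map on a single node. */
class SingleNodeHardlinkStore implements HardlinkStore {
    private final Map<String, String> pathToFile = new HashMap<>();
    private final Map<String, Integer> counts = new HashMap<>();

    @Override
    public void addLink(String path, String fileId) {
        if (pathToFile.putIfAbsent(path, fileId) != null) {
            throw new IllegalStateException("path already exists: " + path);
        }
        counts.merge(fileId, 1, Integer::sum);
    }

    @Override
    public String resolve(String path) {
        return pathToFile.get(path);
    }

    @Override
    public int linkCount(String fileId) {
        return counts.getOrDefault(fileId, 0);
    }
}
```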
                
> HDFS hardlink
> -------------
>
>                 Key: HDFS-3370
>                 URL: https://issues.apache.org/jira/browse/HDFS-3370
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Hairong Kuang
>            Assignee: Liyin Tang
>         Attachments: HDFS-HardLink.pdf
>
>
> We'd like to add a new hardlink feature to HDFS that allows hardlinked files to share
data without copying. Currently we will support hardlinking only closed files, but it could
be extended to unclosed files as well.
> Among the many potential use cases of the feature, the following two are the primary ones at
Facebook:
> 1. This provides a lightweight way for applications like HBase to create a snapshot;
> 2. This also allows an application like Hive to move a table to a different directory
without breaking currently running Hive queries.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
