Date: Wed, 13 Jun 2012 18:11:43 +0000 (UTC)
From: "Karthik Ranganathan (JIRA)"
To: hdfs-issues@hadoop.apache.org
Message-ID: <384713688.13389.1339611103041.JavaMail.jiratomcat@issues-vm>
In-Reply-To: <2068361121.27846.1336151688971.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HDFS-3370) HDFS hardlink

[ https://issues.apache.org/jira/browse/HDFS-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13294597#comment-13294597 ]

Karthik Ranganathan commented on HDFS-3370:
-------------------------------------------

@Konstantin:
<< This can be modeled by symlinks on the application (HBase) level without making any changes in HDFS.
>> Modeling this on top of HBase would essentially mean implementing the hardlink feature at the HBase level for all of its files. It would also mean that every application needing a similar feature has to use symbolic links to implement hardlinks itself. We have already implemented this at the underlying filesystem level for HBase backups, except that on disk/node failure the re-replication would increase the total size of data in the cluster, which was getting hard to provision for. Hence the natural progression towards putting it in HDFS.

> HDFS hardlink
> -------------
>
>                 Key: HDFS-3370
>                 URL: https://issues.apache.org/jira/browse/HDFS-3370
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Hairong Kuang
>            Assignee: Liyin Tang
>         Attachments: HDFS-HardLink.pdf
>
>
> We'd like to add a new feature, hardlink, to HDFS that allows hardlinked files to share data without copying. Currently we will support hardlinking only closed files, but it could be extended to unclosed files as well.
> Among many potential use cases of the feature, the following two are the primary ones at Facebook:
> 1. This provides a lightweight way for applications like HBase to create a snapshot;
> 2. This also allows an application like Hive to move a table to a different directory without breaking currently running Hive queries.
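
The difference between the symlink approach and the hardlink semantics discussed above can be illustrated on an ordinary local filesystem. The sketch below is only an analogy and does not use any HDFS API: it assumes a POSIX-like local filesystem where Python's `os.link` and `os.symlink` are permitted, and the file names (`hfile`, `hfile.hardlink`, `hfile.symlink`) and the `demo_links` helper are hypothetical, chosen for illustration.

```python
import os
import tempfile


def demo_links(workdir):
    """Contrast a symlink and a hard link once the original name is removed."""
    # Hypothetical file names, used only for this illustration.
    original = os.path.join(workdir, "hfile")
    hard = os.path.join(workdir, "hfile.hardlink")
    soft = os.path.join(workdir, "hfile.symlink")

    with open(original, "w") as f:
        f.write("table data")

    os.link(original, hard)      # second name for the same data, nothing is copied
    os.symlink(original, soft)   # pointer to the original *name*, not to the data

    same_inode = os.stat(original).st_ino == os.stat(hard).st_ino
    link_count = os.stat(original).st_nlink

    os.remove(original)          # e.g. the application deletes or moves the file

    with open(hard) as f:
        hard_content = f.read()  # data is still reachable through the hard link
    symlink_resolves = os.path.exists(soft)  # follows the link; False if dangling

    return same_inode, link_count, hard_content, symlink_resolves


with tempfile.TemporaryDirectory() as d:
    result = demo_links(d)
```

After the original name is removed, the hard link still reads the data while the symlink dangles, which is why an application that builds "hardlinks" out of symlinks has to track and protect the target files itself.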