Date: Tue, 26 Mar 2013 23:49:15 +0000 (UTC)
From: "Todd Lipcon (JIRA)"
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-4489) Use InodeID as an identifier of a file in HDFS protocols and APIs

    [ https://issues.apache.org/jira/browse/HDFS-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614692#comment-13614692 ]

Todd Lipcon commented on HDFS-4489:
-----------------------------------

bq. Inode size is ~180 bytes and this proposal adds 16-24 bytes per Inode.

How is this calculated? I see the following 5 fields:
{code}
  private byte[] name = null;
  private long permission = 0L;
  protected INodeDirectory parent = null;
  protected long modificationTime = 0L;
  protected long accessTime = 0L;
{code}
for a total of 40 bytes on a 64-bit JVM.
So, adding 16-24 bytes is a pretty substantial increase in memory use.

I agree with ATM that this should go on a branch, since it's fairly invasive. Once the branch is working, we can evaluate the benefit of the new feature against the measured cost (both memory and the additional CPU needed to manage this new structure).

> Use InodeID as an identifier of a file in HDFS protocols and APIs
> -----------------------------------------------------------------
>
>                 Key: HDFS-4489
>                 URL: https://issues.apache.org/jira/browse/HDFS-4489
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>            Reporter: Brandon Li
>            Assignee: Brandon Li
>
> Using InodeID to uniquely identify a file has several benefits. Here are a few of them:
> 1. Uniquely identifying a file across renames; related JIRAs include HDFS-4258 and HDFS-4437.
> 2. Modification checks in tools like distcp. Since a file could have been replaced or renamed to, the combination of file name and size is not reliable, but the combination of file id and size is unique.
> 3. Id-based protocol support (e.g., NFS).
> 4. Making the pluggable block placement policy use fileid instead of filename (HDFS-385).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
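
For reference, the 40-byte figure in the comment above can be reproduced with a rough sketch like the one below. It is not taken from the HDFS source: the class and method names are hypothetical, and it assumes 8 bytes for each long and each reference (a 64-bit JVM without compressed references), ignoring object headers, alignment padding, and the contents of the name array. It also puts the proposed 16-24 bytes next to both the 40-byte field total and the "~180 bytes" quoted from the proposal.

```java
// Rough sketch only: reproduces the per-field arithmetic from the comment.
// Assumption: each long and each reference occupies 8 bytes (64-bit JVM,
// no compressed references). Object headers, alignment padding, and the
// bytes held by the name array itself are deliberately not counted.
final class InodeMemoryEstimate {
    // Assumed size of one long or one reference, in bytes.
    static final int BYTES_PER_SLOT = 8;
    // Fields listed in the comment: name, permission, parent,
    // modificationTime, accessTime.
    static final int LISTED_FIELDS = 5;
    // The "~180 bytes" per-inode figure quoted from the proposal.
    static final int QUOTED_INODE_BYTES = 180;

    // Total bytes for the listed fields under the assumption above.
    static int listedFieldBytes() {
        return LISTED_FIELDS * BYTES_PER_SLOT;
    }

    // addedBytes expressed as a percentage of baseBytes.
    static double percentOf(int addedBytes, int baseBytes) {
        return 100.0 * addedBytes / baseBytes;
    }

    public static void main(String[] args) {
        int base = listedFieldBytes();
        // The proposal's range: 16 to 24 additional bytes per inode.
        for (int added : new int[] {16, 24}) {
            System.out.printf(
                "+%d bytes: %.1f%% of the %d-byte field total, %.1f%% of the quoted ~%d bytes%n",
                added, percentOf(added, base), base,
                percentOf(added, QUOTED_INODE_BYTES), QUOTED_INODE_BYTES);
        }
    }
}
```

Which baseline is the right one depends on what the ~180-byte figure includes, which is the question the comment raises; a measurement on the branch would settle it better than this kind of estimate.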