hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4489) Use InodeID as as an identifier of a file in HDFS protocols and APIs
Date Tue, 26 Mar 2013 23:49:15 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614692#comment-13614692
] 

Todd Lipcon commented on HDFS-4489:
-----------------------------------

bq. Inode size is ~180 bytes and this proposal adds 16-24 bytes per Inode.

How is this calculated? I see the following 5 fields:

{code}
  private byte[] name = null;
  private long permission = 0L;
  protected INodeDirectory parent = null;
  protected long modificationTime = 0L;
  protected long accessTime = 0L;
{code}

for a total of 40 bytes on a 64-bit JVM. So, adding 16-24 bytes is a pretty substantial new
memory use.

I agree with ATM that this should go on a branch since it's fairly invasive. Once the branch
is working, we can evaluate the benefit of the new feature vs the measured cost (both memory
and additional CPU to manage this new structure)
                
> Use InodeID as as an identifier of a file in HDFS protocols and APIs
> --------------------------------------------------------------------
>
>                 Key: HDFS-4489
>                 URL: https://issues.apache.org/jira/browse/HDFS-4489
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>            Reporter: Brandon Li
>            Assignee: Brandon Li
>
> The benefit of using InodeID to uniquely identify a file can be multiple folds. Here
are a few of them:
> 1. uniquely identify a file cross rename, related JIRAs include HDFS-4258, HDFS-4437.
> 2. modification checks in tools like distcp. Since a file could have been replaced or
renamed to, the file name and size combination is no t reliable, but the combination of file
id and size is unique.
> 3. id based protocol support (e.g., NFS)
> 4. to make the pluggable block placement policy use fileid instead of filename (HDFS-385).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message