hadoop-hdfs-issues mailing list archives

From "Suresh Srinivas (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-4489) Use InodeID as an identifier of a file in HDFS protocols and APIs
Date Tue, 30 Apr 2013 14:32:16 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644907#comment-13644907 ]

Suresh Srinivas edited comment on HDFS-4489 at 4/30/13 2:31 PM:
----------------------------------------------------------------

Given the above tests, here are all the issues that have been brought up:
# Introducing an incompatible change
#* This is not a major incompatibility. As I said earlier, creating a file or directory named
/.reserved is not allowed. That said, this should get into 2.0.5, given that the main goal of
that release is compatibility.
# This patch could be destabilizing
#* This patch adds an inode map and support for a path scheme that allows addressing files
by inode. Most of the code added in this patch supports the new addressing mechanism, along
with the extensive unit tests associated with it. The regular code path should be largely
unaffected, with the exception of adding and deleting entries in the inode map. Please bring
up any concerns that I might have overlooked.
# Performance impact - based on the results, there is very little performance impact. I
have two options:
#* The difference observed in microbenchmarks amounts to a much smaller difference in a real
system, and it is associated with only a few write operations such as create. Hence, is it
acceptable?
#* Make further optimizations to reduce the size of the synchronized section, based on the
mechanism added in this patch. [~nroberts], if you feel this is important, I will undertake
the work of optimizing this. [~daryn] had also expressed interest in it, but I am not sure
whether he has the bandwidth.

Given this, I would like to merge this into branch-2.0.5. I hope the concerns people have
expressed are addressed.
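
To make the scope of the change concrete, here is a minimal sketch of the idea described above: a map from inode id to path that is updated only on create, rename, and delete, plus a reserved path prefix that resolves an id back to a file. This is not the code in the patch. The class name, the method names, the starting id, and the exact prefix /.reserved/.inodes/ are assumptions made for illustration; the comment itself only establishes that /.reserved cannot be created by users.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch, not HDFS code: an id-to-path map plus a reserved-path resolver. */
final class InodeMapSketch {
    // Assumed layout of inode-addressed paths; only "/.reserved" is confirmed by the comment.
    static final String RESERVED_ROOT = "/.reserved";
    static final String INODE_PREFIX = RESERVED_ROOT + "/.inodes/";

    private final Map<Long, String> idToPath = new HashMap<>();
    private long nextId = 1000; // arbitrary starting id, for illustration only

    private static void checkNotReserved(String path) {
        if (path.equals(RESERVED_ROOT) || path.startsWith(RESERVED_ROOT + "/")) {
            throw new IllegalArgumentException("reserved path: " + path);
        }
    }

    /** Creates an entry and returns its id; this put is the extra work on the write path. */
    synchronized long create(String path) {
        checkNotReserved(path);
        long id = nextId++;
        idToPath.put(id, path);
        return id;
    }

    /** Moves an existing entry; the id stays the same across the rename. */
    synchronized void rename(long id, String newPath) {
        checkNotReserved(newPath);
        if (idToPath.replace(id, newPath) == null) {
            throw new IllegalArgumentException("unknown inode id: " + id);
        }
    }

    /** Removes the entry for a deleted file. */
    synchronized void delete(long id) {
        idToPath.remove(id);
    }

    /** Resolves a regular or inode-addressed path to the current regular path, or null. */
    synchronized String resolve(String path) {
        if (path.startsWith(INODE_PREFIX)) {
            long id = Long.parseLong(path.substring(INODE_PREFIX.length()));
            return idToPath.get(id);
        }
        // Linear scan is acceptable in a sketch; a real namespace would walk the directory tree.
        return idToPath.containsValue(path) ? path : null;
    }
}
```

In this sketch, read-only lookups by regular path never touch the id bookkeeping beyond a lookup, which mirrors the claim that only operations such as create and delete pay the additional cost.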
                
> Use InodeID as an identifier of a file in HDFS protocols and APIs
> -----------------------------------------------------------------
>
>                 Key: HDFS-4489
>                 URL: https://issues.apache.org/jira/browse/HDFS-4489
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>            Reporter: Brandon Li
>            Assignee: Brandon Li
>             Fix For: 2.0.5-beta
>
>         Attachments: 4434.optimized.patch
>
>
> The benefits of using InodeID to uniquely identify a file are manifold. Here are a few of them:
> 1. uniquely identifying a file across renames; related JIRAs include HDFS-4258 and HDFS-4437.
> 2. modification checks in tools like distcp. Since a file could have been replaced or
> renamed to, the combination of file name and size is not reliable, but the combination of
> file id and size is unique.
> 3. id-based protocol support (e.g., NFS)
> 4. making the pluggable block placement policy use fileid instead of filename (HDFS-385).
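
As an illustration of the second benefit in the list above, the following sketch contrasts a name-and-size comparison with an id-and-size comparison when a file is replaced by a different file of the same length. The Status class and its fields are hypothetical and are not the HDFS FileStatus API or anything taken from distcp.

```java
/** Hypothetical sketch of a distcp-style "has this file changed?" check. */
final class FileIdCheckSketch {
    /** Illustrative status record; field names are assumptions, not an HDFS type. */
    static final class Status {
        final String path;
        final long fileId;
        final long length;

        Status(String path, long fileId, long length) {
            this.path = path;
            this.fileId = fileId;
            this.length = length;
        }
    }

    /** Name-based check: fooled when a different file lands at the same path with the same size. */
    static boolean sameByNameAndSize(Status a, Status b) {
        return a.path.equals(b.path) && a.length == b.length;
    }

    /** Id-based check: a replaced file gets a new id, so the replacement is detected. */
    static boolean sameByIdAndSize(Status a, Status b) {
        return a.fileId == b.fileId && a.length == b.length;
    }
}
```

For example, if /data/part-0 with id 1001 and length 4096 is deleted and another 4096-byte file with id 1002 is renamed into its place, sameByNameAndSize reports the two as the same file, while sameByIdAndSize does not.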

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
