hadoop-hdfs-issues mailing list archives

From "Suresh Srinivas (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-4489) Use InodeID as as an identifier of a file in HDFS protocols and APIs
Date Thu, 02 May 2013 06:26:16 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13644898#comment-13644898 ]

Suresh Srinivas edited comment on HDFS-4489 at 5/2/13 6:26 AM:
---------------------------------------------------------------

Summary of the test results:
# File create tests - perform additional reserved name processing, inode map addition and a reserved
name check. This is where the patch does the most additional work.
#* In the micro benchmark, which just calls the create-file-related methods, the time went from
19235.8 to 19789.2, a difference of roughly 2.8%. Turning off the map reduces this further
to 1.3%. The patch moves splitting paths into components outside the lock. Based on this,
further optimizations are possible that improve throughput by shrinking the synchronized sections.
With those optimizations, running times could end up much smaller than they are
today.
#* I would also point out that this is a micro benchmark. The % difference observed here
will be dwarfed by RPC times, network round-trip time, etc. The system will also spend time
on other operations, which should not be affected by this patch.
# File delete tests - perform reserved name processing and only inode map deletion.
#* There is very little difference in the benchmark results.
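
To illustrate the point about splitting paths outside the lock, here is a minimal sketch. It is not the code from the patch: the class NamespaceSketch, its fields, and the method names are hypothetical, and a plain HashMap stands in for the namespace. The idea it shows is only that path splitting is a pure function of its argument, so it can run before the write lock is taken and the synchronized section covers just the map lookup and update.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a namespace; not the actual HDFS namenode classes.
final class NamespaceSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<String, Long> inodeIdsByPath = new HashMap<>();
    private long nextInodeId = 1000L; // arbitrary starting id for this sketch

    /** Pure function of its argument: touches no shared state, so no lock is needed. */
    static String[] splitPath(String path) {
        if (path == null || !path.startsWith("/")) {
            throw new IllegalArgumentException("absolute path required: " + path);
        }
        String trimmed = path.substring(1);
        return trimmed.isEmpty() ? new String[0] : trimmed.split("/");
    }

    /** Returns the id for the path, assigning a new one if the path is not yet known. */
    long createFile(String path) {
        // Done before acquiring the lock, so other threads are not blocked on it.
        String[] components = splitPath(path);

        lock.writeLock().lock();
        try {
            // Only the shared-state work stays inside the synchronized section.
            String key = "/" + String.join("/", components);
            Long existing = inodeIdsByPath.get(key);
            if (existing != null) {
                return existing;
            }
            long id = nextInodeId++;
            inodeIdsByPath.put(key, id);
            return id;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Under this arrangement, the time spent holding the lock no longer depends on how long the string splitting takes, which is the kind of reduction in synchronized work described above.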
                
                  
> Use InodeID as as an identifier of a file in HDFS protocols and APIs
> --------------------------------------------------------------------
>
>                 Key: HDFS-4489
>                 URL: https://issues.apache.org/jira/browse/HDFS-4489
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>            Reporter: Brandon Li
>            Assignee: Brandon Li
>             Fix For: 2.0.5-beta
>
>         Attachments: 4434.optimized.patch
>
>
> The benefits of using InodeID to uniquely identify a file are manifold. Here are a few of them:
> 1. uniquely identifying a file across renames; related JIRAs include HDFS-4258 and HDFS-4437.
> 2. modification checks in tools like distcp. Since a file could have been replaced or renamed to, the combination of file name and size is not reliable, but the combination of file id and size is unique.
> 3. id-based protocol support (e.g., NFS)
> 4. making the pluggable block placement policy use fileid instead of filename (HDFS-385).
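
As a rough illustration of item 2 in the description, the sketch below compares two observations of a file, first by name and size and then by file id and size. The FileSnapshot class and its fields are hypothetical and are not distcp's actual logic or an HDFS API; the ids and sizes are made-up values chosen only to show the replaced-file case.

```java
// Hypothetical record of what a copy tool might remember about a file.
final class FileSnapshot {
    final String path;
    final long fileId;   // assumed to stay with the file across renames
    final long length;

    FileSnapshot(String path, long fileId, long length) {
        this.path = path;
        this.fileId = fileId;
        this.length = length;
    }

    /** Name-and-size check: cannot tell a replaced file from an unchanged one. */
    static boolean sameByNameAndSize(FileSnapshot a, FileSnapshot b) {
        return a.path.equals(b.path) && a.length == b.length;
    }

    /** Id-and-size check: a replacement file carries a different id. */
    static boolean sameByIdAndSize(FileSnapshot a, FileSnapshot b) {
        return a.fileId == b.fileId && a.length == b.length;
    }

    public static void main(String[] args) {
        // The file seen at the time of the first copy.
        FileSnapshot before = new FileSnapshot("/data/report.csv", 1001L, 4096L);
        // A different file of the same size later renamed onto the same path.
        FileSnapshot after = new FileSnapshot("/data/report.csv", 2002L, 4096L);

        System.out.println("name+size says unchanged: " + sameByNameAndSize(before, after));
        System.out.println("id+size says unchanged:   " + sameByIdAndSize(before, after));
    }
}
```

In this made-up case the name-and-size check reports the file as unchanged even though it was replaced, while the id-and-size check reports a change.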

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
