hadoop-hdfs-dev mailing list archives

From "Tomasz Nykiel (Created) (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-2477) Optimize computing the diff between a block report and the namenode state.
Date Thu, 20 Oct 2011 04:10:10 GMT
Optimize computing the diff between a block report and the namenode state.

                 Key: HDFS-2477
                 URL: https://issues.apache.org/jira/browse/HDFS-2477
             Project: Hadoop HDFS
          Issue Type: Sub-task
          Components: name-node
            Reporter: Tomasz Nykiel

When a block report is processed at the NN, BlockManager.reportDiff traverses all blocks
contained in the report. Each block that is also present in the corresponding datanode
descriptor is moved to the head of that datanode's block list.

With HDFS-395, the vast majority of the blocks in a report are also present in the datanode
descriptor, which means that almost every block in the report has to be moved to the
head of the list.

Currently this operation is performed by DatanodeDescriptor.moveBlockToHead, which removes
the block from the list and then re-inserts it. In this process, findDatanode is called several
times (six times per moveBlockToHead call, as far as I recall). findDatanode is relatively
expensive, since it scans the triplets linearly to locate the given datanode.
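
To illustrate why each lookup costs a scan, here is a minimal sketch of a triplets-style
layout and a linear findDatanode. It is a simplified stand-in, not the actual HDFS code:
the class name SketchBlock and the helpers setDatanode and capacity are hypothetical, and
the slot layout shown is an assumption made for the example.

```java
/** Simplified stand-in for a block's replica bookkeeping (hypothetical, not the HDFS class). */
class SketchBlock {
    // Assumed layout for replica slot i:
    //   triplets[3*i]     = the datanode holding this replica
    //   triplets[3*i + 1] = previous block in that datanode's block list
    //   triplets[3*i + 2] = next block in that datanode's block list
    final Object[] triplets;

    SketchBlock(int replication) {
        triplets = new Object[3 * replication];
    }

    /** Number of replica slots in this block. */
    int capacity() {
        return triplets.length / 3;
    }

    /** Records that the given datanode holds the replica in slot index. */
    void setDatanode(int index, Object datanode) {
        triplets[3 * index] = datanode;
    }

    /** Linear scan over the slots; returns the slot index of the datanode, or -1 if absent. */
    int findDatanode(Object datanode) {
        for (int i = 0; i < capacity(); i++) {
            if (triplets[3 * i] == datanode) {
                return i;
            }
        }
        return -1;
    }
}
```

Every list operation that needs a block's previous or next pointer for a given datanode has
to know the slot index first, which is why repeated findDatanode calls add up.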

This patch memoizes the result of findDatanode, which saves 2 of these findDatanode
calls. Our experiments show that this speeds up reportDiff (which is executed under
the write lock) by around 15%. Currently, with HDFS-395, reportDiff accounts for almost 100%
of the block report processing time.
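
One way such memoization could look is sketched below: the block's own slot index is
looked up once and then passed to both the remove and the insert step. This is only an
illustration under assumptions, not the patch itself. The class MemoBlock, the methods
addTo and moveToHead, the index-taking listRemove/listInsert signatures, and the findCalls
counter are all hypothetical names introduced for the example.

```java
/** Hypothetical sketch of a block threaded into per-datanode doubly linked lists. */
class MemoBlock {
    static int findCalls = 0;      // instrumentation for this sketch only
    final Object[] triplets;       // slot i: [datanode, previous block, next block]

    MemoBlock(int replication) {
        triplets = new Object[3 * replication];
    }

    /** Linear scan for the slot that belongs to the given datanode, or -1. */
    int findDatanode(Object dn) {
        findCalls++;
        for (int i = 0; i < triplets.length / 3; i++) {
            if (triplets[3 * i] == dn) return i;
        }
        return -1;
    }

    MemoBlock getPrevious(int idx) { return (MemoBlock) triplets[3 * idx + 1]; }
    MemoBlock getNext(int idx)     { return (MemoBlock) triplets[3 * idx + 2]; }
    void setPrevious(int idx, MemoBlock b) { triplets[3 * idx + 1] = b; }
    void setNext(int idx, MemoBlock b)     { triplets[3 * idx + 2] = b; }

    /** Inserts this block at the head of dn's list; idx is this block's known slot for dn. */
    MemoBlock listInsert(MemoBlock head, Object dn, int idx) {
        setPrevious(idx, null);
        setNext(idx, head);
        if (head != null) head.setPrevious(head.findDatanode(dn), this);
        return this;
    }

    /** Unlinks this block from dn's list; idx is this block's known slot for dn. */
    MemoBlock listRemove(MemoBlock head, Object dn, int idx) {
        MemoBlock prev = getPrevious(idx);
        MemoBlock next = getNext(idx);
        setPrevious(idx, null);
        setNext(idx, null);
        if (prev != null) prev.setNext(prev.findDatanode(dn), next);
        if (next != null) next.setPrevious(next.findDatanode(dn), prev);
        return head == this ? next : head;
    }

    /** Registers dn in the first free slot (the sketch assumes one exists) and links the block. */
    MemoBlock addTo(MemoBlock head, Object dn) {
        int idx = 0;
        while (triplets[3 * idx] != null) idx++;
        triplets[3 * idx] = dn;
        return listInsert(head, dn, idx);
    }

    /** Moves this block to the head of dn's list, looking up its own slot only once. */
    MemoBlock moveToHead(MemoBlock head, Object dn) {
        if (head == this) return this;
        int idx = findDatanode(dn);            // memoized: reused by remove and insert
        MemoBlock newHead = listRemove(head, dn, idx);
        return listInsert(newHead, dn, idx);
    }
}
```

The remaining lookups in this sketch are for the neighbouring blocks, whose slot indices
for the same datanode are not known in advance.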


