hadoop-hdfs-issues mailing list archives

From "Matt Foley (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-1295) Improve namenode restart times by short-circuiting the first block reports from datanodes
Date Mon, 18 Apr 2011 22:02:05 GMT

     [ https://issues.apache.org/jira/browse/HDFS-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt Foley updated HDFS-1295:

    Attachment: IBR_shortcut_v6atrunk.patch

This is a final candidate patch.  Please review.  It modifies three pairs of files:

BlockManager and DatanodeDescriptor mods are the core changes implementing Dhruba's idea to
shortcut Initial Block Reports.

Small mods to DataNode and FSNamesystem improve the logging of Block Report processing on both
NN and DN, so the improvement is easier to see.

The TestDatanodeBlockScanner and DFSTestUtil mods address several problems with TestDatanodeBlockScanner:
they (a) fix an outright bug in blockCorruptionRecoveryPolicy() that was causing
testBlockCorruptionRecoveryPolicy2() to fail; (b) make the test run much faster; and (c) make the
test print useful information in the event of a failure.

Note that TestDatanodeBlockScanner.testTruncatedBlockReport() exposed a bug in the v4
implementation of IBR shortcuts.  As a result, I had to put corrupt block processing back
into the IBR shortcut path, rather than ignoring corrupt blocks.  This should not change the
perf improvement much, since the number of corrupt blocks should be very small.
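
For illustration only, the idea behind the shortcut can be sketched roughly as below. This is
a simplified, hypothetical model and not the code in the patch: the class IbrSketch, its
fields, and the use of a length mismatch as the corruption check are all assumptions made
for the example. It shows a first report being applied directly (while still flagging
corrupt replicas), and later reports going through a diff against the stored state.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model of block report handling; NOT the real BlockManager.
class IbrSketch {
    /** A replica as reported by a datanode: block id plus on-disk length. */
    static final class Replica {
        final long blockId;
        final long length;
        Replica(long blockId, long length) { this.blockId = blockId; this.length = length; }
    }

    // Namespace view (assumed): block id -> expected length.
    final Map<Long, Long> expectedLength = new HashMap<>();
    // Per-datanode view (assumed): datanode id -> block ids the NN believes it holds.
    final Map<String, Set<Long>> blocksByNode = new HashMap<>();
    // Replicas flagged corrupt, keyed as "datanode:blockId".
    final Set<String> corrupt = new HashSet<>();

    void processReport(String node, List<Replica> report) {
        Set<Long> known = blocksByNode.computeIfAbsent(node, k -> new HashSet<>());
        if (known.isEmpty()) {
            // Nothing stored for this node yet, so there is nothing to diff against.
            processFirstReport(node, known, report);
        } else {
            processDiffReport(node, known, report);
        }
    }

    // Shortcut path: add every reported replica, but still check for corruption.
    private void processFirstReport(String node, Set<Long> known, List<Replica> report) {
        for (Replica r : report) {
            addReplica(node, known, r);
        }
    }

    // Regular path: add newly reported blocks, drop blocks no longer reported.
    private void processDiffReport(String node, Set<Long> known, List<Replica> report) {
        Set<Long> reported = new HashSet<>();
        for (Replica r : report) {
            reported.add(r.blockId);
            if (!known.contains(r.blockId)) {
                addReplica(node, known, r);
            }
        }
        known.retainAll(reported);
    }

    private void addReplica(String node, Set<Long> known, Replica r) {
        Long expected = expectedLength.get(r.blockId);
        if (expected == null) {
            return; // block unknown to the namespace; ignored in this sketch
        }
        if (expected.longValue() != r.length) {
            corrupt.add(node + ":" + r.blockId); // e.g. a truncated replica
            return;
        }
        known.add(r.blockId);
    }
}
```

In this sketch the first-report path skips the set comparison entirely, and the corruption
check stays in both paths because it lives in the shared addReplica() helper.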

> Improve namenode restart times by short-circuiting the first block reports from datanodes
> -----------------------------------------------------------------------------------------
>                 Key: HDFS-1295
>                 URL: https://issues.apache.org/jira/browse/HDFS-1295
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 0.22.0
>            Reporter: dhruba borthakur
>            Assignee: Matt Foley
>             Fix For: 0.23.0
>         Attachments: IBR_shortcut_v2a.patch, IBR_shortcut_v3atrunk.patch, IBR_shortcut_v4atrunk.patch,
> IBR_shortcut_v4atrunk.patch, IBR_shortcut_v4atrunk.patch, IBR_shortcut_v6atrunk.patch, shortCircuitBlockReport_1.txt
>
> The namenode restart is dominated by the performance of processing block reports. On
> a 2000 node cluster with 90 million blocks, block report processing takes 30 to 40 minutes.
> The namenode "diffs" the contents of the incoming block report with the contents of the blocks
> map, and then applies these diffs to the blocksMap, but in reality there is no need to compute
> the "diff" because this is the first block report from the datanode.
> This code change improves block report processing time by 300%.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
