hadoop-common-dev mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-3369) Fast block processing during name-node startup.
Date Thu, 08 May 2008 23:58:55 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Konstantin Shvachko updated HADOOP-3369:

    Attachment: fastBlockReports.patch

This patch implements the following approach:
While the name-node is in safe-mode, block reports do not modify the
queues of over- and under-replicated blocks.
Instead, replication of all blocks is verified right before exiting safe-mode.
Thus only those blocks that really have missing replicas will appear in neededReplications.

In my tests this approach completes block processing almost 5 times faster than the existing
one, which substantially improves the total name-node startup time.
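
To illustrate the idea, here is a minimal sketch in Java. It is not the code from fastBlockReports.patch: the class SafeModeBlockReportSketch and all of its methods are hypothetical, only the queue name neededReplications comes from this issue, and over-replicated blocks are left out for brevity. It counts how often the queue is modified when updates are applied on every report versus deferred until safe-mode ends.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified model of block report handling; not the real name-node code. */
public class SafeModeBlockReportSketch {
    private final Set<Long> allBlocks;            // blocks known from the namespace
    private final int expectedReplicas;           // target replication (assumed uniform)
    private final boolean deferWhileInSafeMode;   // true = approach described in this message
    private final Map<Long, Integer> replicaCount = new HashMap<>();
    private final Set<Long> neededReplications = new HashSet<>();
    private boolean inSafeMode = true;
    private long queueUpdates = 0;                // number of adds/removes on the queue

    public SafeModeBlockReportSketch(Set<Long> allBlocks, int expectedReplicas,
                                     boolean deferWhileInSafeMode) {
        this.allBlocks = allBlocks;
        this.expectedReplicas = expectedReplicas;
        this.deferWhileInSafeMode = deferWhileInSafeMode;
    }

    /** Processes the block report of one data-node. */
    public void processReport(List<Long> reportedBlocks) {
        for (Long block : reportedBlocks) {
            int count = replicaCount.merge(block, 1, Integer::sum);
            if (inSafeMode && deferWhileInSafeMode) {
                continue; // only record the replica; leave the queue untouched
            }
            updateQueue(block, count);
        }
    }

    /** Verifies replication of all blocks once (if deferred), then leaves safe-mode. */
    public void leaveSafeMode() {
        if (deferWhileInSafeMode) {
            for (Long block : allBlocks) {
                updateQueue(block, replicaCount.getOrDefault(block, 0));
            }
        }
        inSafeMode = false;
    }

    private void updateQueue(Long block, int count) {
        boolean changed = (count < expectedReplicas)
                ? neededReplications.add(block)
                : neededReplications.remove(block);
        if (changed) {
            queueUpdates++;
        }
    }

    public long queueUpdates() { return queueUpdates; }

    public Set<Long> neededReplications() { return neededReplications; }

    /** Runs a tiny made-up scenario: 3 blocks, replication 2, two data-nodes. */
    public static SafeModeBlockReportSketch runDemo(boolean defer) {
        Set<Long> blocks = new HashSet<>(Arrays.asList(1L, 2L, 3L));
        SafeModeBlockReportSketch nn = new SafeModeBlockReportSketch(blocks, 2, defer);
        nn.processReport(Arrays.asList(1L, 2L, 3L)); // first data-node
        nn.processReport(Arrays.asList(1L, 2L));     // second data-node lacks block 3
        nn.leaveSafeMode();
        return nn;
    }

    public static void main(String[] args) {
        for (boolean defer : new boolean[] {false, true}) {
            SafeModeBlockReportSketch nn = runDemo(defer);
            System.out.println("deferred=" + defer
                    + " queueUpdates=" + nn.queueUpdates()
                    + " neededReplications=" + nn.neededReplications());
        }
    }
}
```

In this toy scenario both variants end with the same neededReplications contents, but the deferred variant touches the queue only for the block that is actually missing a replica.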

> Fast block processing during name-node startup.
> -----------------------------------------------
>                 Key: HADOOP-3369
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3369
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.17.0
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>             Fix For: 0.18.0
>         Attachments: fastBlockReports.patch
> The block report processing during the startup period should be optimized.
> As noted in HADOOP-3022, during cluster startup all blocks are under-replicated
> because they have not been reported by data-nodes yet.
> Currently, we routinely move blocks to the neededReplications queue when they
> are first reported and then remove them from the queue when other nodes report them.
> In the ideal situation we end up adding all blocks to the neededReplications queue first,
> only to remove all of them in the end.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
