hadoop-hdfs-issues mailing list archives

From "Chen Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-14576) Avoid block report retry and slow down namenode startup
Date Wed, 17 Jul 2019 08:59:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-14576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16886799#comment-16886799 ]

Chen Zhang commented on HDFS-14576:

{quote}I believe there are still some points need to optimize on some scenarios, especially for NameNode restart stage{quote}
Yep, agree with that. Do you have any insight? I'm also very interested in it.

> Avoid block report retry and slow down namenode startup
> -------------------------------------------------------
>                 Key: HDFS-14576
>                 URL: https://issues.apache.org/jira/browse/HDFS-14576
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>            Reporter: He Xiaoqiao
>            Assignee: He Xiaoqiao
>            Priority: Major
> During NameNode startup, the load is very high because the NameNode has to process every
> DataNode's block report one by one. If block reports from hundreds of DataNodes are pending,
> the problem becomes more serious, even though #processFirstBlockReport is much more efficient
> than the processing of ordinary block reports. Some DataNodes then retry their block reports,
> which lengthens the restart time. I think we should filter out block report requests (DataNode
> block report retries) that have already been processed and return directly, which would
> shorten the restart time. Note that the benefit of this proposal may be noticeable only on
> large clusters.
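
The filtering idea in the description can be sketched roughly as follows. This is only an illustration and not the HDFS-14576 patch: the class BlockReportRetryFilter, its methods, and the string DataNode identifiers are all hypothetical, and it assumes that a retried block report carries the same report ID as the original attempt.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch: remembers the last block report ID processed per
 * DataNode so that a retry of the same report can be acknowledged without
 * being processed again. Names are illustrative, not HDFS APIs.
 */
class BlockReportRetryFilter {
    // DataNode identifier -> ID of the last fully processed block report.
    private final Map<String, Long> lastProcessed = new HashMap<>();

    /** Returns true if this report ID was already processed for this DataNode. */
    synchronized boolean isRetryOfProcessed(String datanodeId, long reportId) {
        Long last = lastProcessed.get(datanodeId);
        return last != null && last == reportId;
    }

    /** Records that the given report was fully processed. */
    synchronized void markProcessed(String datanodeId, long reportId) {
        lastProcessed.put(datanodeId, reportId);
    }

    /**
     * Runs the (expensive) processing step unless the report is a retry of
     * one already handled. Returns true if processed now, false if skipped.
     */
    synchronized boolean handle(String datanodeId, long reportId, Runnable process) {
        if (isRetryOfProcessed(datanodeId, reportId)) {
            return false; // duplicate retry: return directly
        }
        process.run();
        markProcessed(datanodeId, reportId);
        return true;
    }

    public static void main(String[] args) {
        BlockReportRetryFilter filter = new BlockReportRetryFilter();
        int[] processedCount = {0};
        Runnable expensive = () -> processedCount[0]++;

        filter.handle("dn-1", 100L, expensive); // first report: processed
        filter.handle("dn-1", 100L, expensive); // retry of same report: skipped
        filter.handle("dn-2", 100L, expensive); // different DataNode: processed

        System.out.println("reports processed: " + processedCount[0]);
    }
}
```

In this sketch a report is marked as processed only after the processing step returns, so a retry that arrives after a failed attempt is still processed. A real implementation would also have to decide when to forget old report IDs and how to treat reports that are split across several RPCs; neither is covered here.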

This message was sent by Atlassian JIRA

