hadoop-hdfs-issues mailing list archives

From "Sanjay Radia (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7923) The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages
Date Tue, 16 Jun 2015 20:17:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14588681#comment-14588681 ]

Sanjay Radia commented on HDFS-7923:
------------------------------------

bq. This change is really helpful during startup on big clusters. In the past we have seen
restarting all the DNs at once on a several hundred node cluster bring the NN to its knees.


There is already a random backoff for the initial block report (BR), and the initial BR
backoff time is configurable. When that JIRA was done, there was a proposal to give each DN
a different backoff time depending on the number of outstanding BRs; this enhancement was not
done at that time because the random backoff worked very well. For a cluster of several hundred
nodes, the initial BR backoff time should be approximately 60 seconds.
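
As a minimal sketch of the random initial backoff described above (not the actual DataNode code), each DN can pick a uniformly random delay below a configured maximum before sending its first block report. The class and method names here are hypothetical; in HDFS the maximum is, to my understanding, controlled by the dfs.blockreport.initialDelay setting (in seconds).

```java
import java.util.Random;

// Hypothetical illustration only; not taken from the HDFS source.
class InitialBlockReportBackoff {
    /**
     * Returns a delay in milliseconds drawn uniformly from
     * [0, maxDelaySeconds * 1000). A non-positive maximum means no backoff.
     */
    public static long randomInitialDelayMs(long maxDelaySeconds, Random rng) {
        if (maxDelaySeconds <= 0) {
            return 0L; // no backoff configured: report immediately
        }
        long boundMs = maxDelaySeconds * 1000L;
        return (long) (rng.nextDouble() * boundMs);
    }

    public static void main(String[] args) {
        Random rng = new Random();
        // Five simulated DNs restarting together, each with a 60 second maximum.
        for (int dn = 0; dn < 5; dn++) {
            System.out.println("DN " + dn + " waits "
                + randomInitialDelayMs(60, rng) + " ms before its first BR");
        }
    }
}
```

With a 60 second maximum and several hundred DNs, the initial reports arrive spread over roughly a minute instead of all at once.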

> The DataNodes should rate-limit their full block reports by asking the NN on heartbeat messages
> -----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-7923
>                 URL: https://issues.apache.org/jira/browse/HDFS-7923
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: 2.8.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>             Fix For: 2.8.0
>
>         Attachments: HDFS-7923.000.patch, HDFS-7923.001.patch, HDFS-7923.002.patch, HDFS-7923.003.patch, HDFS-7923.004.patch, HDFS-7923.006.patch, HDFS-7923.007.patch
>
>
> The DataNodes should rate-limit their full block reports.  They can do this by first
> sending a heartbeat message to the NN with an optional boolean set which requests permission
> to send a full block report.  If the NN responds with another optional boolean set, the DN
> will send an FBR... if not, it will wait until later.  This can be done compatibly with optional
> fields.
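
The handshake in the issue description could be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the attached patches: the class names, the field name requestFullBlockReport, and the NN policy of capping the number of DNs holding permission at once are all hypothetical. The description only says that the NN answers with an optional boolean.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not taken from the HDFS-7923 patches.
class FullBlockReportThrottle {
    /** Heartbeat request; the optional flag asks for permission to send an FBR. */
    static final class Heartbeat {
        final String datanodeId;
        final boolean requestFullBlockReport;

        Heartbeat(String datanodeId, boolean requestFullBlockReport) {
            this.datanodeId = datanodeId;
            this.requestFullBlockReport = requestFullBlockReport;
        }
    }

    /** NN-side state. Assumed policy: at most maxConcurrent DNs hold permission. */
    static final class NameNodeSide {
        private final int maxConcurrent;
        private final Set<String> granted = new HashSet<>();

        NameNodeSide(int maxConcurrent) {
            this.maxConcurrent = maxConcurrent;
        }

        /** Returns the optional "permission granted" flag for the heartbeat response. */
        synchronized boolean handleHeartbeat(Heartbeat hb) {
            if (!hb.requestFullBlockReport) {
                return false; // DN did not ask, so nothing to grant
            }
            if (granted.contains(hb.datanodeId)) {
                return true; // permission already held by this DN
            }
            if (granted.size() >= maxConcurrent) {
                return false; // NN is busy; DN waits and asks again later
            }
            granted.add(hb.datanodeId);
            return true;
        }

        /** Called once the DN's FBR has been processed, freeing a slot. */
        synchronized void fullBlockReportReceived(String datanodeId) {
            granted.remove(datanodeId);
        }
    }

    public static void main(String[] args) {
        NameNodeSide nn = new NameNodeSide(2);
        for (String dn : new String[] {"dn1", "dn2", "dn3"}) {
            boolean ok = nn.handleHeartbeat(new Heartbeat(dn, true));
            System.out.println(dn + (ok ? " may send its FBR" : " must ask again on a later heartbeat"));
        }
        nn.fullBlockReportReceived("dn1");
        System.out.println("dn3 retry granted: " + nn.handleHeartbeat(new Heartbeat("dn3", true)));
    }
}
```

Because both flags are optional, a peer that does not know about them simply never sets them, which is what makes the change compatible.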



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
