hadoop-hdfs-issues mailing list archives

From "Benoy Antony (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11384) Add option for balancer to disperse getBlocks calls to avoid NameNode's rpc.CallQueueLength spike
Date Tue, 28 Feb 2017 19:55:45 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11384?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15888766#comment-15888766 ]

Benoy Antony commented on HDFS-11384:
-------------------------------------

[~zhaoyunjiong],
If there are blocks to balance, there will already be sufficient delay between successive
getBlocks calls. In such cases, we do not have to sleep.
It would be better to keep track of the interval between successive getBlocks calls and
sleep only for the remaining required time.
Can you also write a unit test to cover this change?
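
To illustrate the suggestion above, here is a minimal sketch of interval tracking: remember when the last getBlocks call was made and sleep only for whatever part of the minimum interval has not yet elapsed. The class GetBlocksThrottle and its methods are hypothetical names made up for this example. They are not taken from the attached patch or from the Balancer code, and the minimum interval is assumed to be supplied by the caller.

```java
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical helper (not part of HDFS-11384.001.patch or the Balancer):
 * spaces out successive calls by at least minIntervalMs, sleeping only for
 * the time that has not already passed since the previous call.
 */
final class GetBlocksThrottle {
  private final long minIntervalMs;
  private long lastCallMs;
  private boolean calledBefore = false;

  GetBlocksThrottle(long minIntervalMs) {
    if (minIntervalMs < 0) {
      throw new IllegalArgumentException("minIntervalMs must be >= 0");
    }
    this.minIntervalMs = minIntervalMs;
  }

  /** Time still to wait at nowMs; 0 if enough time has already elapsed. */
  synchronized long remainingDelayMs(long nowMs) {
    if (!calledBefore) {
      return 0; // the first call never waits
    }
    long elapsed = nowMs - lastCallMs;
    return Math.max(0, minIntervalMs - elapsed);
  }

  /** Records that a call was issued at nowMs. */
  synchronized void markCalled(long nowMs) {
    lastCallMs = nowMs;
    calledBefore = true;
  }

  /**
   * Blocks only for the remaining delay, then records the call time.
   * Holding the lock while sleeping serializes concurrent callers, which
   * is a simplification chosen for this sketch.
   */
  synchronized void awaitTurn() throws InterruptedException {
    long wait = remainingDelayMs(monotonicNowMs());
    if (wait > 0) {
      Thread.sleep(wait);
    }
    markCalled(monotonicNowMs());
  }

  /** Monotonic clock in milliseconds, unaffected by wall-clock changes. */
  private static long monotonicNowMs() {
    return TimeUnit.NANOSECONDS.toMillis(System.nanoTime());
  }
}
```

A caller would invoke awaitTurn() immediately before each getBlocks request. Because remainingDelayMs and markCalled take the time as a parameter, a unit test can check the arithmetic without actually sleeping.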


> Add option for balancer to disperse getBlocks calls to avoid NameNode's rpc.CallQueueLength spike
> -------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-11384
>                 URL: https://issues.apache.org/jira/browse/HDFS-11384
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: balancer & mover
>    Affects Versions: 2.7.3
>            Reporter: yunjiong zhao
>            Assignee: yunjiong zhao
>         Attachments: balancer.day.png, balancer.week.png, HDFS-11384.001.patch
>
>
> Running the balancer on a Hadoop cluster with more than 3000 DataNodes causes the NameNode's
> rpc.CallQueueLength to spike. We observed that this can cause HBase cluster failure due to
> RegionServer WAL timeouts.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

