hadoop-hdfs-issues mailing list archives

From "yunjiong zhao (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-11384) Add option for balancer to disperse getBlocks calls to avoid NameNode's rpc.CallQueueLength spike
Date Wed, 01 Feb 2017 20:38:51 GMT

     [ https://issues.apache.org/jira/browse/HDFS-11384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yunjiong zhao updated HDFS-11384:
---------------------------------
    Attachment: balancer.day.png
                balancer.week.png
                HDFS-11384.001.patch

This patch provides an option to make the balancer block for $dfs.balancer.getBlocks.interval.millis
milliseconds after every getBlocks RPC call.
The attached pictures show the improvement after I applied this patch to our production cluster
around Thursday 15:00.
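
For readers without the patch at hand, here is a minimal sketch of the pacing idea, not the code in HDFS-11384.001.patch. It pauses for a fixed interval after each call that stands in for getBlocks, so that calls are spread out over time. The class PacedCaller, its injected sleeper parameter, and the convention that an interval of 0 disables the pause are all assumptions made for illustration.

```java
import java.util.function.LongConsumer;
import java.util.function.Supplier;

/**
 * Hypothetical sketch: pause for a fixed interval after every call,
 * in the spirit of dfs.balancer.getBlocks.interval.millis.
 * This is not the balancer's actual implementation.
 */
class PacedCaller {
    // Stands in for the configured interval; 0 is assumed to mean "no pause".
    private final long intervalMillis;
    // Injected so the pause can be observed or replaced (assumption, for testability).
    private final LongConsumer sleeper;

    PacedCaller(long intervalMillis, LongConsumer sleeper) {
        if (intervalMillis < 0) {
            throw new IllegalArgumentException("interval must be >= 0");
        }
        this.intervalMillis = intervalMillis;
        this.sleeper = sleeper;
    }

    /** Runs the call, then pauses for the interval, even if the call throws. */
    <T> T call(Supplier<T> rpc) {
        try {
            return rpc.get();
        } finally {
            if (intervalMillis > 0) {
                sleeper.accept(intervalMillis);
            }
        }
    }

    /** A real sleeper based on Thread.sleep; restores the interrupt flag if interrupted. */
    static void realSleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

A caller would construct it once, for example `new PacedCaller(interval, PacedCaller::realSleep)`, and wrap each getBlocks-style request in `call(...)`. A fixed pause trades balancer throughput for a smoother request rate on the NameNode.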

> Add option for balancer to disperse getBlocks calls to avoid NameNode's rpc.CallQueueLength spike
> -------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-11384
>                 URL: https://issues.apache.org/jira/browse/HDFS-11384
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: balancer & mover
>    Affects Versions: 2.7.3
>            Reporter: yunjiong zhao
>            Assignee: yunjiong zhao
>         Attachments: balancer.day.png, balancer.week.png, HDFS-11384.001.patch
>
>
> Running the balancer on a Hadoop cluster with more than 3000 DataNodes causes the
> NameNode's rpc.CallQueueLength to spike. We observed that this can cause HBase cluster
> failures due to RegionServer WAL timeouts.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

