hadoop-hdfs-issues mailing list archives

From "Daryn Sharp (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8818) Allow Balancer to run faster
Date Thu, 04 May 2017 14:26:04 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15996803#comment-15996803 ]

Daryn Sharp commented on HDFS-8818:

Rather than creating fixed thread pools which will be idle as cluster size increases, perhaps
cached thread pools that spawn dynamically would help.
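
As a rough illustration of the cached-pool idea (a sketch only, not the balancer's actual code): the JDK's Executors.newCachedThreadPool() is a ThreadPoolExecutor with zero core threads, an unbounded maximum, and a 60 second idle timeout. The hypothetical helper below, newBoundedCachedPool, keeps that behaviour but caps the thread count, so threads exist only while there is work. The class name, helper name, and the caller-runs overflow policy are assumptions made for the example.

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class DynamicPoolSketch {
    /**
     * Hypothetical helper: a cached-style pool with an upper bound.
     * Threads are created on demand and reclaimed after 60s idle,
     * so an idle balancer holds no threads at all.
     */
    static ThreadPoolExecutor newBoundedCachedPool(int maxThreads) {
        return new ThreadPoolExecutor(
            0,                          // no core threads: nothing stays idle forever
            maxThreads,                 // cap, unlike Executors.newCachedThreadPool()
            60L, TimeUnit.SECONDS,      // idle threads exit after 60 seconds
            new SynchronousQueue<>(),   // direct hand-off, no task queue
            // Assumption: when all threads are busy, run the task in the
            // submitting thread, which naturally throttles submission.
            new ThreadPoolExecutor.CallerRunsPolicy());
    }
}
```

Whether caller-runs is the right overflow behaviour depends on who submits the work; rejecting or blocking would be equally plausible choices.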

The previous balancer was easy to configure.  I don't fully understand the previous design
but a simpler approach that achieves the same improvement would be returning to a single fixed
thread pool with intelligent queuing of work, i.e. interleaving work for all targets, with
a max queued limit, so that replications are distributed evenly across nodes.  I'm assuming it
didn't do that.
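
To make the interleaving idea concrete, here is a minimal sketch, assuming pending moves are kept in one queue per target node. It drains those queues round-robin up to a max queued limit; the resulting batch could then be handed to a single Executors.newFixedThreadPool(n). The class name, the interleave method, and the string-keyed map are hypothetical and are not taken from the balancer source.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;

public class InterleavedQueueSketch {
    /**
     * Take work round-robin from each target's queue until maxQueued
     * items are selected or every queue is empty. No single target can
     * fill the batch while other targets still have pending work.
     */
    static <T> List<T> interleave(Map<String, ? extends Queue<T>> perTarget,
                                  int maxQueued) {
        List<T> batch = new ArrayList<>();
        boolean progressed = true;
        while (progressed && batch.size() < maxQueued) {
            progressed = false;
            for (Queue<T> q : perTarget.values()) {
                if (batch.size() >= maxQueued) {
                    break;              // max queued limit reached
                }
                T item = q.poll();      // null when this target has no work left
                if (item != null) {
                    batch.add(item);
                    progressed = true;
                }
            }
        }
        return batch;
    }
}
```

With queues dn1=[a1, a2, a3], dn2=[b1], dn3=[c1, c2] in insertion order and a limit of 5, the batch is a1, b1, c1, a2, c2, and a3 stays queued for the next round.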

Do you have HDFS-8824 in your runs? I suspect the first run has it but the second one does

bq. over time older nodes will end up with only small blocks, if it is set permanently? It
will look good for quick balancing, but may not be good in long term

Exactly.  We had to disable the feature because nodes become concentrated with small blocks.
getBlocks becomes increasingly expensive as it searches for a dwindling number of large blocks
on unbalanced nodes.  The client load increases on those nodes due to block volume.  Eventually
the balancer just plays a shell game moving the larger blocks.

The current balancer probably works great when adding nodes, but not as a continuous service.
If not reverted, something has to be done to restore the previous steady-state performance.

> Allow Balancer to run faster
> ----------------------------
>                 Key: HDFS-8818
>                 URL: https://issues.apache.org/jira/browse/HDFS-8818
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: balancer & mover
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Tsz Wo Nicholas Sze
>             Fix For: 2.8.0, 2.7.4, 3.0.0-alpha1
>         Attachments: bal1.png, bal2.png, h8818_20150723.patch, h8818_20150727.patch,
> The original design of Balancer intentionally makes it run slowly so that the balancing
> activities won't affect normal cluster activities and running jobs.
> There is a new use case in which a cluster admin may choose to balance the cluster when the
> cluster load is low, or in a maintenance window.  We should therefore have an option to allow
> Balancer to run faster.

