hadoop-hdfs-issues mailing list archives

From "Kihwal Lee (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8824) Do not use small blocks for balancing the cluster
Date Fri, 18 Nov 2016 19:31:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677540#comment-15677540
] 

Kihwal Lee commented on HDFS-8824:
----------------------------------

While this will initially increase the efficiency of balancing, it has a negative
side effect.

Older nodes in a cluster will slowly fill with smaller blocks as time goes on. This is accelerated
if the cluster is heterogeneous.  The smaller nodes will fill up more quickly and more frequently, and
the balancer will move only big blocks out of those nodes.  As more balancing happens, those
nodes will contain more and more small blocks. If sufficient time passes, the blocks on those
nodes will be almost entirely small.
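
To illustrate the drift described above, here is a minimal toy simulation. It is only a
sketch under stated assumptions, not the balancer's actual logic: the capacity, drain
target, size threshold, and the alternating 1 MB / 128 MB write pattern are all
hypothetical values chosen for illustration.

```python
# Toy model: a small datanode repeatedly fills up and is then drained by a
# balancer that never moves blocks below MIN_BLOCK_SIZE.
# All constants and function names are hypothetical, for illustration only.
MIN_BLOCK_SIZE = 10   # MB; blocks smaller than this are never moved
CAPACITY = 1000       # MB; capacity of the small node
TARGET = 600          # MB; the balancer drains the node down to this usage


def write_until_full(blocks):
    """Append an alternating stream of 1 MB and 128 MB blocks until one no longer fits."""
    i = 0
    while True:
        size = 1 if i % 2 == 0 else 128
        if sum(blocks) + size > CAPACITY:
            break
        blocks.append(size)
        i += 1


def balance(blocks):
    """Move out large blocks only, biggest first, until usage is at or below TARGET."""
    blocks.sort(reverse=True)
    while sum(blocks) > TARGET and blocks and blocks[0] >= MIN_BLOCK_SIZE:
        blocks.pop(0)


def small_fraction(blocks):
    """Fraction of blocks (by count) that are below MIN_BLOCK_SIZE."""
    return sum(1 for b in blocks if b < MIN_BLOCK_SIZE) / len(blocks)


def simulate(rounds):
    """Run fill/balance cycles and record the small-block fraction after each one."""
    blocks, history = [], []
    for _ in range(rounds):
        write_until_full(blocks)
        balance(blocks)
        history.append(small_fraction(blocks))
    return history
```

In this toy model, `simulate(5)` gives a small-block fraction that rises every cycle,
from about 0.67 after the first balancing pass to about 0.86 after the fifth, because
each pass removes only large blocks while the small ones stay behind.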

This feature can be enabled for quickly resolving a storage balance issue, but long-term use
can have unintended side effects.  Fortunately, we have not shipped any release (other than alpha)
with this feature. We can include more information in the release note and/or address the
issue in the code/config.

> Do not use small blocks for balancing the cluster
> -------------------------------------------------
>
>                 Key: HDFS-8824
>                 URL: https://issues.apache.org/jira/browse/HDFS-8824
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: balancer & mover
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Tsz Wo Nicholas Sze
>             Fix For: 2.8.0, 2.7.4, 3.0.0-alpha1
>
>         Attachments: h8824_20150727b.patch, h8824_20150811b.patch
>
>
> Balancer gets datanode block lists from NN and then moves the blocks in order to balance
the cluster.  It should not use the blocks with small size since moving the small blocks generates
a lot of overhead and the small blocks do not help balance the cluster much.
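
The filtering idea in the description can be sketched as follows. This is a hedged
illustration only: the function name, the 10 MB threshold, and the sample block sizes
are assumptions, not the patch's actual code or defaults.

```python
MB = 1024 * 1024


def select_blocks_for_balancing(block_sizes, min_block_size):
    """Return only the blocks whose size is at least min_block_size bytes.

    Hypothetical helper: stands in for the balancer skipping small blocks.
    """
    return [size for size in block_sizes if size >= min_block_size]


# A node holding three 128 MB blocks and one hundred 1 MB blocks.
sizes = [128 * MB] * 3 + [1 * MB] * 100
candidates = select_blocks_for_balancing(sizes, min_block_size=10 * MB)
```

With these sample sizes, the three remaining candidates account for 384 MB in three
moves, while the hundred skipped blocks would need a hundred moves to shift only 100 MB.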



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

