hadoop-hdfs-issues mailing list archives

From "Liu Shaohui (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8131) Implement a space balanced block placement policy
Date Tue, 14 Apr 2015 02:38:12 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493468#comment-14493468 ]

Liu Shaohui commented on HDFS-8131:

 In that sense, we don't want all the new blocks to be copied to the newly added nodes since
those can quickly become the bottleneck.
The newly added nodes are chosen with *a slightly higher probability* than the old ones. Please
look at the patch.
Suppose a fraction p of the nodes in the cluster are newly added. Under the default block placement
policy, a newly added node is chosen with probability p. Under the balanced block placement policy
in this patch, the probability is q = p * p + 2 * (1-p) * p * 0.6 (the 0.6 is configurable).
If p = 0.5, q = 0.55. If p = 0.01, q = 0.01198. The increase in probability is not
excessive.
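
To check the arithmetic above, here is a minimal Python sketch. It assumes one reading of the
formula: two candidate nodes are drawn independently, a new node is taken if both are new, and
if exactly one is new it is preferred with probability 0.6. That reading is an assumption made
for illustration, not a description of the patch; the function names and the simulation are
hypothetical and do not come from HDFS-8131-v1.diff.

```python
import random


def new_node_probability(p, prefer=0.6):
    """q = p*p + 2*(1-p)*p*prefer, the formula quoted in this comment."""
    return p * p + 2 * (1 - p) * p * prefer


def simulate(p, prefer=0.6, trials=200_000, seed=0):
    """Monte Carlo estimate of q under the assumed two-candidate reading."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a_new = rng.random() < p  # first candidate is a new node
        b_new = rng.random() < p  # second candidate is a new node
        if a_new and b_new:
            hits += 1  # both candidates are new, so a new node is chosen
        elif a_new != b_new and rng.random() < prefer:
            hits += 1  # one new, one old: the new one wins with probability `prefer`
    return hits / trials


for p in (0.5, 0.01):
    q = new_node_probability(p)
    # Print p, the closed-form q, the relative increase over p, and the estimate.
    print(p, round(q, 5), f"{(q / p - 1):.1%}", round(simulate(p), 5))
```

With prefer = 0.5 the formula reduces to q = p, i.e. the default policy's probability, so the
configurable value controls how far the choice is tilted toward the new nodes.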

I think HDFS-8041 hits a good balance.
I think there are differences between the two patches.
HDFS-8041 prevents datanodes from being chosen if their space-used percentage is larger than a
threshold when the cluster is reasonably full.
By that time, I think it may be too late, and the cluster may lose half of its write capacity
if the old nodes cannot be chosen.

The aim of this patch is to keep space usage balanced over the long term. The newly added nodes
are chosen with *a slightly higher probability* than the old ones. It's a very smooth method.

> Implement a space balanced block placement policy
> -------------------------------------------------
>                 Key: HDFS-8131
>                 URL: https://issues.apache.org/jira/browse/HDFS-8131
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 3.0.0
>            Reporter: Liu Shaohui
>            Assignee: Liu Shaohui
>            Priority: Minor
>             Fix For: 3.0.0
>         Attachments: HDFS-8131-v1.diff
> The default block placement policy chooses datanodes for new blocks randomly, which
results in an unbalanced space-used percentage among datanodes after a cluster expansion. The
old datanodes always have a high space-used percentage and the newly added ones a low percentage.
> Though we can use the external balancer tool to balance the space-used rate, it
costs extra network IO and it's not easy to control the balancing speed.
> An easy solution is to implement a balanced block placement policy that chooses datanodes
with a low used percentage for new blocks with a slightly higher probability. Before long,
the used percentage of the datanodes will tend toward balance.
> Suggestions and discussions are welcome. Thanks

This message was sent by Atlassian JIRA