hadoop-hdfs-issues mailing list archives

From "Ruslan Dautkhanov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8131) Implement a space balanced block placement policy
Date Mon, 24 Apr 2017 19:24:04 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15981732#comment-15981732 ]

Ruslan Dautkhanov commented on HDFS-8131:

Thanks for this great improvement! 
When using AvailableSpaceBlockPlacementPolicy, does the default logic below no longer apply?
1. Place the first replica somewhere – either a random rack and node (if the HDFS client
is outside the hadoop cluster) or on the local node (if the HDFS client is running on a node
inside the cluster).
2. The second replica is written to a different rack from the first, chosen at random.
3. The third replica is written to the same rack as the second replica, but on a different
node.
4. If there are more replicas – spread them across the rest of the racks.
What is the logic now with respect to rack awareness?
Is placement decided purely by available space, so that the rack-awareness logic no longer kicks in?
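
To make the four default steps listed above concrete, here is a minimal sketch in Python. It is not Hadoop code: the function name place_replicas, the dict-of-racks topology shape, and the fallbacks for clusters with too few racks are all assumptions made for illustration.

```python
import random


def place_replicas(topology, replication, client_node=None, rng=random):
    """Pick (rack, node) targets following the four steps listed above.

    topology: dict mapping a rack name to a list of node names
    (a hypothetical shape chosen for this sketch).
    """
    rack_of = {node: rack for rack, nodes in topology.items() for node in nodes}
    targets = []

    def pick(racks):
        # Random unused node from the given racks, or None if none is left.
        used = {node for _, node in targets}
        candidates = [(r, n) for r in racks for n in topology[r] if n not in used]
        return rng.choice(candidates) if candidates else None

    def add(target):
        if target is not None and len(targets) < replication:
            targets.append(target)

    # Step 1: the local node if the client runs inside the cluster, else any node.
    if client_node in rack_of:
        add((rack_of[client_node], client_node))
    else:
        add(pick(list(topology)))
    if not targets:
        return targets

    # Step 2: a random node on a different rack from the first replica.
    first_rack = targets[0][0]
    remote_racks = [r for r in topology if r != first_rack]
    add(pick(remote_racks) or pick([first_rack]))  # assumed fallback: single rack

    # Step 3: same rack as the second replica, but a different node.
    if len(targets) >= 2:
        add(pick([targets[1][0]]) or pick(list(topology)))

    # Step 4: remaining replicas go to racks not used yet, while any remain.
    while len(targets) < replication:
        used_racks = {r for r, _ in targets}
        nxt = pick([r for r in topology if r not in used_racks]) or pick(list(topology))
        if nxt is None:
            break
        targets.append(nxt)
    return targets
```

For example, with three racks of two nodes each and a client on node "a", the first target is always "a" itself, the second lands on another rack, and the third shares that rack with the second.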

> Implement a space balanced block placement policy
> -------------------------------------------------
>                 Key: HDFS-8131
>                 URL: https://issues.apache.org/jira/browse/HDFS-8131
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Liu Shaohui
>            Assignee: Liu Shaohui
>            Priority: Minor
>              Labels: BlockPlacementPolicy
>             Fix For: 2.8.0, 3.0.0-alpha1
>         Attachments: balanced.png, HDFS-8131.004.patch, HDFS-8131.005.patch, HDFS-8131.006.patch,
HDFS-8131-v1.diff, HDFS-8131-v2.diff, HDFS-8131-v3.diff
> The default block placement policy chooses datanodes for new blocks randomly, which
results in an unbalanced used-space percentage among datanodes after a cluster expansion. The
old datanodes always have a high used-space percentage and the newly added ones a low one.
> Though we can use the external balancer tool to even out the used-space rate, it
costs extra network IO and it is not easy to control the balancing speed.
> An easy solution is to implement a balanced block placement policy which chooses
datanodes with a low used-space percentage for new blocks with a slightly higher probability. Before long,
the used-space percentage of the datanodes will tend to become balanced.
> Suggestions and discussion are welcome. Thanks
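
One way to read "choose low used percent datanodes with a slightly higher probability" is sketched below in Python. This is an illustration under assumptions, not the code from the attached patches: the function name choose_by_available_space, the two-candidate comparison, and the preference value of 0.6 are all hypothetical choices. The sketch covers only the pick of a single node among candidates and says nothing about how rack rules are applied around it.

```python
import random


def choose_by_available_space(nodes, used_percent, preference=0.6, rng=random):
    """Pick one node, favouring the less-used of two random candidates.

    preference is the probability of taking the less-used candidate
    (an assumed parameter): 0.5 behaves like a uniform random choice,
    1.0 always takes the emptier of the two.
    """
    a = rng.choice(nodes)
    b = rng.choice(nodes)
    if used_percent[a] == used_percent[b]:
        return a
    low, high = (a, b) if used_percent[a] < used_percent[b] else (b, a)
    return low if rng.random() < preference else high
```

Because the bias is only slight, fuller nodes still receive some new blocks, so the usage gap closes gradually as new data arrives and no existing blocks have to be moved over the network.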
