hadoop-hdfs-issues mailing list archives

From "Arpit Agarwal (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11295) Check storage remaining instead of node remaining in BlockPlacementPolicyDefault.chooseReplicaToDelete()
Date Tue, 21 Feb 2017 23:34:44 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15877015#comment-15877015 ]

Arpit Agarwal commented on HDFS-11295:
--------------------------------------

Thanks for the visualization [~elek]. This makes sense and the change looks good. Let's rename
{{DatanodeStorageInfo#setRemaining}} to {{setRemainingForTests}} and make it package-private.

A minor concern is that the NameNode's mapping of (storage -> block replicas) can be temporarily
incorrect. In a healthy cluster this difference is quickly reconciled via {{BlockUnderConstructionFeature#addReplicaIfNotPresent}}.

> Check storage remaining instead of node remaining in BlockPlacementPolicyDefault.chooseReplicaToDelete()
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-11295
>                 URL: https://issues.apache.org/jira/browse/HDFS-11295
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.7.1
>            Reporter: Xiao Liang
>            Assignee: Elek, Marton
>         Attachments: HDFS-11295.001.patch, HDFS-11295.002.patch, HDFS-11295.003.patch, HDFS-11295.jpg
>
>
> Currently, BlockPlacementPolicyDefault.chooseReplicaToDelete() chooses the replica to delete
> by picking the node with the least free space (node.getRemaining()), if all heartbeats are
> within the tolerable heartbeat interval.
> However, a node may have multiple storages, and node.getRemaining() is the sum of their
> remaining space. If the storage holding the block to be deleted is low on free space, the
> node's free space can still be high because of its other storages, so the storage finally
> chosen may not be the one with the least free space.
> So using storage.getRemaining() to pick the storage with the least free space when choosing
> the replica to delete may be a better way to balance storage usage.
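
To illustrate the difference described in the issue, here is a minimal, self-contained sketch.
It is not the HDFS code or the attached patch: the Node and Storage classes and both chooseBy...
helpers are hypothetical stand-ins, and the heartbeat check is left out. It only shows how
comparing node-level remaining space and storage-level remaining space can select different
replicas.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical model for illustration only; these are not the HDFS classes.
final class ReplicaToDeleteSketch {

    // A datanode that owns one or more storages (for example, disks).
    static final class Node {
        final String name;
        final List<Storage> storages = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        Storage addStorage(String id, long remaining) {
            Storage s = new Storage(this, id, remaining);
            storages.add(s);
            return s;
        }

        // Node-level remaining space: the sum over all storages of the node.
        long getRemaining() {
            long sum = 0;
            for (Storage s : storages) {
                sum += s.getRemaining();
            }
            return sum;
        }
    }

    // A single storage; each candidate replica lives on exactly one storage.
    static final class Storage {
        final Node node;
        final String id;
        final long remaining;

        Storage(Node node, String id, long remaining) {
            this.node = node;
            this.id = id;
            this.remaining = remaining;
        }

        long getRemaining() {
            return remaining;
        }
    }

    // Behaviour described as current: pick the replica whose *node* has the least free space.
    static Storage chooseByNodeRemaining(List<Storage> candidates) {
        return candidates.stream()
                .min(Comparator.comparingLong((Storage s) -> s.node.getRemaining()))
                .orElse(null);
    }

    // Behaviour proposed in the issue: pick the replica whose *storage* has the least free space.
    static Storage chooseByStorageRemaining(List<Storage> candidates) {
        return candidates.stream()
                .min(Comparator.comparingLong(Storage::getRemaining))
                .orElse(null);
    }
}
```

As a usage example, suppose node A has two storages with 10 and 900 units remaining, the
replica sits on the nearly full one, and node B has a single storage with 100 units remaining
that also holds a replica. The node-level comparison sees A = 910 and B = 100 and selects the
replica on B, while the storage-level comparison selects the replica on A's nearly full storage.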



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

