hadoop-hdfs-issues mailing list archives

From "Boris Shkolnik (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-343) Better Target selection for block replication
Date Wed, 26 Aug 2009 00:57:59 GMT

    [ https://issues.apache.org/jira/browse/HDFS-343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747725#action_12747725 ]

Boris Shkolnik commented on HDFS-343:
-------------------------------------

I agree with Enis that with a random policy in highly heterogeneous clusters we may end up
with severely underutilized nodes.
(One extreme use case is adding a new or restored empty node to a running cluster.)
I don't know about a pluggable policy, but some modifications/improvements to the existing one
can help.
Using probability (based on usage) instead of direct placement should address issue #1.
We also need to make sure that this overrides only the "random" part of the placement policy.
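
To illustrate what I mean, here is a minimal sketch of a usage-weighted pick, using
Enis' suggested weight of 1 - (percent_usage / 100). The class and method names are
hypothetical stand-ins, not the actual ReplicationTargetChooser internals:

import java.util.Arrays;
import java.util.List;
import java.util.Random;

public class UsageWeightedChooser {

  private final Random rand = new Random();

  // Hypothetical stand-in for a datanode descriptor; the real code would
  // read capacity/used from DatanodeDescriptor.
  public static class Node {
    final String name;
    final double percentUsed; // 0..100
    Node(String name, double percentUsed) {
      this.name = name;
      this.percentUsed = percentUsed;
    }
  }

  // Weight per the suggested formula: 1 - (percent_usage / 100),
  // i.e. the node's fraction of free space.
  private double weight(Node n) {
    return 1.0 - (n.percentUsed / 100.0);
  }

  // Roulette-wheel selection: a node's chance of being picked is
  // proportional to its weight. This replaces only the uniform random
  // pick, leaving the rest of the placement policy alone.
  public Node chooseTarget(List<Node> candidates) {
    double total = 0.0;
    for (Node n : candidates) {
      total += weight(n);
    }
    if (total <= 0.0) {
      // Every candidate is full: fall back to uniform random.
      return candidates.get(rand.nextInt(candidates.size()));
    }
    double r = rand.nextDouble() * total;
    for (Node n : candidates) {
      r -= weight(n);
      if (r <= 0.0) {
        return n;
      }
    }
    return candidates.get(candidates.size() - 1); // floating-point guard
  }

  public static void main(String[] args) {
    UsageWeightedChooser chooser = new UsageWeightedChooser();
    List<Node> nodes = Arrays.asList(
        new Node("dn1", 90.0), // nearly full: weight 0.1
        new Node("dn2", 40.0), // weight 0.6
        new Node("dn3", 0.0)); // new empty node: weight 1.0
    int[] hits = new int[nodes.size()];
    for (int i = 0; i < 100000; i++) {
      hits[nodes.indexOf(chooser.chooseTarget(nodes))]++;
    }
    // Hit counts should come out roughly in proportion 0.1 : 0.6 : 1.0.
    for (int i = 0; i < nodes.size(); i++) {
      System.out.println(nodes.get(i).name + ": " + hits[i]);
    }
  }
}

A weighted pick like this could be dropped in wherever the current chooser does its
uniform random selection, so rack awareness and the local-writer rule stay untouched.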

> Better Target selection for block replication
> ---------------------------------------------
>
>                 Key: HDFS-343
>                 URL: https://issues.apache.org/jira/browse/HDFS-343
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Enis Soztutar
>
> The block replication policy tends to balance the number of blocks on each datanode in
> the long run. However, in heterogeneous clusters with varying numbers of disks per node,
> nodes with one disk fill quickly while nodes with 3 disks still have 60% free disk space.
> This also reduces the advantage of using more than one disk for parallel IO, since machines
> with multiple disks are not used as much.
> The javadoc of ReplicationTargetChooser reads:
> The replica placement strategy is that if the writer is on a datanode, the 1st replica
> is placed on the local machine, otherwise a random datanode. The 2nd replica is placed on
> a datanode that is on a different rack. The 3rd replica is placed on a datanode which is on
> the same rack as the first replica.
> I think we should switch to a policy that balances the percent of disk usage rather than
> balancing total block count among the datanodes. This can be done by defining the probability
> of selecting a datanode based on its disk percent usage. A formula like 1 - (percent_usage
> / 100) seems reasonable.
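
(To make the quoted formula concrete: a node at 90% usage gets weight 1 - 90/100 = 0.1,
while a freshly added empty node gets weight 1 - 0/100 = 1.0, so under selection
proportional to weight the empty node is about ten times as likely to receive the next
block.)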

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

