hadoop-hdfs-issues mailing list archives

From "Tao Jie (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-13279) Datanodes usage is imbalanced if number of nodes per rack is not equal
Date Wed, 14 Mar 2018 08:21:00 GMT

     [ https://issues.apache.org/jira/browse/HDFS-13279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tao Jie updated HDFS-13279:
---------------------------
    Affects Version/s: 2.8.3
                       3.0.0

> Datanodes usage is imbalanced if number of nodes per rack is not equal
> ----------------------------------------------------------------------
>
>                 Key: HDFS-13279
>                 URL: https://issues.apache.org/jira/browse/HDFS-13279
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.8.3, 3.0.0
>            Reporter: Tao Jie
>            Priority: Major
>
> In a Hadoop cluster, the number of nodes per rack can differ. For example, with 50 datanodes
in total and 15 datanodes per rack, only 5 nodes remain on the last rack. In this situation,
we find that storage usage on the last 5 nodes is much higher than on the other nodes.
> With the default block placement policy, the first replica of each block has the same
probability of being written to any datanode, but the probability that the 2nd/3rd replica
is written to one of the last 5 nodes is much higher than for the other nodes.
> Consider writing 100 blocks to these 50 datanodes. The first replicas of the 100 blocks are
distributed equally across the 50 nodes. The figures below are normalized to one first replica
per node (that is, per 50 blocks). The 2nd replicas of the blocks whose 1st replica is on rack1
(15 replicas) are sent equally to the other 35 nodes, so each of those nodes receives
15/35 = 0.43 replicas. The same holds for blocks on rack2 and rack3. As a result, each node on
rack4 (5 nodes) receives 1.29 2nd replicas in total, while each node on the other racks
receives 0.97.
> ||-||Rack1(15 nodes)||Rack2(15 nodes)||Rack3(15 nodes)||Rack4(5 nodes)||
> |From rack1|-|15/35=0.43|0.43|0.43|
> |From rack2|0.43|-|0.43|0.43|
> |From rack3|0.43|0.43|-|0.43|
> |From rack4|5/45=0.11|0.11|0.11|-|
> |Total|0.97|0.97|0.97|1.29|
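
The totals in the table can be reproduced with a short calculation. The following is a minimal sketch, not HDFS code: the function name `expected_second_replicas` is hypothetical, and it assumes one first replica per datanode and that each 2nd replica is placed uniformly at random on a node outside the first replica's rack, as described in the issue.

```python
from fractions import Fraction


def expected_second_replicas(rack_sizes):
    """Expected number of 2nd replicas received per node, for each rack.

    Assumptions (taken from the issue description, not from HDFS source):
    every node holds exactly one first replica, and the 2nd replica of a
    block goes to a uniformly chosen node outside the first replica's rack.
    """
    total_nodes = sum(rack_sizes)
    shares = []
    for dst in range(len(rack_sizes)):
        share = Fraction(0)
        for src, src_size in enumerate(rack_sizes):
            if src == dst:
                continue  # a 2nd replica never lands on the source rack
            # src_size replicas are spread over all nodes outside rack src
            share += Fraction(src_size, total_nodes - src_size)
        shares.append(share)
    return shares


racks = [15, 15, 15, 5]
for size, share in zip(racks, expected_second_replicas(racks)):
    print(f"rack with {size:2d} nodes: {float(share):.2f} 2nd replicas per node")
```

For the 15/15/15/5 layout this gives 61/63 (about 0.97) per node on the large racks and 9/7 (about 1.29) per node on the small rack, matching the Total row above.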



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

