hadoop-hdfs-issues mailing list archives

From "Devaraj K (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (HDFS-2821) Improve the Balancer to move data from over utilized nodes to under utilized nodes using balanced nodes
Date Wed, 11 Feb 2015 09:40:12 GMT

     [ https://issues.apache.org/jira/browse/HDFS-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Devaraj K reassigned HDFS-2821:

    Assignee:     (was: Devaraj K)

> Improve the Balancer to move data from over utilized nodes to under utilized nodes using balanced nodes
> -------------------------------------------------------------------------------------------------------
>                 Key: HDFS-2821
>                 URL: https://issues.apache.org/jira/browse/HDFS-2821
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 0.24.0, 0.23.1
>            Reporter: Devaraj K
> h5.Cluster State Before Balancer Run:
> ||Node||Last Contact||Admin State||Configured Capacity(TB)||Used(TB)||Non DFS Used(TB)||Remaining(TB)||Used(%)||Remaining(%)||Blocks||
> |xxx-x-xx-n1|0|In Service|4.25|1.76|0.84|1.65|41.34|38.86|8465|
> |xxx-x-xx-n2|1|In Service|6.03|1.76|0.94|3.33|29.1|55.24|8465|
> |xxx-x-xx-n3|2|In Service|6.93|1.76|0.99|4.18|25.35|60.31|8465|
> |xxx-x-xx-n4|2|In Service|10.5|0|0.54|9.97|0|94.9|0|
> \\
> \\
> h5.Cluster State After Balancer Run:
> ||Node||Last Contact||Admin State||Configured Capacity(TB)||Used(TB)||Non DFS Used(TB)||Remaining(TB)||Used(%)||Remaining(%)||Blocks||
> |xxx-x-xx-n1|2|In Service|4.25|0.95|0.84|2.46|22.36|57.84|4830|
> |xxx-x-xx-n2|1|In Service|6.03|1.2|0.94|3.88|19.95|64.4|5858|
> |xxx-x-xx-n3|0|In Service|6.93|1.38|0.99|4.56|19.9|65.76|6327|
> |xxx-x-xx-n4|2|In Service|10.5|1.74|0.54|8.23|16.53|78.37|8383|
> \\
> Currently the balancer moves data from over utilized nodes to under utilized nodes,
and this process continues until the cluster is balanced or there is no data left to move from
source to destination. During this process, any node whose usage reaches avgUtilization no
longer participates in balancing.
> The tables above show the cluster usage before and after a balancer run with a threshold
of 1. After the balancer completes, n1 is still over utilized and n4 is still under utilized.
This may be because n4 already contains all the blocks that are present on n1, so no further
blocks can be moved directly between them. I feel this can be improved by moving data from
over utilized nodes to balanced nodes, and then from balanced nodes to under utilized nodes.
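
To make the post-run state concrete, here is a minimal sketch that recomputes the cluster-average utilization from the "After Balancer Run" table and labels each node against a threshold of 1. It assumes a simplified rule: the average is total used space over total configured capacity, a node is over utilized when its usage exceeds the average by more than the threshold, and under utilized when it falls below the average by more than the threshold. The names `AFTER_RUN` and `classify` are illustrative only and are not Balancer APIs.

```python
# Illustrative sketch only: names here are hypothetical, not HDFS Balancer APIs.
# Figures come from the "After Balancer Run" table:
# node -> (configured capacity in TB, DFS used in TB)
AFTER_RUN = {
    "n1": (4.25, 0.95),
    "n2": (6.03, 1.20),
    "n3": (6.93, 1.38),
    "n4": (10.5, 1.74),
}


def classify(nodes, threshold):
    """Label each node relative to the cluster-average utilization (in %)."""
    total_capacity = sum(cap for cap, _ in nodes.values())
    total_used = sum(used for _, used in nodes.values())
    # Assumed definition of the average: total used / total capacity.
    avg = 100.0 * total_used / total_capacity

    labels = {}
    for name, (cap, used) in nodes.items():
        util = 100.0 * used / cap
        if util > avg + threshold:
            labels[name] = "over"
        elif util < avg - threshold:
            labels[name] = "under"
        else:
            labels[name] = "balanced"
    return avg, labels


avg, labels = classify(AFTER_RUN, threshold=1.0)
print(f"average utilization: {avg:.2f}%")
for name, label in labels.items():
    print(name, label)
```

Under these assumptions n1 is labelled over utilized, n2 and n3 balanced, and n4 under utilized, which matches the state described above. The proposed improvement would use n2 and n3 as intermediaries: blocks would move from n1 to n2 or n3, and other blocks from n2 or n3 to n4.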

This message was sent by Atlassian JIRA