hadoop-common-dev mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-1652) Rebalance data blocks when new data nodes added or data nodes become full
Date Wed, 05 Dec 2007 00:14:43 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-1652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12548475 ]

Hadoop QA commented on HADOOP-1652:

-1 overall.  Here are the results of testing the latest attachment 
against trunk revision r601111.

    @author +1.  The patch does not contain any @author tags.

    javadoc +1.  The javadoc tool did not generate any warning messages.

    javac +1.  The applied patch does not generate any new compiler warnings.

    findbugs -1.  The patch appears to introduce 2 new Findbugs warnings.

    core tests +1.  The patch passed core unit tests.

    contrib tests +1.  The patch passed contrib unit tests.

Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1262/testReport/
Findbugs warnings: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1262/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1262/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/1262/console

This message is automatically generated.

> Rebalance data blocks when new data nodes added or data nodes become full
> -------------------------------------------------------------------------
>                 Key: HADOOP-1652
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1652
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: dfs
>    Affects Versions: 0.13.0
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>             Fix For: 0.16.0
>         Attachments: balancer.patch, balancer1.patch, balancer2.patch, balancer3.patch,
>                      balancer4.patch, balancer5.patch, balancer6.patch, BalancerAdminGuide.pdf,
>                      BalancerAdminGuide1.pdf, BalancerUserGuide2.pdf, RebalanceDesign4.pdf,
>                      RebalanceDesign5.pdf, RebalanceDesign6.pdf
>
> When a new data node joins an HDFS cluster, it does not hold much data. So any map task
> assigned to the machine most likely does not read local data, thus increasing the use of network
> bandwidth. On the other hand, when some data nodes become full, new data blocks are placed
> on only non-full data nodes, thus reducing their read parallelism.
> This jira aims to find an approach to redistribute data blocks when imbalance occurs
> in the cluster. A solution should meet the following requirements:
> 1. It maintains data availability guarantees in the sense that rebalancing does not
> reduce the number of replicas that a block has or the number of racks that the block resides on.
> 2. An administrator should be able to invoke and interrupt rebalancing from a command line.
> 3. Rebalancing should be throttled so that rebalancing does not cause a namenode to be
> too busy to serve any incoming request or saturate the network.
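
To make the imbalance and requirement 1 above concrete, here is a minimal Java sketch. It is not taken from the attached patches or design documents: the class name, the method names, the percentage threshold, and the use of the cluster-wide average utilization as the reference point are all assumptions made for illustration. It labels a node over- or under-utilized when its utilization differs from the cluster average by more than a threshold, and it checks that moving one replica of a block does not reduce the number of distinct racks the block resides on (a move keeps the replica count unchanged by construction).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not the balancer from the HADOOP-1652 patches.
final class RebalanceSketch {

    // Cluster-wide utilization in percent: total used space / total capacity.
    // Each map value is {usedBytes, capacityBytes}.
    static double clusterUtilization(Map<String, double[]> nodes) {
        double used = 0, capacity = 0;
        for (double[] n : nodes.values()) {
            used += n[0];
            capacity += n[1];
        }
        return 100.0 * used / capacity;
    }

    // Nodes whose utilization exceeds the cluster average by more than thresholdPct.
    static List<String> overUtilized(Map<String, double[]> nodes, double thresholdPct) {
        double avg = clusterUtilization(nodes);
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, double[]> e : nodes.entrySet()) {
            double util = 100.0 * e.getValue()[0] / e.getValue()[1];
            if (util > avg + thresholdPct) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    // Nodes whose utilization is below the cluster average by more than thresholdPct.
    static List<String> underUtilized(Map<String, double[]> nodes, double thresholdPct) {
        double avg = clusterUtilization(nodes);
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, double[]> e : nodes.entrySet()) {
            double util = 100.0 * e.getValue()[0] / e.getValue()[1];
            if (util < avg - thresholdPct) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    // Requirement 1 (rack part): moving the replica at sourceIndex to targetRack
    // must not reduce the number of distinct racks that hold the block.
    static boolean preservesRackSpread(List<String> replicaRacks, int sourceIndex, String targetRack) {
        Set<String> before = new HashSet<>(replicaRacks);
        List<String> moved = new ArrayList<>(replicaRacks);
        moved.set(sourceIndex, targetRack);
        Set<String> after = new HashSet<>(moved);
        return after.size() >= before.size();
    }

    public static void main(String[] args) {
        Map<String, double[]> nodes = new LinkedHashMap<>();
        nodes.put("A", new double[] {90, 100}); // nearly full node
        nodes.put("B", new double[] {50, 100}); // close to the average
        nodes.put("C", new double[] {10, 100}); // newly added, mostly empty node
        System.out.println("over:  " + overUtilized(nodes, 10.0));
        System.out.println("under: " + underUtilized(nodes, 10.0));
    }
}
```

In this sketch, blocks would be moved from nodes in the first list to nodes in the second, and only when `preservesRackSpread` returns true. Throttling (requirement 3) and the command-line control (requirement 2) are left out.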

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
