hadoop-hdfs-issues mailing list archives

From "Yu Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6010) Make balancer able to balance data among specified servers
Date Mon, 24 Mar 2014 14:17:43 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945142#comment-13945142 ]

Yu Li commented on HDFS-6010:

The UT failure is caused by a bug in TestBalancer. Here is the detailed analysis:

Let's look at the logic of testUnevenDistribution. If the mini-cluster has 3 (or more) datanodes,
the replication factor is set to 2 (or more), and generateBlocks generates a file with that
replication factor, so the number of blocks equals (targetSize/replicationFactor)/blockSize.
distributeBlock then places replicationFactor replicas of each block (doubling the block count
when the factor is 2) through the code below:
    for(int i=0; i<blocks.length; i++) {
      for(int j=0; j<replicationFactor; j++) {
        boolean notChosen = true;
        while(notChosen) {
          int chosenIndex = r.nextInt(usedSpace.length);
          if( usedSpace[chosenIndex]>0 ) {
            notChosen = false;
            usedSpace[chosenIndex] -= blocks[i].getNumBytes();
          }
        }
      }
    }
Notice that this distribution does not prevent replicas of the same block from landing on the
same datanode. Then, when MiniDFSCluster#injectBlocks (actually SimulatedFSDataset#injectBlocks)
is invoked, the duplicated blocks are removed by the code segment below:
  public synchronized void injectBlocks(String bpid,
      Iterable<Block> injectBlocks) throws IOException {
    ExtendedBlock blk = new ExtendedBlock();
    if (injectBlocks != null) {
      for (Block b: injectBlocks) { // if any blocks in list is bad, reject list
        if (b == null) {
          throw new NullPointerException("Null blocks in block list");
        }
        blk.set(bpid, b);
        if (isValidBlock(blk)) {
          throw new IOException("Block already exists in  block list");
        }
      }
      Map<Block, BInfo> map = blockMap.get(bpid);
      if (map == null) {
        map = new HashMap<Block, BInfo>();
        blockMap.put(bpid, map);
      }
      for (Block b: injectBlocks) {
        BInfo binfo = new BInfo(bpid, b, false);
        map.put(binfo.theBlock, binfo);
      }
    }
  }
As a result, the used space is less than expected, which causes the test failure. The
issue was hidden because *in the existing tests the number of datanodes was never set higher
than 2*. It is easy to reproduce the issue simply by increasing the number of datanodes in
TestBalancer#testBalancer1Internal from 2 to 3, like this:
  void testBalancer1Internal(Configuration conf) throws Exception {
    testUnevenDistribution(conf,
        new long[] {90*CAPACITY/100, 50*CAPACITY/100, 10*CAPACITY/100},
        new long[] {CAPACITY, CAPACITY, CAPACITY},
        new String[] {RACK0, RACK1, RACK2});
  }
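
The replica loss described above can be illustrated without a mini-cluster. The following is a minimal sketch, not code from TestBalancer: the class and method names are hypothetical, blocks are represented by plain integer ids, and a block size of 1 byte with a 90/50/10 distribution is assumed. It repeats the random selection loop quoted earlier and then counts how many replicas would collapse when each datanode's blocks are stored in a map keyed by block.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical stand-alone illustration; none of these names exist in Hadoop.
class DuplicateReplicaDemo {

  // Same selection logic as the quoted loop: any node with remaining
  // space may be chosen, with no check for an existing replica.
  static List<List<Integer>> distribute(int numBlocks, int replicationFactor,
      long[] distribution, long blockSize, Random r) {
    long[] usedSpace = distribution.clone();
    List<List<Integer>> reports = new ArrayList<>();
    for (int n = 0; n < usedSpace.length; n++) {
      reports.add(new ArrayList<>());
    }
    for (int i = 0; i < numBlocks; i++) {
      for (int j = 0; j < replicationFactor; j++) {
        boolean notChosen = true;
        while (notChosen) {
          int chosenIndex = r.nextInt(usedSpace.length);
          if (usedSpace[chosenIndex] > 0) {
            notChosen = false;
            reports.get(chosenIndex).add(i);
            usedSpace[chosenIndex] -= blockSize;
          }
        }
      }
    }
    return reports;
  }

  // Counts replicas that vanish when a node keeps one entry per block id.
  static int lostReplicas(List<List<Integer>> reports) {
    int lost = 0;
    for (List<Integer> report : reports) {
      Set<Integer> unique = new HashSet<>(report);
      lost += report.size() - unique.size();
    }
    return lost;
  }

  public static void main(String[] args) {
    long[] distribution = {90, 50, 10};   // assumed bytes per node
    int replicationFactor = 2;
    long blockSize = 1;
    // (targetSize / replicationFactor) / blockSize = (150 / 2) / 1 = 75
    int numBlocks = (int) ((90 + 50 + 10) / replicationFactor / blockSize);
    List<List<Integer>> reports = distribute(
        numBlocks, replicationFactor, distribution, blockSize, new Random());
    System.out.println("replicas lost to de-duplication: "
        + lostReplicas(reports));
  }
}
```

With these assumed numbers the first node must receive 90 replicas while only 75 distinct blocks exist, so at least 15 replicas are lost on every run, whatever the random seed.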

I've tried to refine the distribution method, but I found it hard to make it general.
To make sure no duplicated blocks are assigned to the same datanode, we must make sure the
largest distribution is less than the sum of the other distributions.
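
The condition on the largest distribution can be written as a small check. This is only a sketch of the constraint as stated in this comment, read here as applying to a replication factor of 2 (an assumption); the class and method names are hypothetical and the values are percentages of CAPACITY.

```java
// Hypothetical helper; not part of TestBalancer.
class DistributionCheck {

  // True when the largest entry is strictly less than the sum of the
  // other entries, which is the condition stated in the comment.
  static boolean largestBelowSumOfOthers(long[] distribution) {
    long sum = 0;
    long largest = 0;
    for (long d : distribution) {
      sum += d;
      largest = Math.max(largest, d);
    }
    return largest < sum - largest;
  }

  public static void main(String[] args) {
    // The 90/50/10 distribution from the reproduction: 90 > 50 + 10.
    System.out.println(largestBelowSumOfOthers(new long[] {90, 50, 10}));
    // A flatter, made-up distribution: 50 < 40 + 30.
    System.out.println(largestBelowSumOfOthers(new long[] {50, 40, 30}));
  }
}
```

For the 90/50/10 distribution used in the reproduction the check fails, which is consistent with duplicates being unavoidable there.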

On second thought, I don't even think it is necessary to involve the replication factor in
the balancer tests. Maybe the UT designer intended to test balancer behavior while replication
is also ongoing, but unfortunately the current design cannot exercise this.
So personally, I propose to always set the replication factor to 1 in TestBalancer.

> Make balancer able to balance data among specified servers
> ----------------------------------------------------------
>                 Key: HDFS-6010
>                 URL: https://issues.apache.org/jira/browse/HDFS-6010
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: balancer
>    Affects Versions: 2.3.0
>            Reporter: Yu Li
>            Assignee: Yu Li
>            Priority: Minor
>              Labels: balancer
>         Attachments: HDFS-6010-trunk.patch
> Currently, the balancer tool balances data among all datanodes. However, in some particular
> case, we would need to balance data only among specified nodes instead of the whole set.
> In this JIRA, a new "-servers" option would be introduced to implement this.

This message was sent by Atlassian JIRA
