hadoop-hdfs-issues mailing list archives

From "Wei-Chiu Chuang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-10598) DiskBalancer does not execute multi-steps plan.
Date Mon, 25 Jul 2016 22:28:20 GMT

    [ https://issues.apache.org/jira/browse/HDFS-10598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15392760#comment-15392760 ]

Wei-Chiu Chuang commented on HDFS-10598:

Hi [~eddyxu], thanks for identifying the bug and submitting the patch. I think the fix is straightforward
and the unit test makes sense to me. I wonder whether we need more unit tests to cover more scenarios,
because in addition to normal operation, the patch fixes the termination condition in
these corner cases:
* {code}// Check for the max error count constraint.{code}
* {code}// we are not able to find any blocks to copy.{code}
* {code}// check if someone told us exit{code}
* {code}// Technically it is possible for us to find a smaller block and{code}

[~arpitagarwal], what's your take?

> DiskBalancer does not execute multi-steps plan.
> -----------------------------------------------
>                 Key: HDFS-10598
>                 URL: https://issues.apache.org/jira/browse/HDFS-10598
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: diskbalancer
>    Affects Versions: 3.0.0-beta1
>            Reporter: Lei (Eddy) Xu
>            Assignee: Lei (Eddy) Xu
>            Priority: Critical
>         Attachments: HDFS-10598.00.patch
> I set up a 3-DN cluster, each DN with 2 small disks.  After creating some files to fill HDFS, I added two more small disks to one DN and ran the diskbalancer on this DataNode.
> The disk usage before running diskbalancer:
> {code}
> /dev/loop0  3.9G  2.1G  1.6G 58%  /mnt/data1
> /dev/loop1  3.9G  2.6G  1.1G 71%  /mnt/data2
> /dev/loop2  3.9G  17M  3.6G 1%  /mnt/data3
> /dev/loop3  3.9G  17M  3.6G 1%  /mnt/data4
> {code}
> However, after running diskbalancer (i.e., {{-query}} shows {{PLAN_DONE}})
> {code}
> /dev/loop0  3.9G  1.2G  2.5G 32%  /mnt/data1
> /dev/loop1  3.9G  2.6G  1.1G 71%  /mnt/data2
> /dev/loop2  3.9G  953M  2.7G 26%  /mnt/data3
> /dev/loop3  3.9G  17M  3.6G 1%   /mnt/data4
> {code}
> It is suspicious that in {{DiskBalancerMover#copyBlocks}}, every return calls {{this.setExitFlag}}, which prevents {{copyBlocks()}} from being called multiple times from {{DiskBalancer#executePlan}}.
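
The suspected control flow can be illustrated with a minimal, self-contained sketch. This is not the actual DiskBalancer code: the class {{ExitFlagSketch}}, its fields, and the {{run}} helper are hypothetical names made up for illustration, and the model assumes only what the report states, namely that the plan executor calls {{copyBlocks()}} once per step and that an exit flag set during one step causes later steps to be skipped.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical, simplified model of the control flow described in the report.
// None of these names are the real DiskBalancer API.
class ExitFlagSketch {
    // Models the "should we keep running" state that an exit flag would clear.
    private final AtomicBoolean shouldRun = new AtomicBoolean(true);
    // true  = suspected behaviour: every return path sets the exit flag
    // false = exit flag is left alone when a step completes normally
    private final boolean setExitFlagOnEveryReturn;
    private int stepsExecuted = 0;

    ExitFlagSketch(boolean setExitFlagOnEveryReturn) {
        this.setExitFlagOnEveryReturn = setExitFlagOnEveryReturn;
    }

    // Models copying the blocks for one step of a plan.
    void copyBlocks(int stepIndex) {
        if (!shouldRun.get()) {
            return; // an earlier step already set the exit flag, so skip this step
        }
        stepsExecuted++; // stand-in for the real block-moving work
        if (setExitFlagOnEveryReturn) {
            // Suspected bug: the step finished normally, yet the exit flag is set.
            shouldRun.set(false);
        }
    }

    // Models a plan executor that calls copyBlocks() once per step.
    int executePlan(int stepCount) {
        for (int i = 0; i < stepCount; i++) {
            copyBlocks(i);
        }
        return stepsExecuted;
    }

    // Convenience helper: number of steps that actually ran for a plan.
    static int run(boolean setExitFlagOnEveryReturn, int stepCount) {
        return new ExitFlagSketch(setExitFlagOnEveryReturn).executePlan(stepCount);
    }

    public static void main(String[] args) {
        System.out.println("flag set on every return: " + run(true, 2) + " step(s) ran");
        System.out.println("flag left alone on success: " + run(false, 2) + " step(s) ran");
    }
}
```

Under these assumptions, a two-step plan executes only its first step when the flag is set on every return. That would be consistent with the disk usage above, where data appears to have moved from /mnt/data1 to /mnt/data3 while /mnt/data2 and /mnt/data4 are unchanged.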

