hadoop-hdfs-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4261) TestBalancerWithNodeGroup times out
Date Fri, 07 Dec 2012 21:47:23 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13526806#comment-13526806 ]

Chris Nauroth commented on HDFS-4261:

I reviewed the Windows failure more closely and found this:

java.io.IOException: THIS IS NOT SUPPOSED TO HAPPEN: replica.getBytesOnDisk() !=
 block.getNumBytes(), block=BP-TEST:blk_1000_2000, replica=ReplicaUnderRecovery,
 blk_1000_2000, RUR

That came from this check in {{FsDatasetImpl#updateReplicaUnderRecovery}}:

    //check replica's byte on disk
    if (replica.getBytesOnDisk() != oldBlock.getNumBytes()) {
      throw new IOException("THIS IS NOT SUPPOSED TO HAPPEN:"
          + " replica.getBytesOnDisk() != block.getNumBytes(), block="
          + oldBlock + ", replica=" + replica);
    }

This exception causes the current balancer iteration to move 0 bytes.  Once the maximum
number of iterations is exceeded, the new logic returns {{NO_MOVE_PROGRESS}}.
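
To make the failure mode concrete, here is a minimal sketch of how an iteration loop could end up reporting {{NO_MOVE_PROGRESS}}. It is not the actual Balancer code: the class {{BalancerLoopSketch}}, the {{run}} method, the {{SUCCESS}} status, the threshold value, and the choice to count consecutive zero-byte iterations are all assumptions made for illustration. Only the {{NO_MOVE_PROGRESS}} name comes from the patch under discussion.

```java
import java.util.function.LongSupplier;

/** Hypothetical sketch of a no-progress guard; not the actual Balancer code. */
final class BalancerLoopSketch {

    /** NO_MOVE_PROGRESS is the status named above; SUCCESS is assumed for this sketch. */
    enum Status { SUCCESS, NO_MOVE_PROGRESS }

    /**
     * Runs iterations until nothing is left to move, or until more than
     * maxNoProgressIterations consecutive iterations have moved 0 bytes.
     * Counting consecutive iterations is an assumption of this sketch.
     */
    static Status run(long bytesLeftToMove,
                      LongSupplier moveOneIteration,
                      int maxNoProgressIterations) {
        int noProgress = 0;
        while (bytesLeftToMove > 0) {
            long moved = moveOneIteration.getAsLong();
            if (moved <= 0) {
                // Nothing moved, e.g. because every block transfer failed.
                noProgress++;
                if (noProgress > maxNoProgressIterations) {
                    return Status.NO_MOVE_PROGRESS;
                }
            } else {
                // Progress was made, so reset the counter.
                noProgress = 0;
                bytesLeftToMove -= moved;
            }
        }
        return Status.SUCCESS;
    }

    public static void main(String[] args) {
        // Every iteration moves 0 bytes, as when the replica check keeps throwing.
        System.out.println(run(100L, () -> 0L, 5));  // prints NO_MOVE_PROGRESS
        // Each iteration moves 50 bytes, so the loop finishes normally.
        System.out.println(run(100L, () -> 50L, 5)); // prints SUCCESS
    }
}
```

In this sketch the first call models the Windows failure: the exception prevents any bytes from moving, so the guard trips instead of letting the test run until it times out.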

This looks to be an unrelated Windows-specific issue, so I have filed a separate jira to track
it: HDFS-4289.

> TestBalancerWithNodeGroup times out
> -----------------------------------
>                 Key: HDFS-4261
>                 URL: https://issues.apache.org/jira/browse/HDFS-4261
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer
>    Affects Versions: 1.0.4, 1.1.1, 2.0.2-alpha
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Junping Du
>         Attachments: HDFS-4261.patch, HDFS-4261-v2.patch, HDFS-4261-v3.patch, HDFS-4261-v4.patch
> When I manually ran TestBalancerWithNodeGroup, it always timed out on my machine.  Looking
at the Jenkins report [build #3573|https://builds.apache.org/job/PreCommit-HDFS-Build/3573//testReport/org.apache.hadoop.hdfs.server.balancer/],
TestBalancerWithNodeGroup was somehow skipped, so the problem was not detected.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
