hadoop-hdfs-issues mailing list archives

From "Walter Su (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7267) TestBalancer#testUnknownDatanode occasionally fails in trunk
Date Mon, 16 Mar 2015 08:11:38 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14362919#comment-14362919 ]

Walter Su commented on HDFS-7267:

It is a bug in SimulatedFSDataset. We should fix it.
No, it's not.
That's because balancer tries to move block from DN0 to DN1 but there is no such block in
Balancer will trigger _BlockManager.addToInvalidates(..)_, but will not trigger _BlockManager.removeStoredBlock(..)_.
The block has been moved from DN0 to DN1, but the NameNode still thinks DN0 has the block. The NameNode
will not remove the block from DN0's storageInfo until the next blockReport from DN0.
{color:red} In this test case, adding conf.setLong(DFSConfigKeys.DFS_BLOCKREPORT_INTERVAL_MSEC_KEY,
1000L); will solve the problem.{color}
The *root cause* is that Balancer calls NamenodeProtocol.getBlocks() on every iteration, but
NamenodeProtocol.getBlocks() returns all blocks, including invalidated blocks that are still waiting
to be removed from DatanodeStorageInfo.
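
To make the sequence above concrete, here is a minimal toy model. It is not Hadoop code: the class StaleBlockViewSketch and all of its methods are hypothetical stand-ins for the NameNode's bookkeeping, written only to show why a moved block keeps showing up for DN0 until DN0 sends a block report.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model (hypothetical, not Hadoop code) of per-datanode block bookkeeping. */
public class StaleBlockViewSketch {
    // Blocks the NameNode believes each datanode stores (stand-in for DatanodeStorageInfo).
    private final Map<String, Set<Long>> storedBlocks = new HashMap<>();
    // Replicas queued for deletion (stand-in for the effect of addToInvalidates).
    private final Map<String, Set<Long>> invalidates = new HashMap<>();

    public void addBlock(String dn, long blockId) {
        storedBlocks.computeIfAbsent(dn, k -> new HashSet<>()).add(blockId);
    }

    // A balancer move: the target gains the block, the source replica is only
    // queued for invalidation. The source's stored-block record is NOT touched.
    public void moveBlock(String src, String dst, long blockId) {
        addBlock(dst, blockId);
        invalidates.computeIfAbsent(src, k -> new HashSet<>()).add(blockId);
    }

    // Stand-in for getBlocks(): returns every block recorded for the datanode,
    // including replicas that are already queued for invalidation.
    public Set<Long> getBlocks(String dn) {
        Set<Long> recorded = storedBlocks.get(dn);
        return recorded == null ? new HashSet<Long>() : new HashSet<Long>(recorded);
    }

    public boolean isPendingInvalidation(String dn, long blockId) {
        Set<Long> pending = invalidates.get(dn);
        return pending != null && pending.contains(blockId);
    }

    // Stand-in for a full block report: the recorded set is replaced by what
    // the datanode says it actually holds.
    public void blockReport(String dn, Set<Long> actualBlocks) {
        storedBlocks.put(dn, new HashSet<Long>(actualBlocks));
    }

    public static void main(String[] args) {
        StaleBlockViewSketch nn = new StaleBlockViewSketch();
        nn.addBlock("DN0", 1L);
        nn.moveBlock("DN0", "DN1", 1L);
        // Before the report, DN0 is still listed as holding block 1.
        System.out.println("before report: " + nn.getBlocks("DN0").contains(1L));
        // DN0 reports that it holds nothing; the stale entry disappears.
        nn.blockReport("DN0", new HashSet<Long>());
        System.out.println("after report: " + nn.getBlocks("DN0").contains(1L));
    }
}
```

In this model a shorter block report interval only narrows the window in which getBlocks() can return the stale replica. It does not change what getBlocks() returns for replicas that are already queued for invalidation.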

> TestBalancer#testUnknownDatanode occasionally fails in trunk
> ------------------------------------------------------------
>                 Key: HDFS-7267
>                 URL: https://issues.apache.org/jira/browse/HDFS-7267
>             Project: Hadoop HDFS
>          Issue Type: Test
>            Reporter: Ted Yu
>            Assignee: Walter Su
>            Priority: Minor
>         Attachments: testUnknownDatanode-failed-log.html
> In build #1907 (https://builds.apache.org/job/Hadoop-Hdfs-trunk/1907/):
> {code}
> REGRESSION:  org.apache.hadoop.hdfs.server.balancer.TestBalancer.testUnknownDatanode
> Error Message:
> expected:<0> but was:<-3>
> Stack Trace:
> java.lang.AssertionError: expected:<0> but was:<-3>
>         at org.junit.Assert.fail(Assert.java:88)
>         at org.junit.Assert.failNotEquals(Assert.java:743)
>         at org.junit.Assert.assertEquals(Assert.java:118)
>         at org.junit.Assert.assertEquals(Assert.java:555)
>         at org.junit.Assert.assertEquals(Assert.java:542)
>         at org.apache.hadoop.hdfs.server.balancer.TestBalancer.testUnknownDatanode(TestBalancer.java:737)
> {code}

This message was sent by Atlassian JIRA
