Date: Mon, 16 Mar 2015 08:11:38 +0000 (UTC)
From: "Walter Su (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-7267) TestBalancer#testUnknownDatanode occasionally fails in trunk

[ https://issues.apache.org/jira/browse/HDFS-7267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14362919#comment-14362919 ]

Walter Su commented on HDFS-7267:
---------------------------------

{quote}
It is a bug in SimulatedFSDataset. We should fix it.
{quote}
No, it's not.

{quote}
That's because balancer tries to move block from DN0 to DN1 but there is no such block in DN0.
{quote}
Balancer will trigger _BlockManager.addToInvalidates(..)_, but will not trigger _BlockManager.removeStoredBlock(..)_. The block has been moved from DN0 to DN1, but NameNode still thinks DN0 has the block.
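
The stale view described above can be sketched with a toy model. This is only an illustration under stated assumptions: ToyNameNode and its methods are hypothetical stand-ins, not Hadoop classes, and they model only the behavior this comment attributes to addToInvalidates(..), getBlocks() and block reports.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model only; none of these types are real Hadoop classes. */
public class StaleBlockViewSketch {

    /** Hypothetical stand-in for the NameNode's per-datanode bookkeeping. */
    static class ToyNameNode {
        // Blocks the NameNode currently believes each datanode stores.
        private final Map<String, Set<Long>> storedBlocks = new HashMap<>();
        // Replicas scheduled for deletion but not yet confirmed gone.
        private final Map<String, Set<Long>> pendingInvalidates = new HashMap<>();

        void addStoredBlock(String dn, long blockId) {
            storedBlocks.computeIfAbsent(dn, k -> new TreeSet<>()).add(blockId);
        }

        // Assumed behavior: queue the replica for deletion,
        // but leave the stored-block view untouched.
        void addToInvalidates(String dn, long blockId) {
            pendingInvalidates.computeIfAbsent(dn, k -> new TreeSet<>()).add(blockId);
        }

        // Assumed behavior: return everything in the stored-block view,
        // including replicas that are already queued for invalidation.
        List<Long> getBlocks(String dn) {
            return new ArrayList<>(storedBlocks.getOrDefault(dn, Collections.emptySet()));
        }

        // Assumed behavior: a block report replaces the NameNode's view
        // with what the datanode actually holds.
        void processBlockReport(String dn, Set<Long> actualBlocks) {
            storedBlocks.put(dn, new TreeSet<>(actualBlocks));
            pendingInvalidates.remove(dn);
        }
    }

    public static void main(String[] args) {
        ToyNameNode nn = new ToyNameNode();
        nn.addStoredBlock("DN0", 1L);

        // The balancer moves block 1 from DN0 to DN1.
        nn.addStoredBlock("DN1", 1L);
        nn.addToInvalidates("DN0", 1L);

        // Stale view: DN0 is still listed as holding block 1.
        System.out.println("DN0 before block report: " + nn.getBlocks("DN0"));

        // DN0 reports that it holds no blocks; the view is corrected.
        nn.processBlockReport("DN0", new TreeSet<>());
        System.out.println("DN0 after block report: " + nn.getBlocks("DN0"));
    }
}
```

In this model, a balancer iteration that runs between the move and the block report would still see block 1 on DN0 and could try to move it again.
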
NameNode will not remove the block from DN0's storageInfo until the next blockReport from DN0.
{color:red}In this test case, adding conf.setLong(DFSConfigKeys.DFS_BLOCKREPORT_INTERVAL_MSEC_KEY, 1000L); will solve the problem.{color}

The *root cause* is that Balancer calls NamenodeProtocol.getBlocks() on every iteration, but NamenodeProtocol.getBlocks() returns all blocks, including invalidated blocks that are still waiting to be removed from DatanodeStorageInfo.

> TestBalancer#testUnknownDatanode occasionally fails in trunk
> ------------------------------------------------------------
>
>                 Key: HDFS-7267
>                 URL: https://issues.apache.org/jira/browse/HDFS-7267
>             Project: Hadoop HDFS
>          Issue Type: Test
>            Reporter: Ted Yu
>            Assignee: Walter Su
>            Priority: Minor
>         Attachments: testUnknownDatanode-failed-log.html
>
>
> In build #1907 (https://builds.apache.org/job/Hadoop-Hdfs-trunk/1907/):
> {code}
> REGRESSION: org.apache.hadoop.hdfs.server.balancer.TestBalancer.testUnknownDatanode
> Error Message:
> expected:<0> but was:<-3>
> Stack Trace:
> java.lang.AssertionError: expected:<0> but was:<-3>
>         at org.junit.Assert.fail(Assert.java:88)
>         at org.junit.Assert.failNotEquals(Assert.java:743)
>         at org.junit.Assert.assertEquals(Assert.java:118)
>         at org.junit.Assert.assertEquals(Assert.java:555)
>         at org.junit.Assert.assertEquals(Assert.java:542)
>         at org.apache.hadoop.hdfs.server.balancer.TestBalancer.testUnknownDatanode(TestBalancer.java:737)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)