Date: Mon, 23 Jun 2014 10:54:24 +0000 (UTC)
From: "Binglin Chang (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Updated] (HDFS-6506) Newly moved block replica been invalidated and deleted

     [ https://issues.apache.org/jira/browse/HDFS-6506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Binglin Chang updated HDFS-6506:
--------------------------------
    Attachment: HDFS-6506.v2.patch

Updated the patch to also include a fix for the bug in HDFS-6586, where TestBalancer is affected by the balancer.id file.

> Newly moved block replica been invalidated and deleted
> ------------------------------------------------------
>
>                 Key: HDFS-6506
>                 URL: https://issues.apache.org/jira/browse/HDFS-6506
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Binglin Chang
>            Assignee: Binglin Chang
>         Attachments: HDFS-6506.v1.patch, HDFS-6506.v2.patch
>
>
> TestBalancerWithNodeGroup#testBalancerWithNodeGroup has been failing recently:
> https://builds.apache.org/job/PreCommit-HDFS-Build/7045//testReport/
> From the error log, the cause seems to be that newly moved block replicas are invalidated and deleted, so some of the balancer's work is reversed.
> {noformat}
> 2014-06-06 18:15:51,681 INFO balancer.Balancer (Balancer.java:dispatch(370)) - Successfully moved blk_1073741834_1010 with size=100 from 127.0.0.1:49159 to 127.0.0.1:55468 through 127.0.0.1:49159
> 2014-06-06 18:15:51,683 INFO balancer.Balancer (Balancer.java:dispatch(370)) - Successfully moved blk_1073741833_1009 with size=100 from 127.0.0.1:49159 to 127.0.0.1:55468 through 127.0.0.1:49159
> 2014-06-06 18:15:51,683 INFO balancer.Balancer (Balancer.java:dispatch(370)) - Successfully moved blk_1073741830_1006 with size=100 from 127.0.0.1:49159 to 127.0.0.1:55468 through 127.0.0.1:49159
> 2014-06-06 18:15:51,683 INFO balancer.Balancer (Balancer.java:dispatch(370)) - Successfully moved blk_1073741831_1007 with size=100 from 127.0.0.1:49159 to 127.0.0.1:55468 through 127.0.0.1:49159
> 2014-06-06 18:15:51,682 INFO balancer.Balancer (Balancer.java:dispatch(370)) - Successfully moved blk_1073741832_1008 with size=100 from 127.0.0.1:49159 to 127.0.0.1:55468 through 127.0.0.1:49159
> 2014-06-06 18:15:54,702 INFO balancer.Balancer (Balancer.java:dispatch(370)) - Successfully moved blk_1073741827_1003 with size=100 from 127.0.0.1:49159 to 127.0.0.1:55468 through 127.0.0.1:49159
> 2014-06-06 18:15:54,702 INFO balancer.Balancer (Balancer.java:dispatch(370)) - Successfully moved blk_1073741828_1004 with size=100 from 127.0.0.1:49159 to 127.0.0.1:55468 through 127.0.0.1:49159
> 2014-06-06 18:15:54,701 INFO balancer.Balancer (Balancer.java:dispatch(370)) - Successfully moved blk_1073741829_1005 with size=100 fr
> 2014-06-06 18:15:54,706 INFO BlockStateChange (BlockManager.java:chooseExcessReplicates(2711)) - BLOCK* chooseExcessReplicates: (127.0.0.1:55468, blk_1073741833_1009) is added to invalidated blocks set
> 2014-06-06 18:15:54,709 INFO BlockStateChange (BlockManager.java:chooseExcessReplicates(2711)) - BLOCK* chooseExcessReplicates: (127.0.0.1:55468, blk_1073741834_1010) is added to invalidated blocks set
> 2014-06-06 18:15:56,421 INFO BlockStateChange (BlockManager.java:invalidateWorkForOneNode(3242)) - BLOCK* BlockManager: ask 127.0.0.1:55468 to delete [blk_1073741833_1009, blk_1073741834_1010]
> 2014-06-06 18:15:57,717 INFO BlockStateChange (BlockManager.java:chooseExcessReplicates(2711)) - BLOCK* chooseExcessReplicates: (127.0.0.1:55468, blk_1073741832_1008) is added to invalidated blocks set
> 2014-06-06 18:15:57,720 INFO BlockStateChange (BlockManager.java:chooseExcessReplicates(2711)) - BLOCK* chooseExcessReplicates: (127.0.0.1:55468, blk_1073741827_1003) is added to invalidated blocks set
> 2014-06-06 18:15:57,721 INFO BlockStateChange (BlockManager.java:chooseExcessReplicates(2711)) - BLOCK* chooseExcessReplicates: (127.0.0.1:55468, blk_1073741830_1006) is added to invalidated blocks set
> 2014-06-06 18:15:57,722 INFO BlockStateChange (BlockManager.java:chooseExcessReplicates(2711)) - BLOCK* chooseExcessReplicates: (127.0.0.1:55468, blk_1073741831_1007) is added to invalidated blocks set
> 2014-06-06 18:15:57,723 INFO BlockStateChange (BlockManager.java:chooseExcessReplicates(2711)) - BLOCK* chooseExcessReplicates: (127.0.0.1:55468, blk_1073741829_1005) is added to invalidated blocks set
> 2014-06-06 18:15:59,422 INFO BlockStateChange (BlockManager.java:invalidateWorkForOneNode(3242)) - BLOCK* BlockManager: ask 127.0.0.1:55468 to delete [blk_1073741827_1003, blk_1073741829_1005, blk_1073741830_1006, blk_1073741831_1007, blk_1073741832_1008]
> 2014-06-06 18:16:02,423 INFO BlockStateChange (BlockManager.java:invalidateWorkForOneNode(3242)) - BLOCK* BlockManager: ask 127.0.0.1:55468 to delete [blk_1073741845_1021]
> {noformat}
> Normally this should not happen: when a block is moved from src to dest, the replica on src should be invalidated, not the one on dest, so there is likely a bug in the related logic.
> I don't think TestBalancerWithNodeGroup#testBalancerWithNodeGroup caused this.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
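
The pattern described in the issue (a replica reported as successfully moved, then added to the invalidated set on its destination node) can be checked mechanically against a log excerpt. The following is only a rough sketch and not part of the attached patch: the regular expressions are assumptions derived from the log lines quoted in the description, and `find_reversed_moves` is a hypothetical helper name.

```python
import re

# Both patterns are assumptions based on the log excerpt quoted in the issue;
# other Hadoop versions may format these messages differently.
MOVED = re.compile(
    r"Successfully moved (blk_\d+_\d+) with size=\d+ from (\S+) to (\S+) through"
)
INVALIDATED = re.compile(
    r"chooseExcessReplicates: \((\S+), (blk_\d+_\d+)\) is added to invalidated blocks set"
)


def find_reversed_moves(log_lines):
    """Return (block, node) pairs where a block was moved to a node and the
    replica on that same node was later marked for invalidation.

    Lines are processed in the order given, not sorted by timestamp.
    """
    moved_to = {}  # block id -> destination of its most recent successful move
    reversed_moves = []
    for line in log_lines:
        m = MOVED.search(line)
        if m:
            block, _src, dest = m.groups()
            moved_to[block] = dest
            continue
        m = INVALIDATED.search(line)
        if m:
            node, block = m.groups()
            if moved_to.get(block) == node:
                reversed_moves.append((block, node))
    return reversed_moves


# Two lines copied from the excerpt in the issue description.
sample = [
    "2014-06-06 18:15:51,683 INFO balancer.Balancer (Balancer.java:dispatch(370)) - "
    "Successfully moved blk_1073741833_1009 with size=100 from 127.0.0.1:49159 "
    "to 127.0.0.1:55468 through 127.0.0.1:49159",
    "2014-06-06 18:15:54,706 INFO BlockStateChange "
    "(BlockManager.java:chooseExcessReplicates(2711)) - BLOCK* chooseExcessReplicates: "
    "(127.0.0.1:55468, blk_1073741833_1009) is added to invalidated blocks set",
]

for block, node in find_reversed_moves(sample):
    print(f"{block}: moved to {node}, then invalidated there")
```

A truncated move line, such as the one for blk_1073741829_1005 in the excerpt, does not match the move pattern, so that block would not be reported by this sketch.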