Date: Sun, 20 Nov 2016 07:07:59 +0000 (UTC)
From: "Wei-Chiu Chuang (JIRA)"
To: hdfs-dev@hadoop.apache.org
Subject: [jira] [Resolved] (HDFS-11155) VolumeScanner should report the latest generation stamp of a bad replica

[ https://issues.apache.org/jira/browse/HDFS-11155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang resolved HDFS-11155.
------------------------------------
Resolution: Not A Problem

The symptom described in this jira turns out to be part of HDFS-11160, which is its root cause.
So I am closing this issue in order to concentrate my fix on HDFS-11160.

> VolumeScanner should report the latest generation stamp of a bad replica
> ------------------------------------------------------------------------
>
>                 Key: HDFS-11155
>                 URL: https://issues.apache.org/jira/browse/HDFS-11155
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.7.4
>         Environment: CDH5.7.3
>            Reporter: Wei-Chiu Chuang
>            Assignee: Wei-Chiu Chuang
>
> HDFS-10512 fixed a race condition that caused VolumeScanner to terminate abruptly when it detected a corrupt replica that was being updated. However, when such a corrupt replica is detected, VolumeScanner still reports the old replica generation stamp to the NN. The NN then directs the DN to remove the older replica. Because the generation stamp has since been updated, the DN cannot find that replica, so the corrupt replica is never deleted.
> NN's log shows something similar to the following:
> {quote}
> 2016-11-17 21:08:05,350 INFO BlockStateChange: BLOCK NameSystem.addToCorruptReplicasMap: blk_1077571736 added as corrupt on 192.168.168.58:50010 by /192.168.168.58 because client machine reported it
> 2016-11-17 21:08:05,350 INFO BlockStateChange: BLOCK* invalidateBlock: blk_1077571736_3991953(stored=blk_1077571736_3992018) on 192.168.168.58:50010
> {quote}
> The DN's log has these:
> {noformat}
> 2016-11-17 21:08:04,815 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Appending to FinalizedReplica, blk_1077571736_3991953, FINALIZED
>   getNumBytes() = 39061752
>   getBytesOnDisk() = 39061752
>   getVisibleLength()= 39061752
>   getVolume() = /data/3/dfs/dn/current
>   getBlockFile() = /data/3/dfs/dn/current/BP-1092022411-192.168.168.55-1474407949037/current/finalized/subdir58/subdir112/blk_1077571736
> 2016-11-17 21:08:09,158 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Failed to delete replica blk_1077571736_3991953: ReplicaInfo not found.
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
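
For anyone checking their own NN logs for the same symptom, here is a minimal sketch of how the mismatch in the quoted invalidateBlock line could be spotted. It is not part of HDFS: the function name, the regular expression, and the NN_LINE constant are hypothetical helpers written for illustration, and the pattern assumes the line has the same shape as the one quoted in the issue (blk_<id>_<reported GS>(stored=blk_<id>_<stored GS>)).

```python
import re

# Assumed shape of the NN line quoted in the issue:
#   invalidateBlock: blk_<id>_<reported GS>(stored=blk_<id>_<stored GS>)
INVALIDATE_RE = re.compile(
    r"invalidateBlock: blk_(?P<block_id>\d+)_(?P<reported_gs>\d+)"
    r"\(stored=blk_(?P<stored_id>\d+)_(?P<stored_gs>\d+)\)"
)


def find_stale_invalidation(line):
    """Return (block_id, reported_gs, stored_gs) when the line asks to
    invalidate a replica whose generation stamp is older than the stored
    one; return None for any other line."""
    m = INVALIDATE_RE.search(line)
    if m is None or m.group("block_id") != m.group("stored_id"):
        return None
    reported = int(m.group("reported_gs"))
    stored = int(m.group("stored_gs"))
    if reported < stored:
        return (int(m.group("block_id")), reported, stored)
    return None


# The invalidateBlock line from the NN log quoted in the issue.
NN_LINE = (
    "2016-11-17 21:08:05,350 INFO BlockStateChange: BLOCK* invalidateBlock: "
    "blk_1077571736_3991953(stored=blk_1077571736_3992018) on 192.168.168.58:50010"
)

if __name__ == "__main__":
    print(find_stale_invalidation(NN_LINE))
```

A non-None result means the NN asked the DN to delete a generation stamp that the DN no longer holds, which matches the "ReplicaInfo not found" message in the DN log above.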