Date: Wed, 11 Dec 2013 10:58:18 +0000 (UTC)
From: "Hudson (JIRA)"
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Subject: [jira] [Commented] (HDFS-5504) In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, leads to NN safemode.


    [ https://issues.apache.org/jira/browse/HDFS-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13845293#comment-13845293 ]

Hudson commented on HDFS-5504:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk #418 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/418/])
Move HDFS-5257,HDFS-5427,HDFS-5443,HDFS-5476,HDFS-5425,HDFS-5474,HDFS-5504,HDFS-5428 into branch-2.3 section.
(jing9: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550011)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


> In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, leads to NN safemode.
> ------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-5504
>                 URL: https://issues.apache.org/jira/browse/HDFS-5504
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: snapshots
>    Affects Versions: 2.2.0
>            Reporter: Vinay
>            Assignee: Vinay
>             Fix For: 2.3.0
>
>         Attachments: HDFS-5504.patch, HDFS-5504.patch
>
>
> 1. HA installation; the standby NN is down.
> 2. Delete snapshot is called; it deletes the blocks from the blocksmap and from all datanodes. The log sync also happens.
> 3. The NN crashes before the next log roll.
> 4. When the namenode restarts, it loads the fsimage and the finalized edits from shared storage and sets the safemode threshold, which still includes the blocks from the deleted snapshot (the delete snapshot edit has not been read yet, because the namenode restarted before the last edits segment was finalized).
> 5. When it becomes active, it finalizes the edits and reads the delete snapshot edits_op, but at this point it does not reduce the safemode count, so it stays in safemode.
> 6. On the next restart, the edits are already finalized, so they are read at startup and the safemode threshold is set correctly.
> But one more restart will bring the NN out of safemode.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
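
The block accounting in steps 4 and 5 of the description can be sketched as a toy model. This is only an illustration and is not taken from the attached patch or from Hadoop's source: the class SafeModeSketch, its fields, its methods, the block counts (10 expected, 4 of them in the deleted snapshot) and the 0.999 threshold are all assumptions chosen for the example.

```java
// Toy model of the safemode block accounting described in the report.
// NOT Hadoop code: every name and number here is hypothetical.
final class SafeModeSketch {
    private long blockTotal;            // blocks the NameNode expects to see
    private long blockSafe;             // blocks reported by DataNodes so far
    private final double thresholdPct;  // fraction of blockTotal needed to leave safemode

    SafeModeSketch(long blockTotal, double thresholdPct) {
        this.blockTotal = blockTotal;
        this.blockSafe = 0;
        this.thresholdPct = thresholdPct;
    }

    // DataNodes report n blocks; never count more than the expected total.
    void reportBlocks(long n) {
        blockSafe = Math.min(blockTotal, blockSafe + n);
    }

    // Replay of a delete-snapshot edit. With adjustTotal == false the expected
    // total is left untouched, which models the behaviour the report describes.
    void applyDeleteSnapshot(long deletedBlocks, boolean adjustTotal) {
        if (adjustTotal) {
            blockTotal -= deletedBlocks;
        }
    }

    boolean inSafeMode() {
        long needed = (long) Math.ceil(blockTotal * thresholdPct);
        return blockSafe < needed;
    }

    // Step 4: total computed from fsimage + finalized edits (10 blocks).
    // DataNodes only hold 6, because 4 were already deleted in step 2.
    // Step 5: the delete-snapshot edit is replayed on becoming active.
    static boolean stuckAfterReplay(boolean adjustTotal) {
        SafeModeSketch nn = new SafeModeSketch(10, 0.999);
        nn.reportBlocks(6);
        nn.applyDeleteSnapshot(4, adjustTotal);
        return nn.inSafeMode();
    }

    public static void main(String[] args) {
        System.out.println("total not adjusted, in safemode: " + stuckAfterReplay(false));
        System.out.println("total adjusted, in safemode: " + stuckAfterReplay(true));
    }
}
```

In this model the first case prints true (6 reported blocks can never reach the 10 needed) and the second prints false (the expected total drops to 6 once the deleted snapshot's blocks are subtracted), which matches the symptom and the expected behaviour described in the report.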