Date: Wed, 28 Nov 2012 21:27:58 +0000 (UTC)
From: "Kihwal Lee (JIRA)"
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Message-ID: <823182620.35431.1354138078669.JavaMail.jiratomcat@arcas>
In-Reply-To: <1846770621.35312.1354136638615.JavaMail.jiratomcat@arcas>
Subject: [jira] [Commented] (HDFS-4233) NN keeps serving even after no journals started while rolling edit

    [ https://issues.apache.org/jira/browse/HDFS-4233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13505907#comment-13505907 ]

Kihwal Lee commented on HDFS-4233:
----------------------------------

Because the namenode kept serving as usual without logging any transactions, those transactions were lost after the restart. (Doing a saveNamespace might have helped.) When it was restarted, there were leases that did not belong to any file because of the lost state.
The namenode would blow up while trying to save the fsimage during start-up. I had to make a hot patch to get it going, which is being formalized and improved by Daryn in HDFS-4232.

> NN keeps serving even after no journals started while rolling edit
> ------------------------------------------------------------------
>
>                 Key: HDFS-4233
>                 URL: https://issues.apache.org/jira/browse/HDFS-4233
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>   Affects Versions: 0.23.5
>            Reporter: Kihwal Lee
>            Priority: Critical
>
> We've seen the namenode keep serving even after a rollEditLog() failure. Instead of taking corrective action or treating this condition as FATAL, it keeps serving and modifying its file system state. No logs are written from this point on, so if the namenode is restarted, there will be data loss.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira