Message-ID: <752932757.140471268074887511.JavaMail.jira@brutus.apache.org>
Date: Mon, 8 Mar 2010 19:01:27 +0000 (UTC)
From: "Todd Lipcon (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] Created: (HDFS-1029) Image corrupt with number of files = 1

Image corrupt with number of files = 1
--------------------------------------

                 Key: HDFS-1029
                 URL: https://issues.apache.org/jira/browse/HDFS-1029
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: name-node
    Affects Versions: 0.20.1
            Reporter: Todd Lipcon

Last week I recovered a
corrupt namenode image that was completely sane except that the "number of files" in the header was set to 1, rather than the correct number (many million). The NN in question had been running for some time, so I believe the 2NN uploaded this broken image as a checkpoint. After this point, of course, no further checkpoints occurred, and the NN failed to load its image upon restart.

Not sure how this happens - my only thought is that we may need to add synchronization on the nsCount field in INodeDirectoryWithQuota, but that's a long shot.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
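
For illustration only, here is a minimal Java sketch of what "adding synchronization" on a shared count field could look like. It is not the Hadoop code: the class NsCountSketch and its methods add, get, and countAfterConcurrentAdds are hypothetical names made up for this example. Whether the real nsCount field in INodeDirectoryWithQuota is actually read or written without a lock is exactly the open question in this report. The sketch only shows that guarding every read and write with the same monitor keeps a long counter consistent when several threads touch it.

```java
/**
 * Hypothetical stand-in for an object that tracks a namespace count.
 * This is NOT INodeDirectoryWithQuota; all names here are assumptions.
 */
public class NsCountSketch {
    // Shared counter; every access goes through a synchronized method.
    private long nsCount = 0;

    /** Adjust the count under the object's monitor. */
    public synchronized void add(long delta) {
        nsCount += delta;
    }

    /** Read the count under the same monitor, so the reader sees the latest value. */
    public synchronized long get() {
        return nsCount;
    }

    /** Run several writer threads and return the final count once all have finished. */
    static long countAfterConcurrentAdds(int threads, int addsPerThread) {
        NsCountSketch counter = new NsCountSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < addsPerThread; j++) {
                    counter.add(1);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        // With synchronized access the result equals threads * addsPerThread.
        System.out.println(countAfterConcurrentAdds(4, 100000));
    }
}
```

If the synchronized keyword is dropped from add and get, increments can be lost and a reader thread has no guarantee of seeing an up-to-date value. That is the general kind of race the long-shot theory above has in mind, though nothing here confirms it is the cause of the bad header.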