Return-Path:
Delivered-To: apmail-hadoop-hdfs-issues-archive@minotaur.apache.org
Received: (qmail 30216 invoked from network); 31 Mar 2010 06:54:50 -0000
Received: from unknown (HELO mail.apache.org) (140.211.11.3)
    by 140.211.11.9 with SMTP; 31 Mar 2010 06:54:50 -0000
Received: (qmail 70317 invoked by uid 500); 31 Mar 2010 06:54:50 -0000
Delivered-To: apmail-hadoop-hdfs-issues-archive@hadoop.apache.org
Received: (qmail 70138 invoked by uid 500); 31 Mar 2010 06:54:49 -0000
Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: hdfs-issues@hadoop.apache.org
Delivered-To: mailing list hdfs-issues@hadoop.apache.org
Received: (qmail 69846 invoked by uid 99); 31 Mar 2010 06:54:48 -0000
Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136)
    by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 31 Mar 2010 06:54:48 +0000
X-ASF-Spam-Status: No, hits=-1181.0 required=10.0
    tests=ALL_TRUSTED,AWL
X-Spam-Check-By: apache.org
Received: from [140.211.11.140] (HELO brutus.apache.org) (140.211.11.140)
    by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 31 Mar 2010 06:54:47 +0000
Received: from brutus.apache.org (localhost [127.0.0.1])
    by brutus.apache.org (Postfix) with ESMTP id 566DE234C4D1
    for ; Wed, 31 Mar 2010 06:54:27 +0000 (UTC)
Message-ID: <86297285.598261270018467352.JavaMail.jira@brutus.apache.org>
Date: Wed, 31 Mar 2010 06:54:27 +0000 (UTC)
From: "stack (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] Updated: (HDFS-1024) SecondaryNamenode fails to checkpoint because namenode fails with CancelledKeyException
In-Reply-To: <1911863287.106541267826007502.JavaMail.jira@brutus.apache.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

     [ https://issues.apache.org/jira/browse/HDFS-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HDFS-1024:
------------------------

    Attachment: HDFS-1024.patch.1-0.20.txt

A patch for 0.20. There doesn't seem to be any reason why 0.20 wouldn't have the same issue (and so in 0.20 the 2NN could hand back half an fsimage). Testing the patch now. Will report if all passes.

> SecondaryNamenode fails to checkpoint because namenode fails with CancelledKeyException
> ---------------------------------------------------------------------------------------
>
>                 Key: HDFS-1024
>                 URL: https://issues.apache.org/jira/browse/HDFS-1024
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.20.1, 0.20.2, 0.20.3, 0.21.0, 0.22.0
>            Reporter: dhruba borthakur
>            Assignee: Dmytro Molkov
>            Priority: Blocker
>             Fix For: 0.22.0
>
>         Attachments: HDFS-1024.patch, HDFS-1024.patch.1, HDFS-1024.patch.1-0.20.txt
>
>
> The secondary namenode fails to retrieve the entire fsimage from the Namenode. It fetches only part of the fsimage, but believes that it has fetched the entire file and proceeds with the checkpoint. Stack traces will be attached below.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.