Date: Tue, 10 May 2011 06:48:03 +0000 (UTC)
From: "Aaron T. Myers (JIRA)"
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Message-ID: <642148983.34055.1305010083311.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Updated] (HDFS-1378) Edit log replay should track and report file offsets in case of errors

     [ https://issues.apache.org/jira/browse/HDFS-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron T. Myers updated HDFS-1378:
---------------------------------

    Attachment: hdfs-1378.0.patch

The patch is a little difficult to review because I indented a big block of code and {{diff}} didn't figure that out very well. The only code I changed between the {{try}} and {{catch}} was to add these two lines:

{code}
+          recentOpcodeOffsets[numEdits % recentOpcodeOffsets.length] =
+              tracker.getPos();
{code}

> Edit log replay should track and report file offsets in case of errors
> ----------------------------------------------------------------------
>
>                 Key: HDFS-1378
>                 URL: https://issues.apache.org/jira/browse/HDFS-1378
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Aaron T. Myers
>             Fix For: 0.23.0
>
>         Attachments: hdfs-1378-branch20.txt, hdfs-1378.0.patch
>
>
> Occasionally there are bugs or operational mistakes that result in corrupt edit logs which I end up having to repair by hand. In these cases it would be very handy to have the error message also print out the file offsets of the last several edit log opcodes so it's easier to find the right place to edit in the OP_INVALID marker. We could also use this facility to provide a rough estimate of how far along edit log replay the NN is during startup (handy when a 2NN has died and replay takes a while)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
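
For readers without the attachment open, here is a rough, self-contained sketch of the idea behind the two added lines: keep the file offsets of the last few opcodes in a fixed-size circular buffer indexed by {{numEdits % length}}, and print them when replay fails. This is not the code from hdfs-1378.0.patch. The class name {{RecentOpcodeOffsets}}, its methods {{record}}, {{snapshot}} and {{describe}}, the buffer size of 4, and the offsets in {{main}} are all made up for illustration; in the patch the position comes from {{tracker.getPos()}}.

```java
// Hypothetical sketch only; names and values are assumptions, not the patch.
final class RecentOpcodeOffsets {
    private final long[] offsets;   // circular buffer of opcode start offsets
    private long numEdits = 0;      // total number of opcodes recorded so far

    RecentOpcodeOffsets(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        offsets = new long[capacity];
    }

    /** Record the file offset of one opcode, overwriting the oldest slot. */
    void record(long pos) {
        offsets[(int) (numEdits % offsets.length)] = pos;
        numEdits++;
    }

    /** Return the offsets still held in the buffer, oldest first. */
    long[] snapshot() {
        int n = (int) Math.min(numEdits, offsets.length);
        long[] out = new long[n];
        long first = numEdits - n;  // sequence number of the oldest kept entry
        for (int i = 0; i < n; i++) {
            out[i] = offsets[(int) ((first + i) % offsets.length)];
        }
        return out;
    }

    /** Format the kept offsets for inclusion in an error message. */
    String describe() {
        StringBuilder sb = new StringBuilder("Recent opcode offsets:");
        for (long off : snapshot()) {
            sb.append(' ').append(off);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        RecentOpcodeOffsets recent = new RecentOpcodeOffsets(4);
        long[] positions = {0, 17, 45, 102, 130, 171}; // made-up offsets
        try {
            for (long pos : positions) {
                recent.record(pos);
                if (pos == 171) {
                    // Stand-in for a failure while applying an opcode.
                    throw new IllegalStateException("simulated corrupt opcode");
                }
            }
        } catch (IllegalStateException e) {
            System.err.println(e.getMessage() + ". " + recent.describe());
        }
    }
}
```

With the made-up input above, the buffer has wrapped once by the time the simulated failure occurs, so the message lists only the last four offsets (45, 102, 130, 171), the last of which is where one would start looking when placing an OP_INVALID marker by hand.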