Date: Wed, 11 Jan 2012 06:12:39 +0000 (UTC)
From: "Todd Lipcon (Commented) (JIRA)"
To: hdfs-issues@hadoop.apache.org
Message-ID: <1473893380.28912.1326262359638.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <189573628.54949.1325292930729.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HDFS-2738) FSEditLog.selectinputStreams is reading through in-progress streams even when non-in-progress are requested

[
https://issues.apache.org/jira/browse/HDFS-2738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183899#comment-13183899 ]

Todd Lipcon commented on HDFS-2738:
-----------------------------------

+1, looks good to me. Thanks for making those changes.

> FSEditLog.selectinputStreams is reading through in-progress streams even when non-in-progress are requested
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-2738
>                 URL: https://issues.apache.org/jira/browse/HDFS-2738
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha, name-node
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Todd Lipcon
>            Assignee: Aaron T. Myers
>            Priority: Blocker
>         Attachments: HDFS-2738-HDFS-1623.patch, HDFS-2738-HDFS-1623.patch, HDFS-2738-HDFS-1623.patch
>
>
> The new code in HDFS-1580 is causing an issue with selectInputStreams in the HA context. When the active is writing to the shared edits, selectInputStreams is called on the standby. This ends up calling {{journalSet.getInputStream}} but doesn't pass the {{inProgressOk=false}} flag. So, {{getInputStream}} ends up reading and validating the in-progress stream unnecessarily. Since the validation results are no longer properly cached, {{findMaxTransaction}} also re-validates the in-progress stream, and then breaks the corruption check in this code. The end result is a lot of errors like:
> 2011-12-30 16:45:02,521 ERROR namenode.FileJournalManager (FileJournalManager.java:getNumberOfTransactions(266)) - Gap in transactions, max txnid is 579, 0 txns from 578
> 2011-12-30 16:45:02,521 INFO  ha.EditLogTailer (EditLogTailer.java:run(163)) - Got error, will try again.
> java.io.IOException: No non-corrupt logs for txid 578
> 	at org.apache.hadoop.hdfs.server.namenode.JournalSet.getInputStream(JournalSet.java:229)
> 	at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1081)
> 	at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:115)
> 	at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.access$0(EditLogTailer.java:100)
> 	at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:154)
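
For readers following the description above, here is a minimal sketch of the intended behaviour: when in-progress segments are not wanted, they are skipped before anything is read or validated. This is not the attached patch and not the Hadoop API. The class EditSegmentSelector, its Segment type, and the select method are hypothetical names made up for illustration, and the sketch assumes the segments arrive sorted by first transaction ID. The txids in the usage note are borrowed from the log lines above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; none of these names exist in Hadoop. */
final class EditSegmentSelector {

    /** One edit log segment covering transaction IDs firstTxId..lastTxId. */
    static final class Segment {
        final long firstTxId;
        final long lastTxId;
        final boolean inProgress;

        Segment(long firstTxId, long lastTxId, boolean inProgress) {
            this.firstTxId = firstTxId;
            this.lastTxId = lastTxId;
            this.inProgress = inProgress;
        }
    }

    /**
     * Returns the contiguous run of segments that starts exactly at fromTxId.
     * Assumes the input list is sorted by firstTxId.
     * When inProgressOk is false, in-progress segments are skipped up front,
     * so a caller never reads or validates them.
     */
    static List<Segment> select(List<Segment> segments, long fromTxId,
                                boolean inProgressOk) {
        List<Segment> selected = new ArrayList<>();
        long nextTxId = fromTxId;
        for (Segment s : segments) {
            if (s.inProgress && !inProgressOk) {
                continue; // caller asked for finalized segments only
            }
            if (s.firstTxId == nextTxId) {
                selected.add(s);
                nextTxId = s.lastTxId + 1;
            } else if (s.firstTxId > nextTxId) {
                break; // gap in transaction IDs; stop at the last contiguous segment
            }
            // segments that start before nextTxId are ignored
        }
        return selected;
    }
}
```

With a finalized segment covering txids 1 to 577 and an in-progress segment covering 578 to 579, select(segments, 578, false) returns an empty list instead of touching the in-progress segment, while select(segments, 578, true) returns that one segment. In this sketch an empty result simply means there is nothing new to tail yet, which a caller could treat as a normal condition and not an error.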