Date: Sat, 31 Dec 2011 00:55:30 +0000 (UTC)
From: "Todd Lipcon (Created) (JIRA)"
To: hdfs-dev@hadoop.apache.org
Reply-To: hdfs-dev@hadoop.apache.org
Message-ID: <1445717295.54950.1325292930764.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Created] (HDFS-2738) FSEditLog.selectinputStreams is reading through in-progress streams even when non-in-progress are requested

FSEditLog.selectinputStreams is reading through in-progress streams even when non-in-progress are requested
-----------------------------------------------------------------------------------------------------------

                 Key: HDFS-2738
                 URL: https://issues.apache.org/jira/browse/HDFS-2738
             Project: Hadoop HDFS
          Issue Type: Sub-task
          Components: ha, name-node
    Affects Versions: HA branch (HDFS-1623)
            Reporter: Todd Lipcon
            Assignee: Todd Lipcon
            Priority: Critical


The new code in HDFS-1580 is causing an issue with selectInputStreams in the HA context. When the active is writing to the shared edits, selectInputStreams is called on the standby. This ends up calling {{journalSet.getInputStream}} but doesn't pass the {{inProgressOk=false}} flag. So, {{getInputStream}} ends up reading and validating the in-progress stream unnecessarily. Since the validation results are no longer properly cached, {{findMaxTransaction}} also re-validates the in-progress stream, and then breaks the corruption check in this code. The end result is a lot of errors like:

2011-12-30 16:45:02,521 ERROR namenode.FileJournalManager (FileJournalManager.java:getNumberOfTransactions(266)) - Gap in transactions, max txnid is 579, 0 txns from 578
2011-12-30 16:45:02,521 INFO  ha.EditLogTailer (EditLogTailer.java:run(163)) - Got error, will try again.
java.io.IOException: No non-corrupt logs for txid 578
        at org.apache.hadoop.hdfs.server.namenode.JournalSet.getInputStream(JournalSet.java:229)
        at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1081)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:115)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.access$0(EditLogTailer.java:100)
        at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:154)
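
The report describes a flag that is not passed down, so an in-progress segment is validated even though the caller asked only for finalized ones. The following is a minimal sketch of that idea, not the actual FSEditLog or JournalSet code: the class InProgressFilterSketch, the Segment type, the validate counter, and selectSegments are all hypothetical names introduced for illustration. It shows how honoring an inProgressOk flag before validation avoids touching the in-progress segment at all.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class InProgressFilterSketch {

    /** Hypothetical stand-in for an edit log segment; not an HDFS class. */
    static final class Segment {
        final long firstTxId;
        final long lastTxId;      // -1 when unknown (segment still being written)
        final boolean inProgress;

        Segment(long firstTxId, long lastTxId, boolean inProgress) {
            this.firstTxId = firstTxId;
            this.lastTxId = lastTxId;
            this.inProgress = inProgress;
        }
    }

    /** Counts validation calls so that unnecessary work is visible. */
    static int validations = 0;

    /** Placeholder for an expensive scan of the segment; here it only counts. */
    static boolean validate(Segment s) {
        validations++;
        return true;
    }

    /**
     * Returns the segments that may contain transactions at or after fromTxId.
     * In-progress segments are skipped BEFORE validation when inProgressOk is false.
     */
    static List<Segment> selectSegments(List<Segment> all, long fromTxId,
                                        boolean inProgressOk) {
        List<Segment> result = new ArrayList<>();
        for (Segment s : all) {
            if (s.inProgress && !inProgressOk) {
                continue; // caller asked for finalized segments only
            }
            if (!s.inProgress && s.lastTxId < fromTxId) {
                continue; // finalized segment ends before the requested txid
            }
            if (validate(s)) {
                result.add(s);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Segment> all = Arrays.asList(
                new Segment(1, 577, false),   // finalized
                new Segment(578, -1, true));  // still being written by the active

        List<Segment> finalizedOnly = selectSegments(all, 578, false);
        System.out.println("selected=" + finalizedOnly.size()
                + " validations=" + validations);
    }
}
```

In this sketch a standby-style caller would pass inProgressOk=false, so the segment starting at txid 578 is never validated and an empty result simply means there is nothing new to read yet.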