Date: Fri, 30 Dec 2011 23:33:30 +0000 (UTC)
From: "Aaron T. Myers (Updated) (JIRA)"
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Message-ID: <66280852.54795.1325288010997.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <286931372.32898.1324421490691.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Updated] (HDFS-2709) HA: Appropriately handle error conditions in EditLogTailer

     [ https://issues.apache.org/jira/browse/HDFS-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron T. Myers updated HDFS-2709:
---------------------------------

    Attachment: HDFS-2709-HDFS-1623.patch

Here's a patch which addresses the first two points from above. I'm still working on adding some tests for {{TestFileJournalManager}}, but this is worth reviewing in the meantime.

> HA: Appropriately handle error conditions in EditLogTailer
> ----------------------------------------------------------
>
>                 Key: HDFS-2709
>                 URL: https://issues.apache.org/jira/browse/HDFS-2709
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha, name-node
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Todd Lipcon
>            Assignee: Aaron T. Myers
>            Priority: Critical
>         Attachments: HDFS-2709-HDFS-1623.patch, HDFS-2709-HDFS-1623.patch, HDFS-2709-HDFS-1623.patch, HDFS-2709-HDFS-1623.patch
>
>
> Currently if the edit log tailer experiences an error replaying edits in the middle of a file, it will go back to retrying from the beginning of the file on the next tailing iteration. This is incorrect since many of the edits will have already been replayed, and not all edits are idempotent.
> Instead, we either need to (a) support reading from the middle of a finalized file (i.e. skip those edits already applied), or (b) abort the standby if it hits an error while tailing.
> If "a" isn't simple, let's do "b" for now and come back to "a" later, since this is a rare circumstance and it is better to abort than to be incorrect.
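
For readers following the issue, here is a rough sketch of the two options described above. It is not the attached patch and does not use Hadoop's real classes: TailerSketch, Edit, EditApplier, Tailer, StandbyAbortException and the resumeFromMiddle flag are all hypothetical names invented for illustration, and the sketch assumes every edit carries a strictly increasing transaction ID. Option (a) is modeled as skipping any edit at or below the last applied transaction ID; option (b) is modeled as throwing an exception that stops the standby.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch only: these classes are NOT Hadoop's EditLogTailer or
 * its edit log APIs. Edits are modeled as (txId, op) pairs.
 */
public class TailerSketch {

    /** One logged edit, identified by a strictly increasing transaction ID. */
    public static final class Edit {
        public final long txId;
        public final String op;

        public Edit(long txId, String op) {
            this.txId = txId;
            this.op = op;
        }
    }

    /** Applies one edit to the namespace; may fail partway through a file. */
    public interface EditApplier {
        void apply(Edit edit) throws Exception;
    }

    /** Raised for option (b): stop the standby rather than risk a replay. */
    public static final class StandbyAbortException extends RuntimeException {
        public StandbyAbortException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    public static final class Tailer {
        private long lastAppliedTxId;
        private final boolean resumeFromMiddle;

        public Tailer(long lastAppliedTxId, boolean resumeFromMiddle) {
            this.lastAppliedTxId = lastAppliedTxId;
            this.resumeFromMiddle = resumeFromMiddle;
        }

        public long getLastAppliedTxId() {
            return lastAppliedTxId;
        }

        /** One tailing iteration over a finalized file, read from its start. */
        public void tail(List<Edit> finalizedFile, EditApplier applier) {
            for (Edit edit : finalizedFile) {
                if (edit.txId <= lastAppliedTxId) {
                    continue; // option (a): already applied, so never replay it
                }
                try {
                    applier.apply(edit);
                } catch (Exception e) {
                    if (!resumeFromMiddle) {
                        // option (b): abort instead of retrying from the start
                        throw new StandbyAbortException(
                            "error applying txid " + edit.txId, e);
                    }
                    return; // option (a): the next iteration resumes after lastAppliedTxId
                }
                lastAppliedTxId = edit.txId;
            }
        }
    }

    public static void main(String[] args) {
        List<Edit> file = List.of(
            new Edit(1, "mkdir"), new Edit(2, "rename"), new Edit(3, "delete"));
        List<Long> applied = new ArrayList<>();
        boolean[] failOnce = {true};
        EditApplier flaky = edit -> {
            if (edit.txId == 2 && failOnce[0]) {
                failOnce[0] = false;
                throw new Exception("simulated failure");
            }
            applied.add(edit.txId);
        };
        Tailer tailer = new Tailer(0, true);
        tailer.tail(file, flaky); // applies txid 1, then fails at txid 2
        tailer.tail(file, flaky); // skips txid 1, applies txids 2 and 3
        System.out.println(applied); // prints [1, 2, 3]
    }
}
```

With resumeFromMiddle set to false, the same failure surfaces as a StandbyAbortException on the first iteration, which corresponds to the "better to abort than be incorrect" fallback. Either way, no edit is applied twice.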