Date: Tue, 13 Dec 2011 03:05:32 +0000 (UTC)
From: "Aaron T. Myers (Updated) (JIRA)"
To: common-issues@hadoop.apache.org
Message-ID: <1044857844.4202.1323745532463.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1824706802.52459.1323323500158.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Updated] (HADOOP-7896) HA: if both NNs are in Standby mode, client needs to try failing back and forth several times with sleeps


     [ https://issues.apache.org/jira/browse/HADOOP-7896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron T. Myers updated HADOOP-7896:
-----------------------------------

    Attachment: HADOOP-7896-HDFS-1623.patch

Here's a patch which addresses the issue.

> HA: if both NNs are in Standby mode, client needs to try failing back and forth several times with sleeps
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-7896
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7896
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: ipc
>    Affects Versions: HA Branch (HDFS-1623)
>            Reporter: Todd Lipcon
>            Assignee: Aaron T. Myers
>            Priority: Critical
>         Attachments: HADOOP-7896-HDFS-1623.patch
>
>
> For a manual failover, there may be an intermediate state for a non-trivial amount of time where both NNs are in standby mode. Currently, the failover proxy will immediately failover on receiving this exception from the first NN, and when it hits the same exception on the second NN, it immediately fails. It should probably fail back and forth nearly indefinitely if both NNs are in Standby mode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
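
As a rough illustration of the behavior requested in the issue description (fail back and forth between the two NNs with sleeps, instead of giving up once both have reported Standby), here is a minimal sketch. It is not taken from the attached patch. `FailoverRetrySketch`, `callWithFailover`, the local `StandbyException` stand-in, the `Sleeper` interface, and every parameter are hypothetical names invented for this example, and the capped exponential backoff is an assumed policy rather than one stated in the issue.

```java
// Hypothetical sketch only: not the code in HADOOP-7896-HDFS-1623.patch.
// All class, method, and parameter names here are invented for illustration.
class FailoverRetrySketch {

    /** Local stand-in for the exception a NameNode in Standby mode returns. */
    static class StandbyException extends Exception {
        StandbyException(String message) { super(message); }
    }

    /** One RPC attempt against the NameNode with the given index. */
    interface NameNodeCall<T> {
        T invoke(int nnIndex) throws StandbyException;
    }

    /** Injected so callers (and tests) control how sleeping is done. */
    interface Sleeper {
        void sleep(long millis) throws InterruptedException;
    }

    static <T> T callWithFailover(NameNodeCall<T> call, int nnCount, int maxFailovers,
                                  long baseDelayMs, long maxDelayMs, Sleeper sleeper)
            throws StandbyException, InterruptedException {
        int current = 0;
        for (int failovers = 0; ; failovers++) {
            try {
                return call.invoke(current);
            } catch (StandbyException e) {
                if (failovers >= maxFailovers) {
                    throw e; // give up only after the configured number of failovers
                }
                if (failovers > 0) {
                    // The first failover is immediate; later ones back off
                    // exponentially (assumed policy), capped at maxDelayMs.
                    long delay = Math.min(maxDelayMs,
                            baseDelayMs << Math.min(failovers - 1, 20));
                    sleeper.sleep(delay);
                }
                current = (current + 1) % nnCount; // switch to the other NameNode
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulate both NNs being in Standby for the first three attempts.
        String result = callWithFailover(nn -> {
            if (calls[0]++ < 3) {
                throw new StandbyException("NN " + nn + " is in Standby");
            }
            return "served by NN " + nn;
        }, 2, 15, 10L, 1000L, Thread::sleep);
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```

Setting `maxFailovers` high approximates the "nearly indefinitely" behavior the description asks for, while the cap on the delay keeps the client from waiting arbitrarily long between attempts once an NN does become active.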