Date: Fri, 30 Mar 2012 22:37:28 +0000 (UTC)
From: "Hari Mankude (Commented) (JIRA)"
To: common-issues@hadoop.apache.org
Message-ID: <1463769845.40415.1333147048837.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <921246741.19972.1332800666695.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HADOOP-8217) Edge case split-brain race in ZK-based auto-failover

[ 
https://issues.apache.org/jira/browse/HADOOP-8217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242818#comment-13242818 ]

Hari Mankude commented on HADOOP-8217:
--------------------------------------

Todd,

I don't think zxid will fix the problem. The caveat is that I don't know the exact design that is being implemented here. Consider this scenario:

1. ZKFC1 goes into a GC pause and loses the active lock.
2. NN1 also goes into a GC pause. (NN1 was already active.)
3. ZKFC2 tries to call transitionToStandby() on NN1. The RPC times out.
4. I don't know what happens at this point in your design.
5. Assume ZKFC2 continues and makes NN2 active.
6. NN1 wakes up and assumes that it is still active.
7. Both NN1 and NN2 are now active.

Without some sort of persistent fencing across all shared resources, it will not work.

> Edge case split-brain race in ZK-based auto-failover
> ----------------------------------------------------
>
>                 Key: HADOOP-8217
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8217
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: auto-failover, ha
>    Affects Versions: 0.24.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hadoop-8217-testcase.txt
>
>
> As discussed in HADOOP-8206, the current design for automatic failover has the following race:
> - ZKFC1 gets active lock
> - ZKFC1 is about to send transitionToActive() and machine freezes (e.g. GC pause + swapping)
> - ZKFC1 loses its ZK lock, ZKFC2 gets ZK lock
> - ZKFC2 calls transitionToStandby on NN1, and transitions NN2 to active
> - ZKFC1 wakes up from pause, calls transitionToActive(), now we have a bad situation
> This is rare, since it requires ZKFC1 to freeze longer than its ZK session timeout, but worth fixing, since the results can be disastrous.
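
For illustration only, the scenario in the comment above (NN1 pauses while active, NN2 is made active, NN1 wakes up still believing it is active) can be replayed against a shared resource that enforces an epoch check. This is a minimal sketch of one common way to get "persistent fencing", not the design discussed in HADOOP-8217 and not Hadoop code. The names FencingSketch, SharedStorage, NameNode, fence, write and append are all hypothetical, and how the epoch number is issued is an assumption left out of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these classes or methods are Hadoop APIs.
final class FencingSketch {

    // Stands in for one shared resource, for example a shared edit log.
    static final class SharedStorage {
        // Assumption: a real system would persist this value durably.
        private long highestEpoch = 0;
        private final List<String> log = new ArrayList<>();

        // Called on behalf of a newly elected active; fences every older epoch.
        synchronized void fence(long epoch) {
            highestEpoch = Math.max(highestEpoch, epoch);
        }

        // Accepts the write only if the writer's epoch has not been fenced.
        synchronized boolean write(long epoch, String entry) {
            if (epoch < highestEpoch) {
                return false;
            }
            log.add(entry);
            return true;
        }

        // Returns a copy of the accepted entries.
        synchronized List<String> entries() {
            return new ArrayList<>(log);
        }
    }

    // Stands in for a NameNode that may hold a stale view of its own role.
    static final class NameNode {
        private final String name;
        private long epoch;
        private boolean believesActive;

        NameNode(String name) {
            this.name = name;
        }

        // Takes over as active under a new epoch and fences older writers.
        void becomeActive(long newEpoch, SharedStorage storage) {
            epoch = newEpoch;
            storage.fence(newEpoch);
            believesActive = true;
        }

        // Tries to write; the storage, not the node, has the final say.
        boolean append(String op, SharedStorage storage) {
            return believesActive && storage.write(epoch, name + ":" + op);
        }
    }

    public static void main(String[] args) {
        SharedStorage storage = new SharedStorage();
        NameNode nn1 = new NameNode("NN1");
        NameNode nn2 = new NameNode("NN2");

        nn1.becomeActive(1, storage);
        System.out.println("NN1 write before pause: " + nn1.append("op-a", storage));

        // NN1 pauses here. transitionToStandby() times out, so NN1 is never
        // told that it was demoted and still believes it is active.
        nn2.becomeActive(2, storage);
        System.out.println("NN2 write: " + nn2.append("op-b", storage));

        // NN1 wakes up and writes with its old epoch; the storage rejects it.
        System.out.println("NN1 write after pause: " + nn1.append("op-c", storage));
        System.out.println("Accepted entries: " + storage.entries());
    }
}
```

The check lives in the storage rather than in NN1, so it holds even though NN1 never learns that it lost the active role. It only protects resources that enforce it, which is why the comment asks for fencing across all shared resources.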