Date: Tue, 20 Dec 2011 01:13:31 +0000 (UTC)
From: "Jonathan Hsieh (Updated) (JIRA)"
To: issues@hbase.apache.org
Message-ID: <1303880934.29047.1324343611397.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1626791752.23610.1324176330796.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Updated] (HBASE-5063) RegionServers fail to report to backup HMaster after primary goes down.

     [ https://issues.apache.org/jira/browse/HBASE-5063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hsieh updated HBASE-5063:
----------------------------------

    Attachment:     (was: hbase-5063.v2.patch)

> RegionServers fail to report to backup HMaster after primary goes down.
> -----------------------------------------------------------------------
>
>                 Key: HBASE-5063
>                 URL: https://issues.apache.org/jira/browse/HBASE-5063
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.0
>            Reporter: Jonathan Hsieh
>            Assignee: Jonathan Hsieh
>            Priority: Critical
>         Attachments: HBASE-5063.patch, hbase-5063.v2.0.92.patch
>
>
> # Set up a cluster with two HMasters.
> # Observe that HM1 is up and that all RSs are in the RegionServer list on the web page.
> # Kill (not even -9) the active HMaster.
> # Wait for ZK to time out (default 3 minutes).
> # Observe that HM2 is now active. Tables may show up, but RegionServers never report on the web page. Existing connections are fine. New connections cannot find regionservers.
> Note:
> * If we bring up a new HM1 in the same place and kill HM2, the cluster functions normally again after recovery. This seems to indicate that regionservers are stuck trying to talk to the old HM1.
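
The reproduction steps and the note in the report quoted above suggest that each regionserver keeps talking to the master it first resolved. The sketch below is a toy Python model of that hypothesis only. It is not HBase code, it does not describe the attached patches, and every class, method, and parameter name in it (`Coordinator`, `Master`, `RegionServer`, `refresh_on_failure`, `simulate`) is hypothetical. It contrasts a server that keeps a cached master reference with one that looks up the active master again after a failed report.

```python
# Toy model of the suspected failure mode. Nothing here is HBase code;
# all names are hypothetical and chosen for illustration only.


class Coordinator:
    """Stands in for ZooKeeper: records which master is currently active."""

    def __init__(self, active):
        self.active = active


class Master:
    def __init__(self, name):
        self.name = name
        self.alive = True
        self.reported = set()  # names of servers that have reported in

    def report(self, server_name):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        self.reported.add(server_name)


class RegionServer:
    def __init__(self, name, coordinator, refresh_on_failure):
        self.name = name
        self.coordinator = coordinator
        self.refresh_on_failure = refresh_on_failure
        self.master = coordinator.active  # resolved once, at startup

    def heartbeat(self):
        """Report to the cached master; return True if the report succeeded."""
        try:
            self.master.report(self.name)
            return True
        except ConnectionError:
            if self.refresh_on_failure:
                # Look up the active master again instead of retrying
                # the cached (possibly dead) one.
                self.master = self.coordinator.active
            return False


def simulate(refresh_on_failure, rounds=3):
    """Replay the steps from the report; return servers known to HM2."""
    hm1, hm2 = Master("HM1"), Master("HM2")
    zk = Coordinator(active=hm1)
    rs = RegionServer("RS1", zk, refresh_on_failure)

    rs.heartbeat()       # RS1 shows up in HM1's list
    hm1.alive = False    # the active master is killed
    zk.active = hm2      # after the timeout, HM2 becomes active

    for _ in range(rounds):
        rs.heartbeat()
    return sorted(hm2.reported)


if __name__ == "__main__":
    print("cached master only:", simulate(refresh_on_failure=False))
    print("refresh on failure:", simulate(refresh_on_failure=True))
```

In this model the server that never refreshes its master reference stays invisible to HM2 no matter how many heartbeats it sends, which matches the reported symptom. The refreshing server loses one heartbeat and then reports to HM2.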