Date: Fri, 4 Mar 2011 20:47:45 +0000 (UTC)
From: "Ian Knome (JIRA)"
To: issues@hbase.apache.org
Message-ID: <1501882974.13.1299271665992.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1881997046.2340.1298917058029.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] Updated: (HBASE-3580) Remove RS from DeadServer when new instance checks in

     [ https://issues.apache.org/jira/browse/HBASE-3580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ian Knome updated HBASE-3580:
-----------------------------

    Attachment:
HBASE-3580_-_Remove_RS_from_dead_server_when_the_RS_when_new_instance_checks_in3.patch

1. Added test cases.
2. Fixed the logic as per jdcryans' suggestions.

> Remove RS from DeadServer when new instance checks in
> -----------------------------------------------------
>
>                 Key: HBASE-3580
>                 URL: https://issues.apache.org/jira/browse/HBASE-3580
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.0
>            Reporter: Jean-Daniel Cryans
>             Fix For: 0.90.2
>
>         Attachments: HBASE-3580-Remove-RS-from-DeadServer-when-new-instance-checks-in.patch, HBASE-3580_-_Remove_RS_from_dead_server_when_the_RS_when_new_instance_checks_in3.patch
>
>
> Keeping the servers in DeadServer until the list reaches some maximum isn't super friendly; it confuses even the best of our users:
> {quote}
> 09:27 < gbowyer> Hi all, I have apparently three dead RS in my cluster, I cannot find references to them in HDFS or in ZK, how do I still report dead RS
> 09:27 < gbowyer> also the same nodes are reported as live region servers
> {quote}
> The subtle startcode difference can be hard to catch. This behavior also differs from 0.20 (so old users get confused, like I did when debugging this problem) and from Hadoop's handling of dead DataNodes. It was introduced in HBASE-3282.
> I think this should be improved by doing what Hadoop does: removing the RS from DeadServers when a new instance with the same hostname+port checks in. Stack says we should do it in ServerManager.checkIsDead.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
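
For readers following the issue, here is a minimal sketch of the behavior proposed in the description: when a new instance checks in, drop any dead-server entry that has the same hostname+port but a different startcode. This is not the attached patch and not HBase's actual code. The class DeadServerSketch and its methods are hypothetical, and the "hostname,port,startcode" server-name layout is an assumption made for illustration. Keeping an exact-match entry in the dead list (so the same dead instance is not revived) is likewise an assumption of this sketch.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

/** Hypothetical stand-in for dead-server bookkeeping; not the real HBase class. */
final class DeadServerSketch {
    // Assumption: server names look like "hostname,port,startcode".
    private final Set<String> deadServers = new HashSet<>();

    void add(String serverName) {
        deadServers.add(serverName);
    }

    boolean isDeadServer(String serverName) {
        return deadServers.contains(serverName);
    }

    /** Strips the trailing ",startcode" so instances can be compared by hostname+port. */
    static String hostAndPort(String serverName) {
        int idx = serverName.lastIndexOf(',');
        return idx < 0 ? serverName : serverName.substring(0, idx);
    }

    /**
     * Called when a new instance checks in. Removes dead entries that share the
     * new instance's hostname+port but carry a different startcode.
     * An entry with the identical name is left alone: in this sketch that
     * instance is still considered dead.
     *
     * @return true if at least one stale entry was removed
     */
    boolean cleanPreviousInstance(String newServerName) {
        String key = hostAndPort(newServerName);
        boolean removed = false;
        for (Iterator<String> it = deadServers.iterator(); it.hasNext();) {
            String dead = it.next();
            if (!dead.equals(newServerName) && hostAndPort(dead).equals(key)) {
                it.remove();
                removed = true;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        DeadServerSketch dead = new DeadServerSketch();
        dead.add("rs1.example.org,60020,100"); // old instance, hypothetical name
        boolean cleaned = dead.cleanPreviousInstance("rs1.example.org,60020,200");
        System.out.println("stale entry removed: " + cleaned);
    }
}
```

In this sketch the check-in path would call cleanPreviousInstance before registering the new instance, so the same hostname+port no longer shows up as both live and dead.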