Date: Sat, 21 Apr 2012 07:16:39 +0000 (UTC)
From: "stack (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-5844) Delete the region servers znode after a regions server crash

[ https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13258796#comment-13258796 ]

stack commented on HBASE-5844:
------------------------------

If we get to a count of 100, we just continue the startup? Is that what you want?

{code}
+    while (!tracker.checkIfBaseNodeAvailable() && ++count<100) {
+      Thread.sleep(100);
+    }
{code}

Be like the rest of the code regarding spaces, i.e. spaces around operators...

+
+    if (fileName==null){

Maybe you don't need deleteMyEphemeralNodeOnDisk if you instead use http://docs.oracle.com/javase/6/docs/api/java/io/File.html#deleteOnExit() inside writeMyEphemeralNodeOnDisk?

Patch looks good N.

We upped the timeout because noobs would install hbase and then run big mapreduce jobs w/o tuning the JVM, and so hit big GCs. We figured they'd rather have their regionserver ride over the big pauses than have them be 'sensitive' out of the box.

> Delete the region servers znode after a regions server crash
> ------------------------------------------------------------
>
>                 Key: HBASE-5844
>                 URL: https://issues.apache.org/jira/browse/HBASE-5844
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver, scripts
>    Affects Versions: 0.96.0
>            Reporter: nkeywal
>            Assignee: nkeywal
>         Attachments: 5844.v1.patch
>
>
> Today, if the region server crashes, its znode is not deleted in ZooKeeper. So the recovery process will start only after a timeout, usually 30s.
> By deleting the znode in the start script, we remove this delay and the recovery starts immediately.
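
The two review points in the comment (a bounded wait that silently falls through, and File.deleteOnExit() as a replacement for an explicit delete method) could be sketched roughly as below. This is not the patch's code: the class StartupSketch, the helpers waitFor and writeNodeNameToDisk, and the example node name are hypothetical names chosen for illustration. The only real API relied on is the JDK's java.io.File and java.nio.file.Files.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.function.BooleanSupplier;

class StartupSketch {

  /**
   * Polls the condition at most maxAttempts times, sleeping between checks.
   * Returns true as soon as the condition holds, false if attempts run out,
   * so the caller has to decide what a timeout means instead of falling through.
   */
  static boolean waitFor(BooleanSupplier condition, int maxAttempts, long sleepMs)
      throws InterruptedException {
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      if (condition.getAsBoolean()) {
        return true;
      }
      Thread.sleep(sleepMs);
    }
    return false;
  }

  /**
   * Writes the node name to the given file and registers the file for deletion
   * on normal JVM termination. After an abnormal exit the file is left behind.
   */
  static File writeNodeNameToDisk(File file, String nodeName) throws IOException {
    Files.write(file.toPath(), nodeName.getBytes(StandardCharsets.UTF_8));
    file.deleteOnExit();
    return file;
  }

  public static void main(String[] args) throws Exception {
    // Hypothetical stand-in for tracker.checkIfBaseNodeAvailable():
    // becomes true on the third check.
    int[] checks = {0};
    boolean available = waitFor(() -> ++checks[0] >= 3, 100, 100);
    if (!available) {
      // Make the timeout explicit rather than continuing the startup.
      throw new IllegalStateException("base znode not available after 100 attempts");
    }

    // Hypothetical node name, for illustration only.
    File marker = writeNodeNameToDisk(
        File.createTempFile("rs-znode", ".txt"), "/hbase/rs/example-node");
    System.out.println("wrote " + marker.getName() + " after " + checks[0] + " checks");
  }
}
```

Per the File Javadoc, deleteOnExit() attempts deletion only on normal termination of the virtual machine. That seems to match the intent here: after a crash the file would survive for the start script to read, though whether it covers every exit path in the patch is an assumption.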