Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Date: Sat, 21 Apr 2012 07:16:39 +0000 (UTC)
From: "stack (JIRA)" <jira@apache.org>
To: issues@hbase.apache.org
Message-ID: 
 <1964550021.1273.1334992599534.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: 
 <640936386.9259.1334918317153.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HBASE-5844) Delete the region servers znode
 after a regions server crash
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13258796#comment-13258796 ] 

stack commented on HBASE-5844:
------------------------------

If the count reaches 100 we just continue the startup?  Is that what you want?

{code}
+    while (!tracker.checkIfBaseNodeAvailable() && ++count<100) {
+      Thread.sleep(100);
+    }
{code}
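
One way to make the give-up case explicit is to have the wait report whether the node ever showed up. The following is only a sketch under assumptions: BoundedWait and waitFor are hypothetical names, not part of the patch, and the condition stands in for tracker.checkIfBaseNodeAvailable().

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not from the patch: a bounded poll that tells the
// caller whether the condition ever held, instead of silently falling through.
public class BoundedWait {
  /**
   * Polls the condition up to maxAttempts times, sleeping sleepMs between
   * polls, then checks one final time. Returns true as soon as the condition
   * holds, false if the attempts run out or the thread is interrupted.
   */
  static boolean waitFor(BooleanSupplier condition, int maxAttempts, long sleepMs) {
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      if (condition.getAsBoolean()) {
        return true;
      }
      try {
        Thread.sleep(sleepMs);
      } catch (InterruptedException e) {
        // Restore the interrupt flag so the caller can still see it.
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return condition.getAsBoolean();
  }
}
```

The caller can then decide what a false return means: abort the startup, log a warning, or carry on deliberately.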

Match the rest of the code regarding spacing; i.e. put spaces around operators...


+
+    if (fileName==null){


Maybe you don't need deleteMyEphemeralNodeOnDisk if you instead use http://docs.oracle.com/javase/6/docs/api/java/io/File.html#deleteOnExit() inside writeMyEphemeralNodeOnDisk?
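
A rough sketch of what that could look like, assuming the method just writes the znode path to a local file. EphemeralNodeFile and writeZnodePath are hypothetical names for illustration; the real writeMyEphemeralNodeOnDisk in the patch may differ.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical stand-in for writeMyEphemeralNodeOnDisk; names and signature
// are assumptions made for this example only.
public class EphemeralNodeFile {
  /**
   * Writes znodePath into fileName and registers the file for deletion when
   * the JVM terminates normally. Returns the file that was written.
   */
  static File writeZnodePath(String fileName, String znodePath) throws IOException {
    File file = new File(fileName);
    Files.write(file.toPath(), znodePath.getBytes(StandardCharsets.UTF_8));
    // Cleanup is attempted only on normal JVM termination, so after a hard
    // crash the file is still there for a start script to read.
    file.deleteOnExit();
    return file;
  }
}
```

Since deleteOnExit() is attempted only on normal termination of the JVM, the file would survive a crash, which seems to be the case the start script cares about.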

Patch looks good N.

We upped the timeout because noobs would install hbase then run big mapreduce jobs w/o tuning the JVM, and so hit big GCs.  We figured they'd rather have their regionserver ride over the big pauses than have it be 'sensitive' out of the box.
                
> Delete the region servers znode after a regions server crash
> ------------------------------------------------------------
>
>                 Key: HBASE-5844
>                 URL: https://issues.apache.org/jira/browse/HBASE-5844
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver, scripts
>    Affects Versions: 0.96.0
>            Reporter: nkeywal
>            Assignee: nkeywal
>         Attachments: 5844.v1.patch
>
>
> Today, if a region server crashes, its znode is not deleted in ZooKeeper. So the recovery process will start only after a timeout, usually 30s.
> By deleting the znode in the start script, we remove this delay and the recovery starts immediately.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira