hbase-issues mailing list archives

From "Jean-Daniel Cryans (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5844) Delete the region servers znode after a regions server crash
Date Thu, 29 Nov 2012 21:36:59 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506832#comment-13506832 ]

Jean-Daniel Cryans commented on HBASE-5844:

I encountered another problem that I think is linked to this jira. I was trying to run HBase
from trunk without internet access and, as in my Sept 25th comment, I got an empty line after
start-hbase.sh, but this time nothing was running. The .log file shows nothing after the ulimit
output, and the .out file is empty. After tracing with bash -x, I figured out
that the nohup output was being suppressed. See:

jdcryans-MBPr:hbase-github jdcryans$ ./bin/start-hbase.sh 
jdcryans-MBPr:hbase-github jdcryans$
jdcryans-MBPr:hbase-github jdcryans$ bash -x ./bin/start-hbase.sh 
... some stuff then
+ /Users/jdcryans/git/hbase-github/bin/hbase-daemon.sh start master
jdcryans-MBPr:hbase-github jdcryans$ bash -x /Users/jdcryans/git/hbase-github/bin/hbase-daemon.sh start master
... more stuff
+ nohup /Users/jdcryans/git/hbase-github/bin/hbase-daemon.sh --config /Users/jdcryans/git/hbase-github/bin/../conf internal_start master
jdcryans-MBPr:hbase-github jdcryans$ nohup /Users/jdcryans/git/hbase-github/bin/hbase-daemon.sh --config /Users/jdcryans/git/hbase-github/bin/../conf internal_start master
appending output to nohup.out

So now I see that it's writing to nohup.out, which in turn tells me what really happened:

Caused by: java.lang.ClassNotFoundException: org.apache.zookeeper.KeeperException
	at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:247)

To reproduce, delete any jar listed in target/cached_classpath.txt.
In my case, I think the jar wasn't available because I had no internet connection.
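
A quick way to check for this condition is to list the classpath entries that are missing on disk. The following is only a sketch: the function name missing_classpath_entries is hypothetical, and it assumes target/cached_classpath.txt holds a classpath string whose entries are separated by ':' (or by newlines).

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the HBase scripts.
# Prints every entry of a classpath file that does not exist on disk.
# Assumption: entries are separated by ':' or by newlines.
missing_classpath_entries() {
  local classpath_file="$1"
  local entry
  # Turn ':' separators into newlines, then test each entry in turn.
  # The "|| [ -n ... ]" part handles a last entry with no trailing newline.
  tr ':' '\n' < "$classpath_file" | while IFS= read -r entry || [ -n "$entry" ]; do
    [ -z "$entry" ] && continue
    [ -e "$entry" ] || printf '%s\n' "$entry"
  done
}

# Example call (path taken from the comment above):
#   missing_classpath_entries target/cached_classpath.txt
```

Any path it prints is a jar the JVM will not find at startup, which would lead to a ClassNotFoundException like the one above.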

I wonder what other errors it could hide like this.
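
One way a launcher could avoid hiding early failures like this is to send the nohup output to an explicit file and check shortly afterwards whether the command has already exited. This is a sketch under stated assumptions, not what hbase-daemon.sh does: the function name start_and_verify, the ".exit" marker file, and the one-second wait are all made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of hbase-daemon.sh.
# Usage: start_and_verify <logfile> <command> [args...]
# Returns 1 if the command exits within the first second, 0 otherwise.
start_and_verify() {
  local logfile="$1"
  shift
  rm -f "$logfile.exit"
  # Run the command under nohup with stdout and stderr captured in
  # $logfile, and record its exit status in "$logfile.exit" once it ends.
  (
    nohup "$@" > "$logfile" 2>&1 < /dev/null
    echo "$?" > "$logfile.exit"
  ) > /dev/null 2>&1 &
  # Arbitrary grace period (an assumption); a real daemon may need longer.
  sleep 1
  if [ -f "$logfile.exit" ]; then
    echo "command exited early with status $(cat "$logfile.exit"); see $logfile" >&2
    return 1
  fi
  return 0
}

# Example call (the log path is a placeholder):
#   start_and_verify /tmp/master.out ./bin/hbase-daemon.sh --config ./conf internal_start master
```

With a check along these lines, a failure at startup is reported on the terminal with a pointer to the captured output instead of leaving an empty prompt.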
> Delete the region servers znode after a regions server crash
> ------------------------------------------------------------
>                 Key: HBASE-5844
>                 URL: https://issues.apache.org/jira/browse/HBASE-5844
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver, scripts
>    Affects Versions: 0.96.0
>            Reporter: nkeywal
>            Assignee: nkeywal
>             Fix For: 0.96.0
>         Attachments: 5844.v1.patch, 5844.v2.patch, 5844.v3.patch, 5844.v3.patch, 5844.v4.patch
> Today, if a region server crashes, its znode is not deleted in ZooKeeper, so the recovery
> process starts only after a timeout, usually 30s.
> By deleting the znode in the start script, we remove this delay and recovery starts immediately.
