hbase-issues mailing list archives

From "stack (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5666) RegionServer doesn't retry to check if base node is available
Date Tue, 10 Apr 2012 18:05:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250889#comment-13250889
] 

stack commented on HBASE-5666:
------------------------------

bq. If the client comes up during this time I think that should crash anyway because the HRegion
is still in the initialize() method...

You might try it?

bq. but recoverableZookeeper.exists() retries in case of CONNECTIONLOSS, SESSIONEXPIRED and
OPERATIONTIMEOUT.

That's fine, I'd say.  We want that.  We want it to actually get through the above and get to
zk to check whether the base node exists.

Otherwise I think the patch is good.  Does this method need to be public?

+  public boolean checkIfBaseNodeAvailable(int timeout) {
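
For illustration only, here is a minimal sketch of the kind of bounded retry being discussed: poll a "does the base node exist?" check until it succeeds or a timeout runs out. This is not the code from the attached patches. The class name BaseNodeWait, the method waitForBaseNode, and the BooleanSupplier standing in for the real ZooKeeper exists() call are all hypothetical. The helper is package-private here only because the sketch does not need more than that.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of HBase: retries an availability check
// until it passes or the timeout expires.
final class BaseNodeWait {

    private BaseNodeWait() {}

    /**
     * Polls baseNodeExists until it returns true or timeoutMs has elapsed.
     * baseNodeExists is an assumed stand-in for a ZooKeeper exists() check
     * on the base znode. Returns false on timeout or if interrupted.
     */
    static boolean waitForBaseNode(BooleanSupplier baseNodeExists,
                                   long timeoutMs, long intervalMs) {
        final long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (baseNodeExists.getAsBoolean()) {
                return true;
            }
            long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
            if (remainingMs <= 0) {
                return false; // gave up: base node never showed up in time
            }
            try {
                // Never sleep past the deadline.
                Thread.sleep(Math.max(1L, Math.min(intervalMs, remainingMs)));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
    }
}
```

A caller would pass its own existence check, for example waitForBaseNode(() -> nodeIsThere(), 10000, 200), where nodeIsThere() is again a placeholder. Checking first and sleeping afterwards means a region server that finds the node immediately pays no startup delay.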

                
> RegionServer doesn't retry to check if base node is available
> -------------------------------------------------------------
>
>                 Key: HBASE-5666
>                 URL: https://issues.apache.org/jira/browse/HBASE-5666
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver, zookeeper
>    Affects Versions: 0.92.1, 0.94.0, 0.96.0
>            Reporter: Matteo Bertozzi
>            Assignee: Matteo Bertozzi
>         Attachments: HBASE-5666-v1.patch, HBASE-5666-v2.patch, HBASE-5666-v3.patch, HBASE-5666-v4.patch,
HBASE-5666-v5.patch, HBASE-5666-v6.patch, HBASE-5666-v7.patch, hbase-1-regionserver.log, hbase-2-regionserver.log,
hbase-3-regionserver.log, hbase-master.log, hbase-regionserver.log, hbase-zookeeper.log
>
>
> I've a script that starts hbase and a couple of region servers in distributed mode (hbase.cluster.distributed
= true)
> {code}
> $HBASE_HOME/bin/start-hbase.sh
> $HBASE_HOME/bin/local-regionservers.sh start 1 2 3
> {code}
> but the region servers are not able to start...
> It seems that during RS startup the base znode is not yet available, and HRegionServer.initializeZooKeeper()
checks just once whether the base node is available.
> {code}
> 2012-03-28 21:54:05,013 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED:
Check the value configured in 'zookeeper.znode.parent'. There could be a mismatch with the
one configured in the master.
> 2012-03-28 21:54:08,598 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING
region server localhost,60202,1332964444824: Initialization of RS failed.  Hence aborting
RS.
> java.io.IOException: Received the shutdown message while waiting.
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.blockAndCheckIfStopped(HRegionServer.java:626)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.initializeZooKeeper(HRegionServer.java:596)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.preRegistrationInitialization(HRegionServer.java:558)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:672)
> 	at java.lang.Thread.run(Thread.java:662)
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
