hbase-issues mailing list archives

From "Gregory Chanan (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-4470) ServerNotRunningException coming out of assignRootAndMeta kills the Master
Date Tue, 17 Jul 2012 23:32:34 GMT

     [ https://issues.apache.org/jira/browse/HBASE-4470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Gregory Chanan updated HBASE-4470:

    Attachment: HBASE-4470-90.patch

Here's a patch against 90.  This adds a test (which I'll forward port to 92/94/96 if this
gets +1'ed) and small fixups that are 90-specific.  The fixups cause the test to pass, when
it failed previously.

I originally had a more complex test that actually threw the exception out of getHConnection,
but this was invasive (I had to restructure the code for Mockito purposes), and throwing it
out of get may be more resilient (i.e. we'll probably always call "get" when obtaining the meta
location, but may not always call getHConnection in the future).
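
For illustration only, here is a minimal sketch of the idea behind such a test. It is not the contents of the attached patch, and it uses a hand-written stub rather than Mockito. The names VerifyLocationSketch, RegionServer and verifyRegionLocation are hypothetical, the nested ServerNotRunningException is a local stand-in for the HBase class, and treating the exception as "location not verified" is an assumption about the desired behavior:

```java
import java.io.IOException;

public class VerifyLocationSketch {

    // Local stand-in for org.apache.hadoop.hbase.ipc.ServerNotRunningException (assumption).
    public static class ServerNotRunningException extends IOException {
        public ServerNotRunningException(String message) {
            super(message);
        }
    }

    // Hypothetical, minimal view of a region server: only "get" is modeled.
    public interface RegionServer {
        byte[] get(byte[] regionName, byte[] row) throws IOException;
    }

    // Hypothetical verification step: a server that is started but not yet
    // running means "location not verified" instead of a fatal error.
    // Other IOExceptions still propagate to the caller.
    public static boolean verifyRegionLocation(RegionServer server, byte[] regionName)
            throws IOException {
        try {
            server.get(regionName, new byte[0]);
            return true;
        } catch (ServerNotRunningException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Stub that throws out of "get", as the simpler test described above does.
        RegionServer notRunning = (region, row) -> {
            throw new ServerNotRunningException("Server is not running yet");
        };
        // Stub that answers normally.
        RegionServer running = (region, row) -> new byte[0];

        byte[] meta = ".META.".getBytes();
        System.out.println(verifyRegionLocation(notRunning, meta));
        System.out.println(verifyRegionLocation(running, meta));
    }
}
```

Stubbing at "get" keeps the test independent of how the connection to the server is obtained, which is the resilience argument made above.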
> ServerNotRunningException coming out of assignRootAndMeta kills the Master
> --------------------------------------------------------------------------
>                 Key: HBASE-4470
>                 URL: https://issues.apache.org/jira/browse/HBASE-4470
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.4
>            Reporter: Jean-Daniel Cryans
>            Assignee: Gregory Chanan
>            Priority: Critical
>             Fix For: 0.90.7
>         Attachments: HBASE-4470-90.patch
> I'm surprised we still have issues like that and I didn't get a hit while googling so forgive me if there's already a jira about it.
> When the master starts it verifies the locations of root and meta before assigning them. If the server is started but not running, you'll get this:
> {quote}
> 2011-09-23 04:47:44,859 WARN org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation: RemoteException connecting to RS
> org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.ipc.ServerNotRunningException: Server is not running yet
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1038)
>         at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:771)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
>         at $Proxy6.getProtocolVersion(Unknown Source)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:419)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:393)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC.getProxy(HBaseRPC.java:444)
>         at org.apache.hadoop.hbase.ipc.HBaseRPC.waitForProxy(HBaseRPC.java:349)
>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getHRegionConnection(HConnectionManager.java:969)
>         at org.apache.hadoop.hbase.catalog.CatalogTracker.getCachedConnection(CatalogTracker.java:388)
>         at org.apache.hadoop.hbase.catalog.CatalogTracker.getMetaServerConnection(CatalogTracker.java:287)
>         at org.apache.hadoop.hbase.catalog.CatalogTracker.verifyMetaRegionLocation(CatalogTracker.java:484)
>         at org.apache.hadoop.hbase.master.HMaster.assignRootAndMeta(HMaster.java:441)
>         at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:388)
>         at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:282)
> {quote}
> I hit that 3-4 times this week while debugging something else. The worst is that when you restart the master it sees that as a failover, but none of the regions are assigned so it takes an eternity to get back fully online.


