hbase-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-3446) ProcessServerShutdown fails if META moves, orphaning lots of regions
Date Mon, 17 Jan 2011 02:29:44 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12982453#action_12982453 ]

Todd Lipcon commented on HBASE-3446:
------------------------------------

After digging through the logs, I found the following:

2011-01-16 18:03:26,164 DEBUG org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: Offlined and split region usertable,user136857679,1295149082811.9f2822a04028c86813fe71264da5c167.; checking daughter presence
2011-01-16 18:03:26,169 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
org.apache.hadoop.ipc.RemoteException: java.io.IOException: Server not running
        at org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2360)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1754)
...
        at $Proxy6.openScanner(Unknown Source)
        at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:260)
        at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.isDaughterMissing(ServerShutdownHandler.java:256)
        at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.fixupDaughter(ServerShutdownHandler.java:214)
        at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.fixupDaughters(ServerShutdownHandler.java:196)
        at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.processDeadRegion(ServerShutdownHandler.java:181)
        at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:151)

Neither the MetaReader code nor the ServerShutdownHandler has any kind of retry/blocking
behavior built in on this code path. As a result, many of the regions on the dead server
were left unassigned.
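
To illustrate the kind of retry behavior that is missing, here is a minimal sketch. It is not HBase code and not a proposed patch: RetrySketch, callWithRetries, maxAttempts and pauseMillis are hypothetical names made up for this example. It assumes the failing call surfaces as an IOException, as the "Server not running" error in the trace above does.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration only; not part of HBase. */
final class RetrySketch {
    private RetrySketch() {}

    /**
     * Runs op, retrying on IOException up to maxAttempts times in total and
     * sleeping pauseMillis between attempts. Rethrows the last IOException
     * if every attempt fails. Other exceptions propagate immediately.
     */
    static <T> T callWithRetries(Callable<T> op, int maxAttempts, long pauseMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (IOException e) {
                // Assumed to be transient, e.g. the server hosting META is restarting.
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMillis);
                }
            }
        }
        throw last;
    }
}
```

A caller would wrap the catalog scan in the Callable, so that a META server that is briefly down delays the shutdown handling instead of failing it. A real fix would more likely block until META is reassigned and then relocate it, rather than retry against the same server on a fixed pause.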

> ProcessServerShutdown fails if META moves, orphaning lots of regions
> --------------------------------------------------------------------
>
>                 Key: HBASE-3446
>                 URL: https://issues.apache.org/jira/browse/HBASE-3446
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.0
>            Reporter: Todd Lipcon
>            Priority: Blocker
>
> I ran a rolling restart on a 5 node cluster with lots of regions, and afterwards had
> LOTS of regions left orphaned. The issue appears to be that ProcessServerShutdown failed because
> the server hosting META was restarted around the same time as another server was being processed.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

