hbase-issues mailing list archives

From "Samir Ahmic (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8912) [0.94] AssignmentManager throws IllegalStateException from PENDING_OPEN to OFFLINE
Date Fri, 18 Oct 2013 14:14:44 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13799126#comment-13799126
] 

Samir Ahmic commented on HBASE-8912:
------------------------------------

It looks like there are multiple scenarios that can trigger the AssignmentManager IllegalStateException
from PENDING_OPEN to OFFLINE. Here is my case (hbase-0.94.6.1): we updated the configuration
on two RS (5 total in the cluster) with a wrong value for hbase.client.keyvalue.maxsize (it was
set to 1 instead of -1). After restarting the cluster, the regionservers with the wrong value
started throwing exceptions like this:
{code}
2013-10-14 06:53:22,267 WARN org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Exception running postOpenDeployTasks; region=59badb0e2a41e7831162654227d32049
java.lang.IllegalArgumentException: KeyValue size too large
{code}
 which led to:
{code}
2013-10-14 06:53:22,271 INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed third_party_storages,,1342430453242.59badb0e2a41e7831162654227d32049.
{code}
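To illustrate why a value of 1 rejects essentially every write while -1 does not, here is a minimal sketch of a size guard. It assumes the limit is only enforced when it is positive (zero or less disables the check, as the property is documented). The class KeyValueSizeGuard and its methods are hypothetical names for illustration; this is not the HBase source.

```java
// Simplified stand-in for a KeyValue size guard; NOT HBase source code.
final class KeyValueSizeGuard {
    // Models the configured hbase.client.keyvalue.maxsize value.
    private final int maxKeyValueSize;

    KeyValueSizeGuard(int maxKeyValueSize) {
        this.maxKeyValueSize = maxKeyValueSize;
    }

    /** Throws if the limit is enabled (positive) and the entry is larger than it. */
    void validate(int keyValueLength) {
        // Assumption: a non-positive limit means "no limit".
        if (maxKeyValueSize > 0 && keyValueLength > maxKeyValueSize) {
            throw new IllegalArgumentException("KeyValue size too large");
        }
    }

    /** Convenience wrapper: true when validate() would accept the entry. */
    boolean accepts(int keyValueLength) {
        try {
            validate(keyValueLength);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        int sampleLength = 64; // hypothetical serialized size in bytes
        System.out.println("maxsize=-1 accepts: " + new KeyValueSizeGuard(-1).accepts(sampleLength));
        System.out.println("maxsize=1 accepts: " + new KeyValueSizeGuard(1).accepts(sampleLength));
    }
}
```

With a limit of 1 byte any realistic entry is over the limit, which would be consistent with the "KeyValue size too large" failure during postOpenDeployTasks on the misconfigured servers.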
At the same time, AssignmentManager tried to reassign the problematic region to other regionservers
and, after one more failed attempt, finally hit a server with the correct value of hbase.client.keyvalue.maxsize.
Here is the relevant log from that RS:

{code}
2013-10-14 06:53:26,390 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13e31052be39645 Attempting to transition node 59badb0e2a41e7831162654227d32049 from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING
2013-10-14 06:53:26,397 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13e31052be39645 Successfully transitioned node 59badb0e2a41e7831162654227d32049 from M_ZK_REGION_OFFLINE to RS_ZK_REGION_OPENING
........................

2013-10-14 06:53:26,459 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: regionserver:60020-0x13e31052be39645 Successfully transitioned node 59badb0e2a41e7831162654227d32049 from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
{code}

At the same time, AssignmentManager threw this exception and the master aborted:
{code}
2013-10-14 06:53:26,400 FATAL org.apache.hadoop.hbase.master.HMaster: Unexpected state : third_party_storages,,1342430453242.59badb0e2a41e7831162654227d32049. state=PENDING_OPEN, ts=1381748006399, server=rsdfw-10-177-161-197,60020,1381747996145.. Cannot transit it to OFFLINE.
java.lang.IllegalStateException: Unexpected state : third_party_storages,,1342430453242.59badb0e2a41e7831162654227d32049. state=PENDING_OPEN, ts=1381748006399, server=rsdfw-10-177-161-197.internal.personal.com,60020,1381747996145 .. Cannot transit it to OFFLINE.
        at org.apache.hadoop.hbase.master.AssignmentManager.setOfflineInZooKeeper(AssignmentManager.java:1820)
        at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1659)
        at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1424)
        at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1399)
        at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1394)
        at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:105)
        at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)
2013-10-14 06:53:26,400 INFO org.apache.hadoop.hbase.master.HMaster: Aborting
{code}
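One possible reading of the timestamps above is that a close event from an earlier failed open attempt was handled just after the region had already been moved to PENDING_OPEN for the next server. The toy model below sketches that ordering. It assumes the master only lets regions in CLOSED or OFFLINE state be set to OFFLINE; that rule, the class RegionStateGuard, and its methods are illustrative assumptions and are not taken from the HBase source.

```java
import java.util.EnumSet;
import java.util.Set;

// Toy model of a master-side region state guard; names are illustrative, NOT HBase source.
final class RegionStateGuard {
    enum State { OFFLINE, PENDING_OPEN, OPENING, OPEN, CLOSED }

    // Assumption: only these in-memory states may be forced back to OFFLINE.
    private static final Set<State> MAY_GO_OFFLINE = EnumSet.of(State.CLOSED, State.OFFLINE);

    private State state;

    RegionStateGuard(State initial) {
        this.state = initial;
    }

    State current() {
        return state;
    }

    /** Unguarded setter, used here only to simulate the assign path. */
    void set(State next) {
        state = next;
    }

    /** Refuses the transition to OFFLINE unless the current state allows it. */
    void setOffline() {
        if (!MAY_GO_OFFLINE.contains(state)) {
            throw new IllegalStateException(
                "Unexpected state : state=" + state + " .. Cannot transit it to OFFLINE.");
        }
        state = State.OFFLINE;
    }

    public static void main(String[] args) {
        RegionStateGuard region = new RegionStateGuard(State.CLOSED);
        region.setOffline();            // reassignment starts: CLOSED -> OFFLINE
        region.set(State.PENDING_OPEN); // open request sent to the next server
        try {
            region.setOffline();        // a late close event is handled now
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If this reading is right, the failure depends on the order in which the two handlers run, which would fit with the same exception showing up in an intermittently failing test.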

Hope this will help.



> [0.94] AssignmentManager throws IllegalStateException from PENDING_OPEN to OFFLINE
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-8912
>                 URL: https://issues.apache.org/jira/browse/HBASE-8912
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>             Fix For: 0.94.13
>
>         Attachments: HBase-0.94 #1036 test - testRetrying [Jenkins].html
>
>
> AM throws this exception which subsequently causes the master to abort: 
> {code}
> java.lang.IllegalStateException: Unexpected state : testRetrying,jjj,1372891751115.9b828792311001062a5ff4b1038fe33b. state=PENDING_OPEN, ts=1372891751912, server=hemera.apache.org,39064,1372891746132 .. Cannot transit it to OFFLINE.
> 	at org.apache.hadoop.hbase.master.AssignmentManager.setOfflineInZooKeeper(AssignmentManager.java:1879)
> 	at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1688)
> 	at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1424)
> 	at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1399)
> 	at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1394)
> 	at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:105)
> 	at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
> 	at java.lang.Thread.run(Thread.java:662)
> {code}
> This exception trace is from the failing test TestMetaReaderEditor which is failing pretty frequently, but looking at the test code, I think this is not a test-only issue, but affects the main code path.
> https://builds.apache.org/job/HBase-0.94/1036/testReport/junit/org.apache.hadoop.hbase.catalog/TestMetaReaderEditor/testRetrying/



--
This message was sent by Atlassian JIRA
(v6.1#6144)
