hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8912) [0.94] AssignmentManager throws IllegalStateException from PENDING_OPEN to OFFLINE
Date Thu, 02 Jan 2014 19:41:52 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13860653#comment-13860653
] 

Lars Hofhansl commented on HBASE-8912:
--------------------------------------

Thanks [~jxiang]. The idea was that we remove the region from RITs *before* we transition
the znode. So there should no race anymore since the master cannot know about the FAILED_OPEN
before the region was removed from RITs, and since the master also avoids concurrent assigns
of the same region now.

Cool. I'll wait for an hour or two and then commit and resolve.


> [0.94] AssignmentManager throws IllegalStateException from PENDING_OPEN to OFFLINE
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-8912
>                 URL: https://issues.apache.org/jira/browse/HBASE-8912
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>            Priority: Critical
>             Fix For: 0.94.16
>
>         Attachments: 8912-0.94-alt2.txt, 8912-0.94.txt, 8912-fix-race.txt, HBase-0.94
#1036 test - testRetrying [Jenkins].html, log.txt, org.apache.hadoop.hbase.catalog.TestMetaReaderEditor-output.txt
>
>
> AM throws this exception which subsequently causes the master to abort: 
> {code}
> java.lang.IllegalStateException: Unexpected state : testRetrying,jjj,1372891751115.9b828792311001062a5ff4b1038fe33b.
state=PENDING_OPEN, ts=1372891751912, server=hemera.apache.org,39064,1372891746132 .. Cannot
transit it to OFFLINE.
> 	at org.apache.hadoop.hbase.master.AssignmentManager.setOfflineInZooKeeper(AssignmentManager.java:1879)
> 	at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1688)
> 	at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1424)
> 	at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1399)
> 	at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1394)
> 	at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:105)
> 	at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
> 	at java.lang.Thread.run(Thread.java:662)
> {code}
> This exception trace is from the failing test TestMetaReaderEditor which is failing pretty
frequently, but looking at the test code, I think this is not a test-only issue, but affects
the main code path. 
> https://builds.apache.org/job/HBase-0.94/1036/testReport/junit/org.apache.hadoop.hbase.catalog/TestMetaReaderEditor/testRetrying/



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

Mime
View raw message