hbase-issues mailing list archives

From "Virag Kothari (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-12667) Deadlock in AssignmentManager
Date Wed, 10 Dec 2014 09:27:12 GMT

    [ https://issues.apache.org/jira/browse/HBASE-12667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14240849#comment-14240849 ]

Virag Kothari commented on HBASE-12667:
---------------------------------------

AssignmentManager.getSnapShotOfAssignment no longer exists in the latest 0.98, so this specific
deadlock may not occur there. However, other scenarios could still produce the same kind of
deadlock. HBASE-11290 removes the coarse locking on region states and region plans and
avoids such deadlocks by acquiring region locks. So maybe this JIRA can be folded into
HBASE-11290.
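
To illustrate the idea of replacing the two coarse monitors with region locks, here is a
minimal sketch. It is not the HBASE-11290 patch: the class RegionLockSketch, its methods,
the state strings and the "server-for-" placeholder are all hypothetical, and it assumes
one lock per region name guards both the state and the plan of that region.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch, not HBase code: one lock per region instead of
// one monitor for all region states and another for all region plans.
class RegionLockSketch {
    private final ConcurrentMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();
    private final ConcurrentMap<String, String> regionStates = new ConcurrentHashMap<>();
    private final ConcurrentMap<String, String> regionPlans = new ConcurrentHashMap<>();

    private ReentrantLock lockFor(String region) {
        return locks.computeIfAbsent(region, r -> new ReentrantLock());
    }

    // Assign path: creates a plan if none exists and records the state.
    // Only the lock of this one region is held while doing so.
    String assign(String region) {
        ReentrantLock lock = lockFor(region);
        lock.lock();
        try {
            String server = regionPlans.computeIfAbsent(region, r -> "server-for-" + r);
            regionStates.put(region, "PENDING_OPEN");
            return server;
        } finally {
            lock.unlock();
        }
    }

    // Online path: records the state and clears the plan under the same single lock,
    // so no thread ever holds one shared lock while waiting for a second one.
    void regionOnline(String region) {
        ReentrantLock lock = lockFor(region);
        lock.lock();
        try {
            regionStates.put(region, "OPEN");
            regionPlans.remove(region);
        } finally {
            lock.unlock();
        }
    }

    String stateOf(String region) {
        return regionStates.get(region);
    }

    boolean hasPlan(String region) {
        return regionPlans.containsKey(region);
    }

    public static void main(String[] args) {
        RegionLockSketch am = new RegionLockSketch();
        System.out.println(am.assign("r1"));
        am.regionOnline("r1");
        System.out.println(am.stateOf("r1") + " hasPlan=" + am.hasPlan("r1"));
    }
}
```

Because each code path takes exactly one lock, the circular wait seen in the reported
traces cannot form between these two methods. Operations that span many regions would
still need their own strategy, which this sketch does not cover.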

> Deadlock in AssignmentManager
> -----------------------------
>
>                 Key: HBASE-12667
>                 URL: https://issues.apache.org/jira/browse/HBASE-12667
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.98.0
>            Reporter: zhaoyunjiong
>
> There is no lock ordering between regionPlans and regionStates, which caused a deadlock.
> Trunk doesn't have the problem since it has already been refactored.
> "master:phxhshdc11en0004:60000":
>         at org.apache.hadoop.hbase.master.AssignmentManager.clearRegionPlan(AssignmentManager.java:2898)
>         - waiting to lock <0x000000048cefe520> (a java.util.TreeMap)
>         at org.apache.hadoop.hbase.master.AssignmentManager.regionOnline(AssignmentManager.java:1286)
>         at org.apache.hadoop.hbase.master.AssignmentManager.handleRegionSplitting(AssignmentManager.java:3552)
>         - locked <0x000000048cf6fc10> (a org.apache.hadoop.hbase.master.RegionStates)
>         at org.apache.hadoop.hbase.master.AssignmentManager.processRegionsInTransition(AssignmentManager.java:732)
>         at org.apache.hadoop.hbase.master.AssignmentManager.processRegionInTransition(AssignmentManager.java:601)
>         at org.apache.hadoop.hbase.master.AssignmentManager.processDeadServersAndRecoverLostRegions(AssignmentManager.java:2851)
>         at org.apache.hadoop.hbase.master.AssignmentManager.processDeadServersAndRegionsInTransition(AssignmentManager.java:519)
>         at org.apache.hadoop.hbase.master.AssignmentManager.joinCluster(AssignmentManager.java:459)
>         at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:900)
>         at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:609)
>         at java.lang.Thread.run(Thread.java:744)
> "AM.-pool1-t10":
>         at org.apache.hadoop.hbase.master.RegionStates.getRegionAssignments(RegionStates.java:154)
>         - waiting to lock <0x000000048cf6fc10> (a org.apache.hadoop.hbase.master.RegionStates)
>         at org.apache.hadoop.hbase.master.AssignmentManager.getSnapShotOfAssignment(AssignmentManager.java:3610)
>         at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.getRegionAssignmentsByServer(BaseLoadBalancer.java:1146)
>         at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.createCluster(BaseLoadBalancer.java:959)
>         at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.randomAssignment(BaseLoadBalancer.java:1010)
>         at org.apache.hadoop.hbase.master.AssignmentManager.getRegionPlan(AssignmentManager.java:2209)
>         - locked <0x000000048cefe520> (a java.util.TreeMap)
>         at org.apache.hadoop.hbase.master.AssignmentManager.getRegionPlan(AssignmentManager.java:2166)
>         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1886)
>         at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1445)
>         at org.apache.hadoop.hbase.master.AssignCallable.call(AssignCallable.java:45)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:744)
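
The two traces quoted above show a lock-order inversion: the master thread holds the
RegionStates monitor and waits for the regionPlans TreeMap, while the AM pool thread holds
the TreeMap and waits for RegionStates. The sketch below reproduces that pattern in
isolation and detects it with the JDK's ThreadMXBean. The class LockInversionSketch, the
thread names and the plain Object monitors are hypothetical stand-ins, not HBase code.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

// Hypothetical reproduction of the inversion; the two monitors stand in for
// the RegionStates object and the regionPlans TreeMap from the traces.
class LockInversionSketch {

    // Starts two daemon threads that take the monitors in opposite order and
    // returns how many threads the JVM reports as monitor-deadlocked.
    static int countDeadlockedThreads() {
        final Object regionStates = new Object();
        final Object regionPlans = new Object();
        final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        Thread master = new Thread(() -> {
            synchronized (regionStates) {           // like handleRegionSplitting
                awaitOther(bothHoldFirstLock);
                synchronized (regionPlans) {        // like clearRegionPlan
                    // never reached once the other thread holds regionPlans
                }
            }
        }, "master-sketch");

        Thread worker = new Thread(() -> {
            synchronized (regionPlans) {            // like getRegionPlan
                awaitOther(bothHoldFirstLock);
                synchronized (regionStates) {       // like getRegionAssignments
                    // never reached once the other thread holds regionStates
                }
            }
        }, "am-pool-sketch");

        // Daemon threads stay blocked forever but do not keep the JVM alive.
        master.setDaemon(true);
        worker.setDaemon(true);
        master.start();
        worker.start();

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        try {
            for (int i = 0; i < 100; i++) {
                long[] ids = mx.findMonitorDeadlockedThreads();
                if (ids != null) {
                    return ids.length;
                }
                Thread.sleep(50);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return 0;
    }

    // Ensures each thread already holds its first monitor before trying the second.
    private static void awaitOther(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + countDeadlockedThreads());
    }
}
```

The latch only makes the interleaving deterministic for the demonstration; in the
reported case the same interleaving happened by timing between master initialization and
an assign task.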



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
