hbase-issues mailing list archives

From "Rajeshbabu Chintaguntla (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-12901) Possible deadlock while onlining a region and get region plan for other region run parallel
Date Thu, 22 Jan 2015 02:13:35 GMT
Rajeshbabu Chintaguntla created HBASE-12901:
-----------------------------------------------

             Summary: Possible deadlock while onlining a region and get region plan for other region run parallel
                 Key: HBASE-12901
                 URL: https://issues.apache.org/jira/browse/HBASE-12901
             Project: HBase
          Issue Type: Bug
            Reporter: Rajeshbabu Chintaguntla
            Assignee: Rajeshbabu Chintaguntla
            Priority: Critical
             Fix For: 1.0.0, 1.1.0


A deadlock can occur when a region's state is updated (regionOnline) after its assignment
completes while, in parallel, a region plan is being fetched for another region. Before onlining
a region we synchronize on regionStates and, inside that block, synchronize on regionPlans to
clear the region plan. At the same time, the plan lookup may first synchronize on regionPlans
and then on regionStates while getting the assignments of a server. The two paths take the same
two locks in opposite order. This started after the HBASE-12686 fix. The issue is present
in branch-1 and branch-1.1 only.
{code}
"AM.-pool1-t33":
	at org.apache.hadoop.hbase.master.AssignmentManager.clearRegionPlan(AssignmentManager.java:2917)
	- waiting to lock <0x00000000d0147f70> (a java.util.TreeMap)
	at org.apache.hadoop.hbase.master.AssignmentManager.regionOffline(AssignmentManager.java:3617)
	at org.apache.hadoop.hbase.master.AssignmentManager.regionOffline(AssignmentManager.java:1402)
	at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1734)
	at org.apache.hadoop.hbase.master.AssignmentManager.forceRegionStateToOffline(AssignmentManager.java:1821)
	at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1456)
	at org.apache.hadoop.hbase.master.AssignCallable.call(AssignCallable.java:45)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
"AM.-pool1-t29":
	at org.apache.hadoop.hbase.master.RegionStates.getRegionAssignments(RegionStates.java:155)
	- waiting to lock <0x00000000d010b250> (a org.apache.hadoop.hbase.master.RegionStates)
	at org.apache.hadoop.hbase.master.AssignmentManager.getSnapShotOfAssignment(AssignmentManager.java:3629)
	at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.getRegionAssignmentsByServer(BaseLoadBalancer.java:1146)
	at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.createCluster(BaseLoadBalancer.java:959)
	at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.randomAssignment(BaseLoadBalancer.java:1010)
	at org.apache.hadoop.hbase.master.AssignmentManager.getRegionPlan(AssignmentManager.java:2228)
	- locked <0x00000000d0147f70> (a java.util.TreeMap)
	at org.apache.hadoop.hbase.master.AssignmentManager.getRegionPlan(AssignmentManager.java:2185)
	at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1905)
	at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1464)
	at org.apache.hadoop.hbase.master.AssignCallable.call(AssignCallable.java:45)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
"AM.ZK.Worker-pool2-t41":
	at org.apache.hadoop.hbase.master.AssignmentManager.clearRegionPlan(AssignmentManager.java:2917)
	- waiting to lock <0x00000000d0147f70> (a java.util.TreeMap)
	at org.apache.hadoop.hbase.master.AssignmentManager.regionOnline(AssignmentManager.java:1305)
	at org.apache.hadoop.hbase.master.AssignmentManager$4.run(AssignmentManager.java:1196)
	- locked <0x00000000d010b250> (a org.apache.hadoop.hbase.master.RegionStates)
	at org.apache.hadoop.hbase.master.AssignmentManager$3.run(AssignmentManager.java:1142)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
{code}
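
In the dump above, AM.ZK.Worker-pool2-t41 holds the RegionStates monitor and waits for the TreeMap, while AM.-pool1-t29 holds the TreeMap and waits for RegionStates. One common way to avoid this kind of inversion is to have every path take the two locks in the same order. The following is only a minimal sketch of that general technique, not the patch for this issue: the class LockOrderSketch, its fields, and its methods are hypothetical stand-ins, not HBase code.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: both code paths acquire regionStates first and
// regionPlans second, so neither thread can hold one lock while waiting
// for the other in the opposite order.
final class LockOrderSketch {
    // Stand-ins for the two monitors seen in the thread dump (assumed names).
    private final Object regionStates = new Object();
    private final Map<String, String> regionPlans = new TreeMap<>();
    private int onlined = 0;
    private int planned = 0;

    // Onlining path: regionStates, then regionPlans (to clear the plan).
    void regionOnline(String region) {
        synchronized (regionStates) {
            synchronized (regionPlans) {
                regionPlans.remove(region);
                onlined++;
            }
        }
    }

    // Plan lookup path, using the same order instead of the inverted one.
    String getRegionPlan(String region) {
        synchronized (regionStates) {
            synchronized (regionPlans) {
                planned++;
                return regionPlans.computeIfAbsent(region, r -> "server-for-" + r);
            }
        }
    }

    // Runs both paths concurrently and returns {onlined, planned}.
    static int[] runDemo(int iterations) {
        LockOrderSketch s = new LockOrderSketch();
        Thread onliner = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                s.regionOnline("region-" + (i % 4));
            }
        });
        Thread planner = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                s.getRegionPlan("region-" + (i % 4));
            }
        });
        onliner.start();
        planner.start();
        try {
            onliner.join();
            planner.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return new int[] { s.onlined, s.planned };
    }

    public static void main(String[] args) {
        int[] counts = runDemo(10_000);
        System.out.println(counts[0] + " onlined, " + counts[1] + " planned");
    }
}
```

Swapping the two synchronized blocks in only one of the methods would recreate the inversion shown in the dump. An alternative with the same effect is to avoid nesting altogether, for example by collecting the server assignments before entering the regionPlans block.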



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
