hbase-issues mailing list archives

From "Virag Kothari (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11793) RegionStates shouldn't be locked while writing to META
Date Thu, 21 Aug 2014 00:09:28 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14104830#comment-14104830 ]

Virag Kothari commented on HBASE-11793:
---------------------------------------

There are multiple ways to address this:
1) Do nothing here and let this be covered by HBASE-11290.
2) Release the lock on RegionStates while RegionStateStore is updating META (this seems to happen
only during serverOffline()).
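
To make option 2) concrete, here is a minimal sketch of the idea: update the in-memory state under the lock, then do the META write after the lock is released. This is not the HBase code. RegionStatesSketch, MetaWriter and ServerOfflineDemo are hypothetical names invented for illustration, and the real RegionStates/RegionStateStore classes have different fields and signatures.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for whatever persists region state to META. */
interface MetaWriter {
    // Assumed to be slow: it may block or retry for a long time.
    void write(String region, String state);
}

/** Simplified, hypothetical model of RegionStates; not the HBase class. */
class RegionStatesSketch {
    private final Map<String, String> regionToServer = new HashMap<>();
    private final Map<String, String> regionToState = new HashMap<>();
    private final MetaWriter metaWriter;

    RegionStatesSketch(MetaWriter metaWriter) {
        this.metaWriter = metaWriter;
    }

    synchronized void regionOnline(String region, String server) {
        regionToServer.put(region, server);
        regionToState.put(region, "OPEN");
    }

    synchronized String getState(String region) {
        return regionToState.get(region);
    }

    /** Marks the dead server's regions offline, then persists them without holding the lock. */
    List<String> serverOffline(String server) {
        List<String> offlined = new ArrayList<>();
        synchronized (this) {
            // Only in-memory bookkeeping happens while the monitor is held.
            for (Map.Entry<String, String> e : regionToServer.entrySet()) {
                if (e.getValue().equals(server)) {
                    regionToState.put(e.getKey(), "OFFLINE");
                    offlined.add(e.getKey());
                }
            }
            for (String region : offlined) {
                regionToServer.remove(region);
            }
        }
        // The slow write runs after the monitor is released, so other
        // threads can still read or update the in-memory state meanwhile.
        for (String region : offlined) {
            metaWriter.write(region, "OFFLINE");
        }
        return offlined;
    }
}

class ServerOfflineDemo {
    public static void main(String[] args) {
        RegionStatesSketch states = new RegionStatesSketch(
            (region, state) -> System.out.println("persist " + region + " -> " + state));
        states.regionOnline("region-a", "server-1");
        states.regionOnline("region-b", "server-2");
        states.serverOffline("server-1");
        System.out.println("region-a is " + states.getState("region-a"));
    }
}
```

The trade-off in this sketch is that the in-memory state and META can disagree for as long as the write is pending or if it fails, so the caller would have to handle that case.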

One advantage of doing 2) would be that when HBASE-11546 is added to 0.98.6, we can add this
one too. 
I am not sure whether HBASE-11290 will be completed before 0.98.6 is released.

What is your opinion [~jxiang]?




> RegionStates shouldn't be locked while writing to META
> ------------------------------------------------------
>
>                 Key: HBASE-11793
>                 URL: https://issues.apache.org/jira/browse/HBASE-11793
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>            Reporter: Virag Kothari
>            Assignee: Virag Kothari
>
> The following scenario occurs with zk-less assignment:
> Two shutdown handler threads are running, one of which is MetaServerShutdownHandler.
> The ServerShutdownHandler thread, which is recovering a region server other than the one
> hosting META, acquires the lock on RegionStates during the serverOffline() operation. It
> holds the lock while it tries to write to META, which is not assigned.
> {quote}
> Thread 118 (MASTER_SERVER_OPERATIONS-gsbl90723:50510-2):
>   State: TIMED_WAITING
>   Blocked count: 430
>   Waited count: 36755
>   Stack:
>     java.lang.Object.wait(Native Method)
>     org.apache.hadoop.hbase.client.AsyncProcess.waitForNextTaskDone(AsyncProcess.java:853)
>     org.apache.hadoop.hbase.client.AsyncProcess.waitForMaximumCurrentTasks(AsyncProcess.java:879)
>     org.apache.hadoop.hbase.client.AsyncProcess.waitUntilDone(AsyncProcess.java:892)
>     org.apache.hadoop.hbase.client.HTable.backgroundFlushCommits(HTable.java:968)
>     org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:1252)
>     org.apache.hadoop.hbase.client.HTable.put(HTable.java:910)
>     org.apache.hadoop.hbase.master.RegionStateStore.updateRegionState(RegionStateStore.java:223)
>     org.apache.hadoop.hbase.master.RegionStates.updateRegionState(RegionStates.java:804)
>     org.apache.hadoop.hbase.master.RegionStates.updateRegionState(RegionStates.java:329)
>     org.apache.hadoop.hbase.master.RegionStates.updateRegionState(RegionStates.java:298)
>     org.apache.hadoop.hbase.master.RegionStates.regionOffline(RegionStates.java:449)
>     org.apache.hadoop.hbase.master.RegionStates.regionOffline(RegionStates.java:429)
>     org.apache.hadoop.hbase.master.RegionStates.serverOffline(RegionStates.java:498)
>     org.apache.hadoop.hbase.master.AssignmentManager.processServerShutdown(AssignmentManager.java:3404)
>     org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:214)
> {quote}
> Meanwhile, the MetaServerShutdownHandler thread can't assign META because it is blocked on
> the RegionStates lock:
> {quote}
> Thread 126 (MASTER_META_SERVER_OPERATIONS-gsbl90723:50510-0):
> 	  State: BLOCKED
> 	  Blocked count: 52
> 	  Waited count: 100
> 	  Blocked on org.apache.hadoop.hbase.master.RegionStates@7398b4c1
> 	  Blocked by 118 (MASTER_SERVER_OPERATIONS-gsbl90723:50510-2)
> 	  Stack:
> 	    org.apache.hadoop.hbase.master.RegionStates.clearLastAssignment(RegionStates.java:422)
> 	    org.apache.hadoop.hbase.master.RegionStates.logSplit(RegionStates.java:418)
> 	    org.apache.hadoop.hbase.master.handler.MetaServerShutdownHandler.process(MetaServerShutdownHandler.java:79)
> {quote}
> Because the first thread can't write to META, it keeps retrying until it fails (the retry
> time is huge: hbase.client.retries.number*10).
> During that time, MetaServerShutdownHandler is blocked.
> The first thread also calls abort on the Master once it has failed, but to aggravate the
> problem, the Master won't abort because it also wants to lock RegionStates :)
> {quote}
> 	  Blocked on org.apache.hadoop.hbase.master.RegionStates@7398b4c1
> 		  Blocked by 118 (MASTER_SERVER_OPERATIONS-gsbl90723:50510-2)
> 		  Stack:
> 		    org.apache.hadoop.hbase.master.RegionStates.getRegionsInTransition(RegionStates.java:152)
> 		    org.apache.hadoop.hbase.master.AssignmentManager.updateRegionsInTransitionMetrics(AssignmentManager.java:3081)
> 		    org.apache.hadoop.hbase.master.HMaster.doMetrics(HMaster.java:751)
> 		    org.apache.hadoop.hbase.master.HMaster.loop(HMaster.java:738)
> 		    org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:607)
> {quote}
> It seems RegionStates shouldn't be locked while IO is happening.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
