hbase-issues mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-7101) HBase stuck in Region SPLIT
Date Wed, 07 Nov 2012 19:40:13 GMT

     [ https://issues.apache.org/jira/browse/HBASE-7101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars Hofhansl updated HBASE-7101:
---------------------------------

    Fix Version/s:     (was: 0.94.4)
                   0.94.3

Pulling back into 0.94.3, so that we at least have a look before 0.94.3 goes out.
                
> HBase stuck in Region SPLIT 
> ----------------------------
>
>                 Key: HBASE-7101
>                 URL: https://issues.apache.org/jira/browse/HBASE-7101
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: Bing Jiang
>             Fix For: 0.94.3, 0.96.0
>
>
> I found this issue through a znode that has existed for a long time in the unassigned parent, while HMaster keeps reporting warning logs at an increasing rate. The looping log is below.
> WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 1a1c950ad45812d7b4b9b90ebf268468 not found on server sev0040,60020,1350378314041; failed processing
> WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 1a1c950ad45812d7b4b9b90ebf268468 from server sev0040,60020,1350378314041 but it doesn't exist anymore, probably already processed its split
> WARN org.apache.hadoop.hbase.master.AssignmentManager: Region 1a1c950ad45812d7b4b9b90ebf268468 not found on server gs-dpo-sev0040,60020,1350378314041; failed processing
> WARN org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region 1a1c950ad45812d7b4b9b90ebf268468 from server sev0040,60020,1350378314041 but it doesn't exist anymore, probably already processed its split
> We use HBase 0.92.1, and I traced this back to the source code. The HMaster AssignmentManager has already deleted the SPLIT region from its in-memory structure, but the HRegionServer SplitTransaction still finds the unassigned/parent znode in a transient state. Precisely, SplitTransaction executes tickleNodeSplit to update the znode to a new version a little later than AssignmentManager deletes the unassigned/parent znode. The version update triggers the handleRegion operation again; however, AssignmentManager finds that the in-memory RegionState has already been deleted, and the transaction goes into a retry loop.
> In SplitTransaction, transitionZKNode retries tickleNodeSplit after sleeping 100 ms. In my opinion, if the sleep were much longer than 100 ms, all of the AssignmentManager operations would have time to finish completely.
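
The timing argument in the description can be illustrated with a toy model. This is only a sketch under stated assumptions, not HBase code: the class SplitRetrySketch, the method staleTickles, and the 250 ms cleanup window are all hypothetical. It assumes the master drops its in-memory region state at t = 0 and deletes the znode at t = masterCleanupMs, and that the region server tickles the znode once per retry sleep for as long as the znode still exists. It counts the tickles that land inside that window, each of which would produce one round of the WARN lines quoted above.

```java
public class SplitRetrySketch {

    /**
     * Counts tickles that arrive after the master has dropped its in-memory
     * state (t = 0) but before it has deleted the znode (t = masterCleanupMs).
     * Both parameters are hypothetical inputs of this toy model.
     */
    static int staleTickles(long masterCleanupMs, long retrySleepMs) {
        if (retrySleepMs <= 0) {
            throw new IllegalArgumentException("retrySleepMs must be positive");
        }
        int stale = 0;
        // The first tickle happens after one sleep; the loop stops once the
        // znode would already be gone.
        for (long t = retrySleepMs; t < masterCleanupMs; t += retrySleepMs) {
            stale++;
        }
        return stale;
    }

    public static void main(String[] args) {
        long masterCleanupMs = 250; // assumed cleanup window, not a measured value
        System.out.println("sleep 100 ms: " + staleTickles(masterCleanupMs, 100));
        System.out.println("sleep 500 ms: " + staleTickles(masterCleanupMs, 500));
    }
}
```

In this model a sleep longer than the master's cleanup window yields no stale tickles, which is the reporter's point. The model deliberately leaves out whatever keeps the real retry loop going indefinitely.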

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
