hbase-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HBASE-4306) Race between CatalogJanitor and LoadBalancer
Date Wed, 24 Dec 2014 19:50:13 GMT

     [ https://issues.apache.org/jira/browse/HBASE-4306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell resolved HBASE-4306.
-----------------------------------
    Resolution: Invalid

This wasn't clearly diagnosed in the first place, so marking as Invalid.

> Race between CatalogJanitor and LoadBalancer
> --------------------------------------------
>
>                 Key: HBASE-4306
>                 URL: https://issues.apache.org/jira/browse/HBASE-4306
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.4
>            Reporter: Jean-Daniel Cryans
>            Priority: Minor
>
> It is possible for the LoadBalancer to try to assign an offline/split region while it is waiting to be CatalogJanitor'ed. It goes like this:
> {quote}
> 2011-08-25 00:32:07,137 INFO org.apache.hadoop.hbase.master.ServerManager: Received REGION_SPLIT: parent: Daughters; d1, d2 from sv4r22s16,60020,1314211225331
> ...
> (cleaning never happens or whatever)
> ...
> 2011-08-29 13:45:14,561 INFO org.apache.hadoop.hbase.master.HMaster: balance hri=parent, src=sv4r22s16,60020,1314211225331, dest=sv4r19s17,60020,1314218170402
> 2011-08-29 13:45:14,561 DEBUG org.apache.hadoop.hbase.master.AssignmentManager: Starting unassignment of region parent (offlining)
> 2011-08-29 13:45:14,588 INFO org.apache.hadoop.hbase.master.AssignmentManager: Server serverName=sv4r22s16,60020,1314211225331, load=(requests=0, regions=0, usedHeap=0, maxHeap=0) returned org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException: Received close for parent but we are not serving it for parent
> {quote}
> Here it took 4 days of balancing before the balancer finally tried to move the parent (which was never deleted because of HBASE-4238), but the same thing can also happen if the balancer decides to balance the parent just before it's cleaned. The end effect is that the balancer will be disabled _forever_ until that's fixed.
> The culprit here is that the master keeps the region "online" until AssignmentManager.regionOffline is called by the CJ, which means it's still treated like any other region although it's offline.
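
The condition described in the report (a split parent that is still listed as "online" and so gets picked by the balancer) can be illustrated with a small sketch. This is not HBase code and not a fix that was applied to this issue, which was resolved as Invalid. The class `BalancerCandidateFilter`, the type `RegionState`, and the method `movableRegions` are hypothetical names invented for illustration, assuming each region carries offline and split flags that a balancer could check before planning a move.

```java
import java.util.ArrayList;
import java.util.List;

public class BalancerCandidateFilter {

    /** Hypothetical stand-in for a region's catalog state; not an HBase class. */
    static final class RegionState {
        final String name;
        final boolean offline; // region is marked offline in the catalog
        final boolean split;   // region is a split parent awaiting cleanup

        RegionState(String name, boolean offline, boolean split) {
            this.name = name;
            this.offline = offline;
            this.split = split;
        }
    }

    /**
     * Returns the names of regions that are safe to hand to a balancer:
     * anything flagged offline or split is skipped, even if it is still
     * present in the "online" list.
     */
    static List<String> movableRegions(List<RegionState> onlineRegions) {
        List<String> movable = new ArrayList<>();
        for (RegionState region : onlineRegions) {
            if (region.offline || region.split) {
                continue; // split parent waiting for cleanup: do not move it
            }
            movable.add(region.name);
        }
        return movable;
    }

    public static void main(String[] args) {
        List<RegionState> regions = new ArrayList<>();
        regions.add(new RegionState("parent", true, true)); // already split
        regions.add(new RegionState("d1", false, false));
        regions.add(new RegionState("d2", false, false));
        System.out.println(movableRegions(regions)); // prints [d1, d2]
    }
}
```

In this sketch the parent never reaches the balancer, so no close request would be sent to a server that is no longer serving it. Whether such a check belongs in the balancer or in the master's region bookkeeping is a design question the report leaves open.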



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
