hbase-issues mailing list archives

From "Mikhail Antonov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-9738) Delete table and loadbalancer interference
Date Thu, 26 Mar 2015 23:57:56 GMT

    [ https://issues.apache.org/jira/browse/HBASE-9738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382993#comment-14382993 ]

Mikhail Antonov commented on HBASE-9738:

Interesting. Looking at HMaster#balance(), we don't check whether the table is in the
{disabled, disabling} state at all when computing plans. In AssignmentManager#balance we
first check the table state (without grabbing any lock) and skip plans for tables known to
be disabled; only then do we grab the region lock (locker.acquireLock(encodedName);). It
doesn't seem that we take a TableLock anywhere on this path.
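
To make the gap concrete, here is a minimal single-threaded sketch of the check-then-act
pattern described above. Every name in it (CheckThenActSketch, planAllowed, executePlan,
and so on) is hypothetical and only models the ordering of steps; it is not HBase code.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of the unlocked state check in the balancer path; not HBase code. */
public class CheckThenActSketch {
    public enum TableState { ENABLED, DISABLED, DELETED }

    private TableState state = TableState.ENABLED;
    private final Set<String> onlineRegions = new HashSet<>();

    /** Balancer step 1: read the table state without holding any table lock. */
    public boolean planAllowed() {
        return state == TableState.ENABLED;
    }

    /** Balancer step 2: move the region and bring it online on the destination. */
    public void executePlan(String encodedName) {
        onlineRegions.add(encodedName);
    }

    /** Delete path: mark the table deleted and drop its regions. */
    public void deleteTable() {
        state = TableState.DELETED;
        onlineRegions.clear();
    }

    /** True if a region is online for a table that no longer exists. */
    public boolean hasOrphanRegion() {
        return state == TableState.DELETED && !onlineRegions.isEmpty();
    }
}
```

If deleteTable() runs between planAllowed() and executePlan(), the model ends up with an
online region for a deleted table, which matches the symptom in the issue description.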

So I guess the desired behavior is that we don't allow users to disable a table while some
of its regions are being balanced?

Disabling a table requires the table write lock, so taking the read lock in the balancer
would seem right (or is that too coarse-grained?).
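
A minimal sketch of that idea, assuming a plain java.util.concurrent ReentrantReadWriteLock
as a stand-in for the table lock. The class and method names are hypothetical, and
tryDisable() fails fast only to keep the example deterministic; a real disable would
presumably wait for the write lock instead.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for a per-table lock; this is not the HBase table lock API. */
public class TableLockSketch {
    private final ReentrantReadWriteLock tableLock = new ReentrantReadWriteLock();
    private volatile boolean disabled = false;

    /** Balancer path: hold the shared lock across both the state check and the move. */
    public boolean balanceRegion(Runnable movePlan) {
        tableLock.readLock().lock();
        try {
            if (disabled) {
                return false; // skip plans for disabled tables
            }
            movePlan.run();
            return true;
        } finally {
            tableLock.readLock().unlock();
        }
    }

    /** Disable path: needs the exclusive lock, so it cannot overlap a running move. */
    public boolean tryDisable() {
        if (!tableLock.writeLock().tryLock()) {
            return false; // a balancer move currently holds the read lock
        }
        try {
            disabled = true;
            return true;
        } finally {
            tableLock.writeLock().unlock();
        }
    }
}
```

Several balancer moves can share the read lock concurrently, so the cost falls mainly on
disable, which has to wait for in-flight moves on that table to finish.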

Alternatively, we could probably add a new table state ("BALANCING"), but that seems like overkill.

> Delete table and loadbalancer interference
> ------------------------------------------
>                 Key: HBASE-9738
>                 URL: https://issues.apache.org/jira/browse/HBASE-9738
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Devaraj Das
>            Priority: Critical
>             Fix For: 2.0.0, 1.1.0
> I have noticed that when the balancer is computing a plan for region moves, and a delete
> table is issued, there is some interference.
> 1. At time t1, user deleted the table.
> 2. This led to the master updating the meta table to remove the line for the regioninfo
> for a region f2a9e2e9d70894c03f54ee5902bebee6.
> {noformat}
> 2013-10-04 08:42:52,495 INFO  [MASTER_TABLE_OPERATIONS-hor15n05:60000-0] catalog.MetaEditor: Deleted [{ENCODED => f2a9e2e9d70894c03f54ee5902bebee6, NAME => 'usertable,,1380876170581.f2a9e2e9d70894c03f54ee5902bebee6.', STARTKEY => '', ENDKEY => ''}]
> {noformat}
> 3. However around the same time, the balancer kicked in, and reassigned the region and
> made it online somewhere. It didn't check the fact (nor anyone else did) that the table was
> indeed deleted.
> {noformat}
> 2013-10-04 08:42:53,215 INFO  [hor15n05.gq1.ygridcore.net,60000,1380869262259-BalancerChore] master.HMaster: balance hri=usertable,,1380876170581.f2a9e2e9d70894c03f54ee5902bebee6., src=hor15n09.gq1.ygridcore.net,60020,1380869263722,
> {noformat}
> .....
> {noformat}
> 2013-10-04 08:42:53,592 INFO  [AM.ZK.Worker-pool2-t829] master.RegionStates: Onlined f2a9e2e9d70894c03f54ee5902bebee6 on hor15n11.gq1.ygridcore.net,60020,1380869263682
> {noformat}
> 4. Henceforth, all the drop tables started giving warnings like
> {noformat}
> 2013-10-04 08:45:17,587 INFO  [RpcServer.handler=8,port=60000] master.HMaster: Client=hrt_qa// delete usertable
> 2013-10-04 08:45:17,631 DEBUG [RpcServer.handler=8,port=60000] lock.ZKInterProcessLockBase: Acquired a lock for /hbase/table-lock/usertable/write-master:600000000000000
> 2013-10-04 08:45:17,637 WARN  [RpcServer.handler=8,port=60000] catalog.MetaReader: No serialized HRegionInfo in keyvalues={usertable,,1380876170581.f2a9e2e9d70894c03f54ee5902bebee6./info:seqnumDuringOpen/1380876173509/Put/vlen=8/mvcc=0,
> {noformat}
> 5. The create of the same table also fails since there is still state (reincarnated,
> maybe) about the table in the master.

This message was sent by Atlassian JIRA
