hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-10136) the table-lock of TableEventHandler is released too early because reOpenAllRegions() is asynchronous
Date Tue, 17 Dec 2013 21:06:07 GMT

     [ https://issues.apache.org/jira/browse/HBASE-10136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Andrew Purtell updated HBASE-10136:
-----------------------------------

    Affects Version/s:     (was: 0.98.1)
                       0.98.0

> the table-lock of TableEventHandler is released too early because reOpenAllRegions()
is asynchronous
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-10136
>                 URL: https://issues.apache.org/jira/browse/HBASE-10136
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.98.0, 0.96.0, 0.99.0
>            Reporter: Aleksandr Shulman
>            Assignee: Matteo Bertozzi
>              Labels: online_schema_change
>         Attachments: HBASE-10136-trunk.patch, HBASE-10136-v0.patch
>
>
> Expected behavior:
> With the introduction of the table-lock, a user can issue a request for a snapshot of
a table while that table is undergoing an online schema change and expect that snapshot request
to complete correctly. Also, the same is true if a user issues a online schema change request
while a snapshot attempt is ongoing.
> Observed behavior:
> Snapshot attempts time out when there is an ongoing online schema change because the
table lock is not acquired by anyone else and the regions are closed and opened during the
snapshot. 
> TableEventHandler trace
> {code}
> // 1. client.addColumn() call from client...
> // 2. The operation is now on the master
> 2013-12-12 12:09:57,613 DEBUG [MASTER] lock.ZKInterProcessLockBase: Acquired a lock for
/hbase/table-lock/TestTable/write-master:452010000000001
> 2013-12-12 12:09:57,640 INFO  [MASTER] handler.TableEventHandler: Handling table operation
C_M_ADD_FAMILY on table TestTable
> 2013-12-12 12:09:57,685 INFO  [MASTER] master.MasterFileSystem: AddColumn. Table = TestTable
HCD = {NAME => 'x-1386850197327', DATA_BLOCK_ENCODING => 'NONE',$
> 2013-12-12 12:09:57,693 INFO  [MASTER] handler.TableEventHandler: Bucketing regions by
region server...
> ...
> 2013-12-12 12:09:57,771 INFO  [MASTER] handler.TableEventHandler: Completed table operation
C_M_ADD_FAMILY on table TestTable
> 2013-12-12 12:09:57,771 DEBUG [MASTER] master.AssignmentManager: Starting unassign of
TestTable,,1386849056038.854b280$
> 2013-12-12 12:09:57,772 DEBUG [MASTER] lock.ZKInterProcessLockBase: Released /hbase/table-lock/TestTable/write-master:452010000000001
> // 3. The Table*Handler operation is now completed, and the client notified with "I'm
done!"
> // 4. Now the BulkReopen is starting doing the reopen
> 2013-12-12 12:09:57,772 INFO  [MASTER] master.RegionStates: Transitioned {854b280006aec464083778a5cb5f5456
state=OPEN,$
> {code}



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)

Mime
View raw message