hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Stephen Yuan Jiang (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (HBASE-17055) Disabling table not getting enabled after clean cluster restart.
Date Wed, 09 Nov 2016 23:30:58 GMT

     [ https://issues.apache.org/jira/browse/HBASE-17055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Stephen Yuan Jiang reassigned HBASE-17055:
------------------------------------------

    Assignee: Stephen Yuan Jiang

> Disabling table not getting enabled after clean cluster restart.
> ----------------------------------------------------------------
>
>                 Key: HBASE-17055
>                 URL: https://issues.apache.org/jira/browse/HBASE-17055
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment
>    Affects Versions: 1.3.0
>            Reporter: Y. SREENIVASULU REDDY
>            Assignee: Stephen Yuan Jiang
>             Fix For: 1.3.0
>
>
> scenario:
> 1. Disable the table, while disabling the table is in progress.
> 2. Restart whole HBase service.
> 3. Then enable the table.
> the above operation leads to RIT continously.
> pls find the below logs for understanding.
> while disabling the table whole hbase service went down.
> the following is the master logs
> {noformat}
> 2016-11-09 19:32:55,102 INFO  [RpcServer.FifoWFPBQ.default.handler=49,queue=4,port=16000]
master.HMaster: Client=seenu//host-1 disable testTable
> 2016-11-09 19:32:55,257 DEBUG [RpcServer.FifoWFPBQ.default.handler=49,queue=4,port=16000]
procedure2.ProcedureExecutor: Procedure DisableTableProcedure (table=testTable) id=8 owner=seenu
state=RUNNABLE:DISABLE_TABLE_PREPARE added to the store.
> 2016-11-09 19:32:55,264 DEBUG [ProcedureExecutor-5] lock.ZKInterProcessLockBase: Acquired
a lock for /hbase/table-lock/testTable/write-master:160000000000005
> 2016-11-09 19:32:55,285 DEBUG [RpcServer.FifoWFPBQ.default.handler=49,queue=4,port=16000]
master.MasterRpcServices: Checking to see if procedure is done procId=8
> 2016-11-09 19:32:55,386 DEBUG [RpcServer.FifoWFPBQ.default.handler=49,queue=4,port=16000]
master.MasterRpcServices: Checking to see if procedure is done procId=8
> 2016-11-09 19:32:55,513 INFO  [ProcedureExecutor-5] zookeeper.ZKTableStateManager: Moving
table testTable state from DISABLING to DISABLING
> 2016-11-09 19:32:55,587 DEBUG [RpcServer.FifoWFPBQ.default.handler=49,queue=4,port=16000]
master.MasterRpcServices: Checking to see if procedure is done procId=8
> 2016-11-09 19:32:55,628 INFO  [ProcedureExecutor-5] procedure.DisableTableProcedure:
Offlining 1 regions.
> .
> .
> .
> .
> .
> .
> .
> .
> 2016-11-09 19:33:02,871 INFO  [AM.ZK.Worker-pool2-t7] master.RegionStates: Offlined 1890fa9c085dcc2ee0602f4bab069d10
from host-1,16040,1478690163056
> Wed Nov  9 19:33:02 CST 2016 Terminating master
> {noformat}
> here we need to observe
> {color:red} Offlined 1890fa9c085dcc2ee0602f4bab069d10 from host-1,16040,1478690163056
{color}
> then hmaster went down, all regionServers also made down.
> After hmaster and regionserver are restarted
> executed enable Table operation on the table.
> {panel:title=HMaster Logs|borderStyle=dashed|borderColor=#ccc|titleBGColor=#F7D6C1|bgColor=#FFFFCE}
> {noformat}
> 2016-11-09 19:49:57,059 INFO  [RpcServer.FifoWFPBQ.default.handler=49,queue=4,port=16000]
master.HMaster: Client=seenu//host-1 enable testTable
> 2016-11-09 19:49:57,325 DEBUG [RpcServer.FifoWFPBQ.default.handler=49,queue=4,port=16000]
procedure2.ProcedureExecutor: Procedure EnableTableProcedure (table=testTable) id=9 owner=seenu
state=RUNNABLE:ENABLE_TABLE_PREPARE added to the store.
> 2016-11-09 19:49:57,333 DEBUG [ProcedureExecutor-2] lock.ZKInterProcessLockBase: Acquired
a lock for /hbase/table-lock/testTable/write-master:160000000000008
> 2016-11-09 19:49:57,335 DEBUG [hconnection-0x745317ee-shared--pool3-t11] ipc.RpcClientImpl:
Use SIMPLE authentication for service ClientService, sasl=false
> 2016-11-09 19:49:57,335 DEBUG [hconnection-0x745317ee-shared--pool3-t11] ipc.RpcClientImpl:
Connecting to host-1:16040
> 2016-11-09 19:49:57,347 DEBUG [RpcServer.FifoWFPBQ.default.handler=49,queue=4,port=16000]
master.MasterRpcServices: Checking to see if procedure is done procId=9
> 2016-11-09 19:49:57,449 DEBUG [RpcServer.FifoWFPBQ.default.handler=49,queue=4,port=16000]
master.MasterRpcServices: Checking to see if procedure is done procId=9
> 2016-11-09 19:49:57,579 INFO  [ProcedureExecutor-2] procedure.EnableTableProcedure: Attempting
to enable the table testTable
> 2016-11-09 19:49:57,580 INFO  [ProcedureExecutor-2] zookeeper.ZKTableStateManager: Moving
table testTable state from DISABLED to ENABLING
> 2016-11-09 19:49:57,655 DEBUG [RpcServer.FifoWFPBQ.default.handler=49,queue=4,port=16000]
master.MasterRpcServices: Checking to see if procedure is done procId=9
> 2016-11-09 19:49:57,707 INFO  [ProcedureExecutor-2] procedure.EnableTableProcedure: Table
'testTable' has 1 regions, of which 1 are offline.
> 2016-11-09 19:49:57,707 INFO  [ProcedureExecutor-2] procedure.EnableTableProcedure: Bulk
assigning 1 region(s) across 1 server(s), retainAssignment=true
> 2016-11-09 19:49:57,710 DEBUG [ProcedureExecutor-2] master.GeneralBulkAssigner: Timeout-on-RIT=91000
> 2016-11-09 19:49:57,710 INFO  [host-1,16000,1478691456965-GeneralBulkAssigner-0] master.AssignmentManager:
Assigning 1 region(s) to host-1,16040,1478691644081
> {noformat}
> .
> .
> .
> {color:red}
> 2016-11-09 19:49:57,718 DEBUG [AM.-pool1-t1] master.AssignmentManager: Force region state
offline {1890fa9c085dcc2ee0602f4bab069d10 state=OFFLINE, ts=1478692197716, server=host-1,16040,1478690163056}
> 2016-11-09 19:49:57,722 INFO  [AM.-pool1-t1] master.AssignmentManager: Skip assigning
testTable,,1478689618299.1890fa9c085dcc2ee0602f4bab069d10., it is on a dead but not processed
yet server: host-1,16040,1478690163056
> {color}
> {noformat}
> 2016-11-09 19:49:57,957 DEBUG [RpcServer.FifoWFPBQ.default.handler=49,queue=4,port=16000]
master.MasterRpcServices: Checking to see if procedure is done procId=9
> 2016-11-09 19:49:58,459 DEBUG [RpcServer.FifoWFPBQ.default.handler=49,queue=4,port=16000]
master.MasterRpcServices: Checking to see if procedure is done procId=9
> 2016-11-09 19:49:59,462 DEBUG [RpcServer.FifoWFPBQ.default.handler=49,queue=4,port=16000]
master.MasterRpcServices: Checking to see if procedure is done procId=9
> {noformat}
> .
> .
> .
> {noformat}
> 2016-11-09 19:51:38,720 DEBUG [ProcedureExecutor-2] master.GeneralBulkAssigner: bulk
assigning total 1 regions to 1 servers, took 101010ms, with 1 regions still in transition
> 2016-11-09 19:51:38,728 DEBUG [ProcedureExecutor-2] procedure.EnableTableProcedure: Skipping
assign for the region {ENCODED => 1890fa9c085dcc2ee0602f4bab069d10, NAME => 'testTable,,1478689618299.1890fa9c085dcc2ee0602f4bab069d10.',
STARTKEY => '', ENDKEY => ''} during enable table testTable because its already in tranition
or assigned.
> 2016-11-09 19:51:38,728 INFO  [ProcedureExecutor-2] procedure.EnableTableProcedure: Table
'testTable' has 1 regions, of which 0 are offline.
> 2016-11-09 19:51:38,840 INFO  [ProcedureExecutor-2] zookeeper.ZKTableStateManager: Moving
table testTable state from ENABLING to ENABLED
> {noformat}
> {color:red}
> 2016-11-09 19:51:38,846 INFO  [ProcedureExecutor-2] procedure.EnableTableProcedure: Table
'testTable' was successfully enabled. 
> {color}
> {noformat}
> 2016-11-09 19:51:39,081 DEBUG [ProcedureExecutor-2] lock.ZKInterProcessLockBase: Released
/hbase/table-lock/testTable/write-master:160000000000008
> 2016-11-09 19:51:39,081 DEBUG [ProcedureExecutor-2] procedure2.ProcedureExecutor: Procedure
completed in 1mins, 41.898sec: EnableTableProcedure (table=testTable) id=9 owner=seenu state=FINISHED
> 2016-11-09 19:51:45,485 DEBUG [RpcServer.FifoWFPBQ.default.handler=49,queue=4,port=16000]
master.MasterRpcServices: Checking to see if procedure is done procId=9
> 2016-11-09 19:53:08,485 DEBUG [RpcServer.reader=4,bindAddress=host-1,port=16000] ipc.RpcServer:
RpcServer.listener,port=16000: DISCONNECTING client host-1:40313 because read count=-1. Number
of active connections: 3
> 2016-11-09 19:53:45,504 DEBUG [RpcServer.reader=3,bindAddress=host-1,port=16000] ipc.RpcServer:
RpcServer.listener,port=16000: DISCONNECTING client host-1:40281 because read count=-1. Number
of active connections: 2
> {noformat}
> {color:red}
> 2016-11-09 19:55:48,650 DEBUG [host-1,16000,1478691456965_ChoreService_1] master.HMaster:
Not running balancer because 1 region(s) in transition: {1890fa9c085dcc2ee0602f4bab069d10={1890fa9c085dcc2ee0602f4bab069d10
state=OFFLINE, ts=1478692197716, server=host-1,16040,1478690163056}}
> {color}
> {noformat}
> 2016-11-09 19:55:48,650 WARN  [host-1,16000,1478691456965_ChoreService_1] master.CatalogJanitor:
CatalogJanitor disabled! Not running scan.
> 2016-11-09 19:57:09,336 DEBUG [ProcedureExecutorTimeout] procedure2.ProcedureExecutor$CompletedProcedureCleaner:
Evict completed procedure: Procedure=EnableTableProcedure (table=testTable) (id=9, owner=seenu,
state=FINISHED, startTime=7mins, 12sec ago, lastUpdate=7mins, 12sec ago)
> {noformat}
> .
> .
> .
> {color:red}
> 2016-11-09 20:10:48,408 DEBUG [host-1,16000,1478691456965_ChoreService_1] master.HMaster:
Not running balancer because 1 region(s) in transition: {1890fa9c085dcc2ee0602f4bab069d10={1890fa9c085dcc2ee0602f4bab069d10
state=OFFLINE, ts=1478692197716, server=host-1,16040,1478690163056}}
> {color}
> {panel}
> region is continuously is in RIT, because new hmaster is trying to make the region offline
request sending to dead regionserver. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message