hbase-issues mailing list archives

From "Abhishek Kulkarni (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-19287) master hangs forever if RecoverMeta send assign meta region request to target server fail
Date Sat, 31 Mar 2018 19:04:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-19287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16421431#comment-16421431 ]

Abhishek Kulkarni commented on HBASE-19287:
-------------------------------------------

2018-03-31 14:00:18,202 INFO  [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=1.03 MB, freeSize=1.38 GB, max=1.38 GB, blockCount=0, accesses=0, hits=0, hitRatio=0, cachingAccesses=0, cachingHits=0, cachingHitsRatio=0,evictions=3239, evicted=0, evictedPerRun=0.0
2018-03-31 14:00:18,208 INFO  [MobFileCache #0] mob.MobFileCache: MobFileCache Statistics, access: 0, miss: 0, hit: 0, hit ratio: 0%, evicted files: 0
2018-03-31 14:00:20,763 INFO  [regionserver/abhishekk3:16020.logRoller] wal.AbstractFSWAL: Rolled WAL /hbase/WALs/abhishekk3.pne.ven.veritas.com,16020,1522486816915/abhishekk3.pne.ven.veritas.com%2C16020%2C1522486816915.1522515620673 with entries=0, filesize=83 B; new WAL /hbase/WALs/abhishekk3.pne.ven.veritas.com,16020,1522486816915/abhishekk3.pne.ven.veritas.com%2C16020%2C1522486816915.1522519220738
2018-03-31 14:00:20,763 INFO  [regionserver/abhishekk3:16020.logRoller] wal.AbstractFSWAL: Archiving hdfs://abhishekk1.pne.ven.veritas.com:54310/hbase/WALs/abhishekk3.pne.ven.veritas.com,16020,1522486816915/abhishekk3.pne.ven.veritas.com%2C16020%2C1522486816915.1522515620673 to hdfs://abhishekk1.pne.ven.veritas.com:54310/hbase/oldWALs/abhishekk3.pne.ven.veritas.com%2C16020%2C1522486816915.1522515620673
2018-03-31 14:05:18,202 INFO  [LruBlockCacheStatsExecutor] hfile.LruBlockCache: totalSize=1.03 MB, freeSize=1.38 GB, max=1.38 GB, blockCount=0, accesses=0, hits=0, hitRatio=0, cachingAccesses=0, cachingHits=0, cachingHitsRatio=0,evictions=3269, evicted=0, evictedPerRun=0.0

> master hangs forever if RecoverMeta send assign meta region request to target server fail
> -----------------------------------------------------------------------------------------
>
>                 Key: HBASE-19287
>                 URL: https://issues.apache.org/jira/browse/HBASE-19287
>             Project: HBase
>          Issue Type: Bug
>          Components: proc-v2
>    Affects Versions: 2.0.0
>            Reporter: Yi Liang
>            Assignee: Yi Liang
>            Priority: Major
>             Fix For: 2.0.0-beta-1, 2.0.0
>
>         Attachments: HBASE-19287-master-v3.patch, HBASE-19287-master-v3.patch, HBASE-19287-master-v4.patch, hbase-19287-master-v2.patch, master.patch
>
>
> 2017-11-10 19:26:56,019 INFO  [ProcExecWrkr-1] procedure.RecoverMetaProcedure: pid=138, state=RUNNABLE:RECOVER_META_ASSIGN_REGIONS; RecoverMetaProcedure failedMetaServer=null, splitWal=true; Retaining meta assignment to server=hadoop-slave1.hadoop,16020,1510341981454
> 2017-11-10 19:26:56,029 INFO  [ProcExecWrkr-1] procedure2.ProcedureExecutor: Initialized subprocedures=[{pid=139, ppid=138, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, region=1588230740, target=hadoop-slave1.hadoop,16020,1510341981454}]
> 2017-11-10 19:26:56,067 INFO  [ProcExecWrkr-2] procedure.MasterProcedureScheduler: pid=139, ppid=138, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, region=1588230740, target=hadoop-slave1.hadoop,16020,1510341981454 hbase:meta hbase:meta,,1.1588230740
> 2017-11-10 19:26:56,071 INFO  [ProcExecWrkr-2] assignment.AssignProcedure: Start pid=139, ppid=138, state=RUNNABLE:REGION_TRANSITION_QUEUE; AssignProcedure table=hbase:meta, region=1588230740, target=hadoop-slave1.hadoop,16020,1510341981454; rit=OFFLINE, location=hadoop-slave1.hadoop,16020,1510341981454; forceNewPlan=false, retain=false
> 2017-11-10 19:26:56,224 INFO  [ProcExecWrkr-4] zookeeper.MetaTableLocator: Setting hbase:meta (replicaId=0) location in ZooKeeper as hadoop-slave2.hadoop,16020,1510341988652
> 2017-11-10 19:26:56,230 INFO  [ProcExecWrkr-4] assignment.RegionTransitionProcedure: Dispatch pid=139, ppid=138, state=RUNNABLE:REGION_TRANSITION_DISPATCH; AssignProcedure table=hbase:meta, region=1588230740, target=hadoop-slave1.hadoop,16020,1510341981454; rit=OPENING, location=hadoop-slave2.hadoop,16020,1510341988652
> 2017-11-10 19:26:56,382 INFO  [ProcedureDispatcherTimeoutThread] procedure.RSProcedureDispatcher: Using procedure batch rpc execution for serverName=hadoop-slave2.hadoop,16020,1510341988652 version=2097152
> 2017-11-10 19:26:57,542 INFO  [main-EventThread] zookeeper.RegionServerTracker: RegionServer ephemeral node deleted, processing expiration [hadoop-slave2.hadoop,16020,1510341988652]
> 2017-11-10 19:26:57,543 INFO  [main-EventThread] master.ServerManager: Master doesn't enable ServerShutdownHandler during initialization, delay expiring server hadoop-slave2.hadoop,16020,1510341988652
> 2017-11-10 19:26:58,875 INFO  [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] master.ServerManager: Registering server=hadoop-slave1.hadoop,16020,1510342016106
> 2017-11-10 19:27:05,832 INFO  [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] master.ServerManager: Registering server=hadoop-slave2.hadoop,16020,1510342023184
> 2017-11-10 19:27:05,832 INFO  [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] master.ServerManager: Triggering server recovery; existingServer hadoop-slave2.hadoop,16020,1510341988652 looks stale, new server:hadoop-slave2.hadoop,16020,1510342023184
> 2017-11-10 19:27:05,832 INFO  [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] master.ServerManager: Master doesn't enable ServerShutdownHandler during initialization, delay expiring server hadoop-slave2.hadoop,16020,1510341988652
> 2017-11-10 19:27:49,815 INFO  [RpcServer.default.FPBQ.Fifo.handler=29,queue=2,port=16000] client.RpcRetryingCallerImpl: tarted=38594 ms ago, cancelled=false, msg=org.apache.hadoop.hbase.NotServingRegionException: hbase:meta,,1 is not online on hadoop-slave2.hadoop,16020,1510342023184
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3290)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1370)
>         at org.apache.hadoop.hbase.regionserver.RSRpcServices.get(RSRpcServices.java:2401)
>         at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:41544)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:406)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:278)
>         at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:258)
>  row 'hbase:namespace' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hadoop-slave2.hadoop,16020,1510341988652, seqNum=0



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
