hbase-user mailing list archives

From: おぎばやしひろのり <ogibaya...@gmail.com>
Subject: Re: HBase Cluster not responding after shutting down one of slave nodes
Date: Thu, 18 Feb 2016 10:04:10 GMT
Samir,

Thank you for your advice.
The actual operation I used to shut down host7516 was just clicking the
"stop" button on our VM console. I don't know the internal details, but
I can see some process-termination messages in /var/log/messages, so it
looks like a kind of graceful shutdown.
About reproducing: yes, I can, but not every time; maybe once in three
or four attempts. I am not sure what makes the difference.

I also checked the regionserver logs. There were warnings related to
DataNode access; this is because there was also a DataNode on the
shut-down host7516.
Regarding log split, I couldn't find any errors. Both hosts finished
the task within a few seconds.

--- host7517
2016-02-17 15:39:59,002 INFO  [LruBlockCacheStatsExecutor]
hfile.LruBlockCache: totalSize=420.92 KB, freeSize=399.19 MB,
max=399.60 MB, blockCount=0, accesses=24227, hits=0, hitRatio=0,
cachingAccesses=0, cachingHits=0, cachingHitsRatio=0
,evictions=7289, evicted=0, evictedPerRun=0.0
2016-02-17 15:39:59,869 INFO  [HOST7517:16020Replication Statistics
#0] regionserver.Replication: Normal source for cluster 1: Total
replicated edits: 0, currently replicating from:
hdfs://hdpts/apps/hbase/data/WALs/host7517.mydomain,16020,1455618296942/host7517.mydomain%2C16020%2C1455618296942.default.1455690301451
at position: 96609590

2016-02-17 15:40:50,953 WARN  [ResponseProcessor for block
BP-1843495860-192.168.189.219-1453778090403:blk_1073753847_13125]
hdfs.DFSClient: DFSOutputStream ResponseProcessor exception  for block
BP-1843495860-192.168.189.219-1453778090403:blk_1073753847_13125
java.io.IOException: Bad response ERROR for block
BP-1843495860-192.168.189.219-1453778090403:blk_1073753847_13125 from
datanode DatanodeInfoWithStorage[192.168.184.73:50010,DS-9787b201-fc64-450e-a20f-dcc79fb94b6f,DISK]
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:786)
2016-02-17 15:40:50,954 WARN  [DataStreamer for file
/apps/hbase/data/WALs/host7517.mydomain,16020,1455618296942/host7517.mydomain%2C16020%2C1455618296942.default.1455690301451
block BP-1843495860-192.168.189.219-1453778090403:blk_1073753847_13125]
hdfs.DFSClient: Error Recovery for block
BP-1843495860-192.168.189.219-1453778090403:blk_1073753847_13125 in
pipeline DatanodeInfoWithStorage[192.168.185.249:50010,DS-52f00167-e863-48d8-8f98-3b613f774b0c,DISK],
DatanodeInfoWithStorage[192.168.184.85:50010,DS-f8c4c3b2-a7cc-40e0-9ebb-2b9c352c0d28,DISK],
DatanodeInfoWithStorage[192.168.184.73:50010,DS-9787b201-fc64-450e-a20f-dcc79fb94b6f,DISK]:
bad datanode DatanodeInfoWithStorage[192.168.184.73:50010,DS-9787b201-fc64-450e-a20f-dcc79fb94b6f,DISK]
...
2016-02-17 15:56:55,633 INFO  [SplitLogWorker-HOST7517:16020]
coordination.ZkSplitLogWorkerCoordination: worker
host7517.mydomain,16020,1455618296942 acquired task
/hbase-unsecure/splitWAL/WALs%2Fhost7516.mydomain%2C16020%2C1455618299902-splitting%2Fhost7516.mydomain%252C16020%252C1455618299902.default.1455690304230
2016-02-17 15:56:55,677 INFO  [RS_LOG_REPLAY_OPS-HOST7517:16020-0]
wal.WALSplitter: Splitting wal:
hdfs://hdpts/apps/hbase/data/WALs/host7516.mydomain,16020,1455618299902-splitting/host7516.mydomain%2C16020%2C1455618299902.default.1455690304230,
length=83
2016-02-17 15:56:55,677 INFO  [RS_LOG_REPLAY_OPS-HOST7517:16020-0]
wal.WALSplitter: DistributedLogReplay = false
2016-02-17 15:56:55,685 INFO  [RS_LOG_REPLAY_OPS-HOST7517:16020-0]
util.FSHDFSUtils: Recovering lease on dfs file
hdfs://hdpts/apps/hbase/data/WALs/host7516.mydomain,16020,1455618299902-splitting/host7516.mydomain%2C16020%2C1455618299902.default.1455690304230
2016-02-17 15:56:55,696 INFO  [RS_LOG_REPLAY_OPS-HOST7517:16020-0]
util.FSHDFSUtils: recoverLease=false, attempt=0 on
file=hdfs://hdpts/apps/hbase/data/WALs/host7516.mydomain,16020,1455618299902-splitting/host7516.mydomain%2C16020%2C1455618299902.default.1455690304230
after 11ms
2016-02-17 15:56:59,698 INFO  [RS_LOG_REPLAY_OPS-HOST7517:16020-0]
util.FSHDFSUtils: recoverLease=true, attempt=1 on
file=hdfs://hdpts/apps/hbase/data/WALs/host7516.mydomain,16020,1455618299902-splitting/host7516.mydomain%2C16020
2016-02-17 15:56:59,771 INFO
[RS_LOG_REPLAY_OPS-HOST7517:16020-0-Writer-1] wal.WALSplitter:
Creating writer
path=hdfs://hdpts/apps/hbase/data/data/default/usertable/64199c31957c01b5bd9ee50b02e1f7fd/recovered.edits/0000000000000542072.temp
region=64199c31957c01b5bd9ee50b02e1f7fd
...(similar lines)
2016-02-17 15:57:01,001 INFO  [RS_LOG_REPLAY_OPS-HOST7517:16020-0]
wal.WALSplitter: Split writers finished
2016-02-17 15:57:01,034 INFO  [split-log-closeStream-3]
wal.WALSplitter: Rename
hdfs://hdpts/apps/hbase/data/data/default/usertable/06ed7277b7b9539a3ba597e0041acb12/recovered.edits/0000000000000054100.temp
to hdfs://hdpts/apps/hbase/data/data/default/usertable/06ed7277b7b9539a3ba597e0041acb12/recovered.edits/0000000000000055498
...(similar lines)
2016-02-17 15:57:01,332 INFO  [RS_LOG_REPLAY_OPS-HOST7517:16020-0]
wal.WALSplitter: Processed 73326 edits across 53 regions; edits
skipped=0; log file=hdfs://hdpts/apps/hbase/data/WALs/host7516.mydomain,16020,1455618299902-splitting/host7516.mydomain%2C16020%2C1455618299902.default.1455690304230,
length=83, corrupted=false, progress failed=false
2016-02-17 15:57:01,336 INFO  [RS_LOG_REPLAY_OPS-HOST7517:16020-0]
coordination.ZkSplitLogWorkerCoordination: successfully transitioned
task /hbase-unsecure/splitWAL/WALs%2Fhost7516.mydomain%2C16020%2C1455618299902-splitting%2Fhost7516.mydomain%252C16020%252C1455618299902.default.1455690304230
to final state DONE host7517.mydomain,16020,1455618296942
2016-02-17 15:57:01,336 INFO  [RS_LOG_REPLAY_OPS-HOST7517:16020-0]
handler.WALSplitterHandler: worker
host7517.mydomain,16020,1455618296942 done with task
org.apache.hadoop.hbase.coordination.ZkSplitLogWorkerCoordination$ZkSplitTaskDetails@7b0d8dc7
in 5691ms

--- host7518

2016-02-17 15:40:55,957 WARN  [ResponseProcessor for block
BP-1843495860-192.168.189.219-1453778090403:blk_1073753896_13177]
hdfs.DFSClient: DFSOutputStream ResponseProcessor exception  for block
BP-1843495860-192.168.189.219-1453778090403:blk_1073753896_13177
java.io.IOException: Bad response ERROR for block
BP-1843495860-192.168.189.219-1453778090403:blk_1073753896_13177 from
datanode DatanodeInfoWithStorage[192.168.184.73:50010,DS-9787b201-fc64-450e-a20f-dcc79fb94b6f,DISK]
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:786)
2016-02-17 15:40:55,959 WARN  [DataStreamer for file
/apps/hbase/data/WALs/host7518.mydomain,16020,1455618298475/host7518.mydomain%2C16020%2C1455618298475.default.1455691227152
block BP-1843495860-192.168.189.219-1453778090403:blk_1073753896_13177]
hdfs.DFSClient: Error Recovery for block
BP-1843495860-192.168.189.219-1453778090403:blk_1073753896_13177 in
pipeline DatanodeInfoWithStorage[192.168.184.85:50010,DS-f8c4c3b2-a7cc-40e0-9ebb-2b9c352c0d28,DISK],
DatanodeInfoWithStorage[192.168.184.73:50010,DS-9787b201-fc64-450e-a20f-dcc79fb94b6f,DISK],
DatanodeInfoWithStorage[192.168.185.249:50010,DS-52f00167-e863-48d8-8f98-3b613f774b0c,DISK]:
bad datanode DatanodeInfoWithStorage[192.168.184.73:50010,DS-9787b201-fc64-450e-a20f-dcc79fb94b6f,DISK]
2016-02-17 15:40:56,043 INFO  [sync.4] wal.FSHLog: Slow sync cost:
3774 ms, current pipeline:
[DatanodeInfoWithStorage[192.168.184.85:50010,DS-f8c4c3b2-a7cc-40e0-9ebb-2b9c352c0d28,DISK],
DatanodeInfoWithStorage[192.168.185.249:50010,DS-52f00167-e863-48d8-8f98-3b613f774b0c,DISK],
DatanodeInfoWithStorage[192.168.189.208:50010,DS-882dca14-8651-4ffe-a102-8beebc774575,DISK]]
2016-02-17 15:40:56,044 INFO  [sync.3] wal.FSHLog: Slow sync cost:
20103 ms, current pipeline:
[DatanodeInfoWithStorage[192.168.184.85:50010,DS-f8c4c3b2-a7cc-40e0-9ebb-2b9c352c0d28,DISK],
DatanodeInfoWithStorage[192.168.185.249:50010,DS-52f00167-e863-48d8-8f98-3b613f774b0c,DISK],
DatanodeInfoWithStorage[192.168.189.208:50010,DS-882dca14-8651-4ffe-a102-8beebc774575,DISK]]
...
2016-02-17 15:41:16,016 INFO  [SplitLogWorker-HOST7518:16020]
coordination.ZkSplitLogWorkerCoordination: worker
host7518.mydomain,16020,1455618298475 acquired task
/hbase-unsecure/splitWAL/WALs%2Fhost7516.mydomain%2C16020%2C1455618299902-splitting%2Fhost7516.mydomain%252C16020%252C1455618299902..meta.1455690873222.meta
2016-02-17 15:41:16,194 INFO  [RS_LOG_REPLAY_OPS-HOST7518:16020-0]
wal.WALSplitter: Splitting wal:
hdfs://hdpts/apps/hbase/data/WALs/host7516.mydomain,16020,1455618299902-splitting/host7516.mydomain%2C16020%2C1455618299902..meta.1455690873222.meta,
length=83
2016-02-17 15:41:16,194 INFO  [RS_LOG_REPLAY_OPS-HOST7518:16020-0]
wal.WALSplitter: DistributedLogReplay = false
2016-02-17 15:41:16,252 INFO  [RS_LOG_REPLAY_OPS-HOST7518:16020-0]
util.FSHDFSUtils: Recovering lease on dfs file
hdfs://hdpts/apps/hbase/data/WALs/host7516.mydomain,16020,1455618299902-splitting/host7516.mydomain%2C16020%2C1455618299902..meta.1455690873222.meta
2016-02-17 15:41:16,396 INFO  [RS_LOG_REPLAY_OPS-HOST7518:16020-0]
util.FSHDFSUtils: recoverLease=false, attempt=0 on
file=hdfs://hdpts/apps/hbase/data/WALs/host7516.mydomain,16020,1455618299902-splitting/host7516.mydomain%2C16020%2C1455618299902..meta.1455690873222.meta
after 144ms
2016-02-17 15:41:20,398 INFO  [RS_LOG_REPLAY_OPS-HOST7518:16020-0]
util.FSHDFSUtils: recoverLease=true, attempt=1 on
file=hdfs://hdpts/apps/hbase/data/WALs/host7516.mydomain,16020,1455618299902-splitting/host7516.mydomain%2C16020%2C1455618299902..meta.1455690873222.meta
after 4146ms
2016-02-17 15:41:20,456 INFO
[RS_LOG_REPLAY_OPS-HOST7518:16020-0-Writer-2] wal.WALSplitter:
Creating writer
path=hdfs://hdpts/apps/hbase/data/data/hbase/meta/1588230740/recovered.edits/0000000000000009659.temp
region=1588230740
2016-02-17 15:41:20,457 INFO  [RS_LOG_REPLAY_OPS-HOST7518:16020-0]
wal.WALSplitter: Split writers finished
2016-02-17 15:41:20,479 INFO  [split-log-closeStream-1]
wal.WALSplitter: Rename
hdfs://hdpts/apps/hbase/data/data/hbase/meta/1588230740/recovered.edits/0000000000000009659.temp
to hdfs://hdpts/apps/hbase/data/data/hbase/meta/1588230740/recovered.edits/0000000000000009661
2016-02-17 15:41:20,479 INFO  [RS_LOG_REPLAY_OPS-HOST7518:16020-0]
wal.WALSplitter: Processed 3 edits across 1 regions; edits
skipped=148; log
file=hdfs://hdpts/apps/hbase/data/WALs/host7516.mydomain,16020,1455618299902-splitting/host7516.mydomain%2C16020%2C1455618299902..meta.1455690873222.meta,
length=83, corrupted=false, progress failed=false
2016-02-17 15:41:20,482 INFO  [RS_LOG_REPLAY_OPS-HOST7518:16020-0]
coordination.ZkSplitLogWorkerCoordination: successfully transitioned
task /hbase-unsecure/splitWAL/WALs%2Fhost7516.mydomain%2C16020%2C1455618299902-splitting%2Fhost7516.mydomain%252C16020%252C1455618299902..meta.1455690873222.meta
to final state DONE host7518.mydomain,16020,1455618298475
2016-02-17 15:41:20,482 INFO  [RS_LOG_REPLAY_OPS-HOST7518:16020-0]
handler.WALSplitterHandler: worker
host7518.mydomain,16020,1455618298475 done with task
org.apache.hadoop.hbase.coordination.ZkSplitLogWorkerCoordination$ZkSplitTaskDetails@42ed43fe
in 4359ms

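(For context on the "recoverLease=false, attempt=0" / "recoverLease=true,
attempt=1" lines in the logs above: before splitting a dead server's WAL,
the split worker asks the NameNode to recover the HDFS lease on the file
and polls until the file is closed. The snippet below is only a rough,
simplified sketch of that loop, not HBase's actual FSHDFSUtils code; the
timeout and pause values are illustrative.)

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public final class LeaseRecoverySketch {
  // Poll the NameNode until the lease on the old WAL is released.
  public static boolean recoverLease(DistributedFileSystem dfs, Path wal)
      throws Exception {
    long deadline = System.currentTimeMillis() + 900_000L; // illustrative overall timeout
    for (int attempt = 0; System.currentTimeMillis() < deadline; attempt++) {
      boolean recovered = dfs.recoverLease(wal); // ask NN to close the previous writer's lease
      System.out.println("recoverLease=" + recovered + ", attempt=" + attempt + " on " + wal);
      if (recovered) {
        return true; // file is closed; the WAL can now be read and split
      }
      Thread.sleep(4000L); // comparable to the ~4 s gap between attempts seen in the logs
    }
    return false;
  }
}
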
Regards,
Hironori Ogibayashi

2016-02-18 5:12 GMT+09:00 Samir Ahmic <ahmic.samir@gmail.com>:
> Hi Hironori,
>
> Based on your logs, you shut down the regionserver that was hosting the
> system "hbase:meta" table. This is a special case, since user regions
> cannot be reassigned to other servers until the "hbase:meta" region is
> reassigned. This partially explains the delay in recovery time you
> experienced. There is also the WAL splitting process, which must
> complete before regions are reassigned. Still, 15 minutes is too long a
> period for recovery.
> How did you shut down host7516? You should also check the logs from
> host7517 and host7518 for errors around 15:41, because they acquired
> the WAL splitting tasks.
> Also, can you reproduce this every time in your tests?
>
> Regards
> Samir
>
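(For reference on Samir's point about which server carries hbase:meta: one
way to check this before such a test is to ask the client API for the meta
region's location. The following is only a minimal sketch against the
HBase 1.1 client API; the class name FindMetaHost and the output format
are illustrative, not part of the thread.)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HRegionLocation;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.RegionLocator;

public class FindMetaHost {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         RegionLocator locator = conn.getRegionLocator(TableName.META_TABLE_NAME)) {
      // hbase:meta is a single region in HBase 1.1, so one location is enough.
      for (HRegionLocation loc : locator.getAllRegionLocations()) {
        System.out.println("hbase:meta is on " + loc.getServerName());
      }
    }
  }
}
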
> On Wed, Feb 17, 2016 at 8:49 AM, おぎばやしひろのり <ogibayashi@gmail.com> wrote:
>
>> Hello,
>>
>> I am testing how HBase behaves in case of failure and ran into
>> unexpected behavior. If someone could tell me how to shorten the
>> recovery time, that would be appreciated.
>>
>> What I tried was: first, a client starts writing to HBase, and then I
>> shut down one of the slave node machines. I expected the write
>> operations to stop for a while (1 minute or so) and, after HBase
>> automatically moves the failed regions to another slave node, the
>> client to be able to write again.
>> But actually, it took about 15 minutes for the cluster to recover.
>> What was going on? I have already applied the configurations presented
>> at https://hbase.apache.org/book.html#mttr
>>
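(As a rough illustration of the kind of client-side write loop used in
such a failover test: the sketch below keeps writing to the usertable
seen in the logs and reports how long each put blocks, so the
client-visible outage window can be measured. It assumes the HBase 1.1
client API; the column family, row-key format, and timeout values are
illustrative, not the ones used in the original test.)

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class WriteDuringFailover {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Client-side bounds on how long a single put may block while regions
    // are being recovered (values here are illustrative, not from the thread).
    conf.set("hbase.client.operation.timeout", "30000");
    conf.set("hbase.client.retries.number", "10");
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Table table = conn.getTable(TableName.valueOf("usertable"))) {
      byte[] family = Bytes.toBytes("family"); // illustrative column family
      for (long i = 0; ; i++) {
        Put put = new Put(Bytes.toBytes("row-" + i));
        put.addColumn(family, Bytes.toBytes("q"), Bytes.toBytes(i));
        long start = System.currentTimeMillis();
        try {
          table.put(put);
        } catch (IOException e) {
          System.out.println("put " + i + " failed: " + e);
        }
        long elapsed = System.currentTimeMillis() - start;
        if (elapsed > 1000) {
          // A slow or failing put here marks the client-visible outage window.
          System.out.println("put " + i + " took " + elapsed + " ms");
        }
      }
    }
  }
}
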
>> My environment is:
>> - 3 master (NN, HMaster, ZK, etc.) nodes and 4 slave nodes (DN, RS),
>> all of them virtual machines.
>> - HDP 2.4.3 (HBase 1.1.1)
>> - CentOS 6.7
>>
>> Here are the HMaster and NameNode logs. The HMaster detected the
>> failure of a regionserver at 15:41:16 and tried to start recovery, but
>> it kept throwing RetriesExhaustedException again and again, and
>> finally finished recovering at 15:57:11. I noticed there are some
>> messages about lease recovery in the NameNode log, but I am not sure
>> whether this is related.
>>
>> ---- HMaster log ---
>> 2016-02-17 15:41:16,003 INFO  [main-EventThread]
>> zookeeper.RegionServerTracker: RegionServer ephemeral node deleted,
>> processing expiration [host7516.mydomain,16020,1455618299902]
>> 2016-02-17 15:41:16,005 INFO
>> [MASTER_META_SERVER_OPERATIONS-HOST7512:16000-1]
>> handler.MetaServerShutdownHandler: Splitting hbase:meta logs for
>> host7516.mydomain,16020,1455618299902
>> 2016-02-17 15:41:16,010 INFO
>> [MASTER_META_SERVER_OPERATIONS-HOST7512:16000-1]
>> master.SplitLogManager: dead splitlog workers
>> [host7516.mydomain,16020,1455618299902]
>> 2016-02-17 15:41:16,011 INFO
>> [MASTER_META_SERVER_OPERATIONS-HOST7512:16000-1]
>> master.SplitLogManager: started splitting 1 logs in
>>
>> [hdfs://hdpts/apps/hbase/data/WALs/host7516.mydomain,16020,1455618299902-splitting]
>> for [host7516.
>> mydomain,16020,1455618299902]
>> 2016-02-17 15:41:16,016 INFO  [main-EventThread]
>> coordination.SplitLogManagerCoordination: task
>>
>> /hbase-unsecure/splitWAL/WALs%2Fhost7516.mydomain%2C16020%2C1455618299902-splitting%2Fhost7516.mydomain%252C16020%252C1455618299902..met
>> a.1455690873222.meta acquired by host7518.mydomain,16020,1455618298475
>> 2016-02-17 15:41:16,266 INFO
>> [host7512.mydomain,16000,1455618257882_splitLogManager__ChoreService_1]
>> master.SplitLogManager: total tasks = 1 unassigned = 0
>> tasks={/hbase-unsecure/splitWAL/WALs%2Fhost7516.mydomain%2C16020%2C14556182
>>
>> 99902-splitting%2Fhost7516.mydomain%252C16020%252C1455618299902..meta.1455690873222.meta=last_update
>> = 1455691276196 last_version = 2 cur_worker_name =
>> host7518.mydomain,16020,1455618298475 status = in_progress incarnation
>> = 0 resub
>> mits = 0 batch = installed = 1 done = 0 error = 0}
>> 2016-02-17 15:41:20,481 INFO  [main-EventThread]
>> coordination.SplitLogManagerCoordination: task
>>
>> /hbase-unsecure/splitWAL/WALs%2Fhost7516.mydomain%2C16020%2C1455618299902-splitting%2Fhost7516.mydomain%252C16020%252C1455618299902..met
>> a.1455690873222.meta entered state: DONE
>> host7518.mydomain,16020,1455618298475
>> 2016-02-17 15:41:20,487 INFO  [main-EventThread] wal.WALSplitter:
>> Archived processed log
>>
>> hdfs://hdpts/apps/hbase/data/WALs/host7516.mydomain,16020,1455618299902-splitting/host7516.mydomain%2C16020%2C1455618299902..meta.1455690873222.meta
>> to
>> hdfs://hdpts/apps/hbase/data/oldWALs/host7516.mydomain%2C16020%2C1455618299902..meta.1455690873222.meta
>> 2016-02-17 15:41:20,487 INFO  [main-EventThread]
>> coordination.SplitLogManagerCoordination: Done splitting
>>
>> /hbase-unsecure/splitWAL/WALs%2Fhost7516.mydomain%2C16020%2C1455618299902-splitting%2Fhost7516.mydomain%252C16020%252C1455618299902..meta.1455690873222.meta
>> 2016-02-17 15:41:20,492 WARN
>> [MASTER_META_SERVER_OPERATIONS-HOST7512:16000-1]
>> master.SplitLogManager: returning success without actually splitting
>> and deleting all the log files in path
>>
>> hdfs://hdpts/apps/hbase/data/WALs/host7516.mydomain,16020,1455618299902-splitting
>> 2016-02-17 15:41:20,492 INFO
>> [MASTER_META_SERVER_OPERATIONS-HOST7512:16000-1]
>> master.SplitLogManager: finished splitting (more than or equal to) 83
>> bytes in 1 log files in
>>
>> [hdfs://hdpts/apps/hbase/data/WALs/host7516.mydomain,16020,1455618299902-splitting]
>> in 4481ms
>> 2016-02-17 15:41:20,492 INFO
>> [MASTER_META_SERVER_OPERATIONS-HOST7512:16000-1]
>> handler.MetaServerShutdownHandler: Server
>> host7516.mydomain,16020,1455618299902 was carrying META. Trying to
>> assign.
>> 2016-02-17 15:41:20,492 INFO
>> [MASTER_META_SERVER_OPERATIONS-HOST7512:16000-1] master.RegionStates:
>> Transition {1588230740 state=OPEN, ts=1455690873330,
>> server=host7516.mydomain,16020,1455618299902} to {1588230740
>> state=OFFLINE, ts=1455691280492,
>> server=host7516.mydomain,16020,1455618299902}
>> 2016-02-17 15:41:20,493 INFO
>> [MASTER_META_SERVER_OPERATIONS-HOST7512:16000-1] master.RegionStates:
>> Offlined 1588230740 from host7516.mydomain,16020,1455618299902
>> 2016-02-17 15:41:27,495 INFO
>> [MASTER_META_SERVER_OPERATIONS-HOST7512:16000-1]
>> zookeeper.MetaTableLocator: Failed verification of hbase:meta,,1 at
>> address=host7516.mydomain,16020,1455618299902, exception=Connection
>> timed out
>> 2016-02-17 15:41:27,495 INFO
>> [MASTER_META_SERVER_OPERATIONS-HOST7512:16000-1]
>> zookeeper.MetaTableLocator: Deleting hbase:meta region location in
>> ZooKeeper
>> 2016-02-17 15:41:27,500 INFO
>> [MASTER_META_SERVER_OPERATIONS-HOST7512:16000-1]
>> master.AssignmentManager: Setting node as OFFLINED in ZooKeeper for
>> region {ENCODED => 1588230740, NAME => 'hbase:meta,,1', STARTKEY =>
>> '', ENDKEY => ''}
>> 2016-02-17 15:41:27,502 INFO
>> [MASTER_META_SERVER_OPERATIONS-HOST7512:16000-1]
>> master.AssignmentManager: Assigning hbase:meta,,1.1588230740 to
>> host7517.mydomain,16020,1455618296942
>> 2016-02-17 15:41:27,502 INFO
>> [MASTER_META_SERVER_OPERATIONS-HOST7512:16000-1] master.RegionStates:
>> Transition {1588230740 state=OFFLINE, ts=1455691287500,
>> server=host7516.mydomain,16020,1455618299902} to {1588230740
>> state=PENDING_OPEN, ts=1455691287502,
>> server=host7517.mydomain,16020,1455618296942}
>> 2016-02-17 15:41:27,512 INFO  [AM.ZK.Worker-pool2-t62]
>> master.RegionStates: Transition {1588230740 state=PENDING_OPEN,
>> ts=1455691287502, server=host7517.mydomain,16020,1455618296942} to
>> {1588230740 state=OPENING, ts=1455691287512,
>> server=host7517.mydomain,16020,1455618296942}
>> 2016-02-17 15:41:27,598 INFO  [AM.ZK.Worker-pool2-t63]
>> master.RegionStates: Transition {1588230740 state=OPENING,
>> ts=1455691287512, server=host7517.mydomain,16020,1455618296942} to
>> {1588230740 state=OPEN, ts=1455691287598,
>> server=host7517.mydomain,16020,1455618296942}
>> 2016-02-17 15:41:27,598 INFO  [AM.ZK.Worker-pool2-t63]
>> coordination.ZkOpenRegionCoordination: Handling OPENED of 1588230740
>> from host7512.mydomain,16000,1455618257882; deleting unassigned node
>> 2016-02-17 15:42:57,601 INFO
>> [MASTER_SERVER_OPERATIONS-HOST7512:16000-4]
>> handler.ServerShutdownHandler: Received exception accessing hbase:meta
>> during server shutdown of host7516.mydomain,16020,1455618299902,
>> retrying hbase:meta read
>> org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after
>> attempts=351, exceptions:
>> Wed Feb 17 15:42:57 KST 2016, null, java.net.SocketTimeoutException:
>> callTimeout=60000, callDuration=90304: row '' on table 'hbase:meta' at
>> region=hbase:meta,,1.1588230740,
>> hostname=host7516.mydomain,16020,1455618299902, seqNum=0
>>
>>         at
>> org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.throwEnrichedException(RpcRetryingCallerWithReadReplicas.java:271)
>>         at
>> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:195)
>>         at
>> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:59)
>>         at
>> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
>>         at
>> org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
>>         at
>> org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:295)
>>         at
>> org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:160)
>>         at
>> org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:155)
>>         at
>> org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:821)
>>         at
>> org.apache.hadoop.hbase.MetaTableAccessor.fullScan(MetaTableAccessor.java:602)
>>         at
>> org.apache.hadoop.hbase.MetaTableAccessor.fullScan(MetaTableAccessor.java:156)
>>         at
>> org.apache.hadoop.hbase.MetaTableAccessor.getServerUserRegions(MetaTableAccessor.java:555)
>>         at
>> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:177)
>>         at
>> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
>>         at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>         at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>         at java.lang.Thread.run(Thread.java:745)
>> Caused by: java.net.SocketTimeoutException: callTimeout=60000,
>> callDuration=90304: row '' on table 'hbase:meta' at
>> region=hbase:meta,,1.1588230740,
>> hostname=host7516.mydomain,16020,1455618299902, seqNum=0
>>         at
>> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:159)
>>         at
>> org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:64)
>>         ... 3 more
>> Caused by: java.io.IOException: Call to
>> HOST7516.mydomain/192.168.184.73:16020 failed on local exception:
>> org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=1388,
>> waitTime=90001, operationTimeout=90000 expired.
>>         at
>> org.apache.hadoop.hbase.ipc.RpcClientImpl.wrapException(RpcClientImpl.java:1262)
>>         at
>> org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1230)
>>         at
>> org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:213)
>>         at
>> org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:287)
>>         at
>> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:32651)
>>         at
>> org.apache.hadoop.hbase.client.ScannerCallable.openScanner(ScannerCallable.java:372)
>>         at
>> org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:199)
>>         at
>> org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:62)
>>         at
>> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
>>         at
>> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:346)
>>         at
>> org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:320)
>>         at
>> org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
>>         ... 4 more
>> Caused by: org.apache.hadoop.hbase.ipc.CallTimeoutException: Call
>> id=1388, waitTime=90001, operationTimeout=90000 expired.
>>         at
>> org.apache.hadoop.hbase.ipc.Call.checkAndSetTimeout(Call.java:70)
>>         at
>> org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1204)
>>         ... 14 more
>> ...(Same error happens again and again)
>> 2016-02-17 15:56:55,625 INFO
>> [MASTER_SERVER_OPERATIONS-HOST7512:16000-4]
>> handler.ServerShutdownHandler: Splitting logs for
>> host7516.mydomain,16020,1455618299902 before assignment; region
>> count=73
>> 2016-02-17 15:56:55,627 INFO
>> [MASTER_SERVER_OPERATIONS-HOST7512:16000-4] master.SplitLogManager:
>> dead splitlog workers [host7516.mydomain,16020,1455618299902]
>> 2016-02-17 15:56:55,628 INFO
>> [MASTER_SERVER_OPERATIONS-HOST7512:16000-4] master.SplitLogManager:
>> started splitting 1 logs in
>>
>> [hdfs://hdpts/apps/hbase/data/WALs/host7516.mydomain,16020,1455618299902-splitting]
>> for [host7516.mydomain,16020,1455618299902]
>> 2016-02-17 15:56:55,634 INFO  [main-EventThread]
>> coordination.SplitLogManagerCoordination: task
>>
>> /hbase-unsecure/splitWAL/WALs%2Fhost7516.mydomain%2C16020%2C1455618299902-splitting%2Fhost7516.mydomain%252C16020%252C1455618299902.defa
>> ult.1455690304230 acquired by host7517.mydomain,16020,1455618296942
>> 2016-02-17 15:56:56,266 INFO
>> [host7512.mydomain,16000,1455618257882_splitLogManager__ChoreService_1]
>> master.SplitLogManager: total tasks = 1 unassigned = 0
>> tasks={/hbase-unsecure/splitWAL/WALs%2Fhost7516.mydomain%2C16020%2C14556182
>>
>> 99902-splitting%2Fhost7516.mydomain%252C16020%252C1455618299902.default.1455690304230=last_update
>> = 1455692215679 last_version = 2 cur_worker_name =
>> host7517.mydomain,16020,1455618296942 status = in_progress incarnation
>> = 0 resubmit
>> s = 0 batch = installed = 1 done = 0 error = 0}
>> 2016-02-17 15:57:01,336 INFO  [main-EventThread]
>> coordination.SplitLogManagerCoordination: task
>>
>> /hbase-unsecure/splitWAL/WALs%2Fhost7516.mydomain%2C16020%2C1455618299902-splitting%2Fhost7516.mydomain%252C16020%252C1455618299902.defa
>> ult.1455690304230 entered state: DONE host7517.mydomain,16020,1455618296942
>> 2016-02-17 15:57:01,343 INFO  [main-EventThread] wal.WALSplitter:
>> Archived processed log
>>
>> hdfs://hdpts/apps/hbase/data/WALs/host7516.mydomain,16020,1455618299902-splitting/host7516.mydomain%2C16020%2C1455618299902.default.14556903042
>> 30 to
>> hdfs://hdpts/apps/hbase/data/oldWALs/host7516.mydomain%2C16020%2C1455618299902.default.1455690304230
>> 2016-02-17 15:57:01,343 INFO  [main-EventThread]
>> coordination.SplitLogManagerCoordination: Done splitting
>>
>> /hbase-unsecure/splitWAL/WALs%2Fhost7516.mydomain%2C16020%2C1455618299902-splitting%2Fhost7516.mydomain%252C16020%252C14556182
>> 99902.default.1455690304230
>> 2016-02-17 15:57:01,346 INFO
>> [MASTER_SERVER_OPERATIONS-HOST7512:16000-4] master.SplitLogManager:
>> finished splitting (more than or equal to) 83 bytes in 1 log files in
>> [hdfs://hdpts/apps/hbase/data/WALs/host7516.mydomain,16020,14556
>> 18299902-splitting] in 5718ms
>> ...
>> 2016-02-17 15:57:01,353 INFO
>> [MASTER_SERVER_OPERATIONS-HOST7512:16000-4] master.AssignmentManager:
>> Bulk assigning 73 region(s) across 3 server(s), round-robin=true
>> 2016-02-17 15:57:01,353 INFO
>> [host7512.mydomain,16000,1455618257882-GeneralBulkAssigner-0]
>> master.AssignmentManager: Assigning 24 region(s) to
>> host7515.mydomain,16020,1455691124033
>> 2016-02-17 15:57:01,353 INFO
>> [host7512.mydomain,16000,1455618257882-GeneralBulkAssigner-1]
>> master.AssignmentManager: Assigning 24 region(s) to
>> host7517.mydomain,16020,1455618296942
>> 2016-02-17 15:57:01,353 INFO
>> [host7512.mydomain,16000,1455618257882-GeneralBulkAssigner-2]
>> master.AssignmentManager: Assigning 25 region(s) to
>> host7518.mydomain,16020,1455618298475
>> ...
>> 2016-02-17 15:57:11,005 INFO  [AM.ZK.Worker-pool2-t71]
>> master.RegionStates: Offlined 50cc594cd9e74287ddd5b960bfa2d8e9 from
>> host7516.mydomain,16020,1455618299902
>> 2016-02-17 15:57:11,005 INFO
>> [MASTER_SERVER_OPERATIONS-HOST7512:16000-4] master.AssignmentManager:
>> Bulk assigning done
>> 2016-02-17 15:57:11,005 INFO
>> [MASTER_SERVER_OPERATIONS-HOST7512:16000-4]
>> handler.ServerShutdownHandler: Finished processing of shutdown of
>> host7516.mydomain,16020,1455618299902
>> ----
>>
>> ---- NameNode log ---
>> 2016-02-17 15:41:16,008 INFO  namenode.FSEditLog
>> (FSEditLog.java:printStatistics(699)) - Number of transactions: 45
>> Total time for transactions(ms): 4 Number of transactions batched in
>> Syncs: 0 Number of syncs: 42 SyncTimes(ms): 35 11
>> 2016-02-17 15:41:16,383 INFO  namenode.FSNamesystem
>> (FSNamesystem.java:recoverLeaseInternal(2907)) - recoverLease: [Lease.
>> Holder: DFSClient_NONMAPREDUCE_1038296970_1, pendingcreates: 2],
>>
>> src=/apps/hbase/data/WALs/host7516.mydomain,16020,1455618299902-splitting/host7516.mydomain%2C16020%2C1455618299902..meta.1455690873222.meta
>> from client DFSClient_NONMAPREDUCE_1038296970_1
>> 2016-02-17 15:41:16,383 INFO  namenode.FSNamesystem
>> (FSNamesystem.java:internalReleaseLease(4036)) - Recovering [Lease.
>> Holder: DFSClient_NONMAPREDUCE_1038296970_1, pendingcreates: 2],
>>
>> src=/apps/hbase/data/WALs/host7516.mydomain,16020,1455618299902-splitting/host7516.mydomain%2C16020%2C1455618299902..meta.1455690873222.meta
>> 2016-02-17 15:41:16,383 WARN  hdfs.StateChange
>> (FSNamesystem.java:internalReleaseLease(4157)) - DIR*
>> NameSystem.internalReleaseLease: File
>>
>> /apps/hbase/data/WALs/host7516.mydomain,16020,1455618299902-splitting/host7516.mydomain%2C16020%2C1455618299902..meta.1455690873222.meta
>> has not been closed. Lease recovery is in progress. RecoveryId = 13180
>> for block blk_1073753851_13131{UCState=UNDER_RECOVERY,
>> truncateBlock=null, primaryNodeIndex=2,
>>
>> replicas=[ReplicaUC[[DISK]DS-9787b201-fc64-450e-a20f-dcc79fb94b6f:NORMAL:192.168.184.73:50010
>> |RBW],
>>
>> ReplicaUC[[DISK]DS-52f00167-e863-48d8-8f98-3b613f774b0c:NORMAL:192.168.185.249:50010
>> |RBW],
>>
>> ReplicaUC[[DISK]DS-f8c4c3b2-a7cc-40e0-9ebb-2b9c352c0d28:NORMAL:192.168.184.85:50010
>> |RBW]]}
>> 2016-02-17 15:41:18,588 INFO  BlockStateChange
>> (UnderReplicatedBlocks.java:chooseUnderReplicatedBlocks(394)) -
>> chooseUnderReplicatedBlocks selected 8 blocks at priority level 2;
>> Total=8 Reset bookmarks? false
>> 2016-02-17 15:41:18,588 INFO  BlockStateChange
>> (BlockManager.java:computeReplicationWorkForBlocks(1527)) - BLOCK*
>> neededReplications = 75, pendingReplications = 0.
>> 2016-02-17 15:41:18,588 INFO  blockmanagement.BlockManager
>> (BlockManager.java:computeReplicationWorkForBlocks(1534)) - Blocks
>> chosen but could not be replicated = 8; of which 8 have no target, 0
>> have no source, 0 are UC, 0 are abandoned, 0 already have enough
>> replicas.
>> 2016-02-17 15:41:18,960 WARN  security.UserGroupInformation
>> (UserGroupInformation.java:getGroupNames(1521)) - No groups available
>> for user anonymous
>> 2016-02-17 15:41:18,961 WARN  security.UserGroupInformation
>> (UserGroupInformation.java:getGroupNames(1521)) - No groups available
>> for user anonymous
>> 2016-02-17 15:41:18,962 WARN  security.UserGroupInformation
>> (UserGroupInformation.java:getGroupNames(1521)) - No groups available
>> for user anonymous
>> 2016-02-17 15:41:18,963 WARN  security.UserGroupInformation
>> (UserGroupInformation.java:getGroupNames(1521)) - No groups available
>> for user anonymous
>> 2016-02-17 15:41:18,964 WARN  security.UserGroupInformation
>> (UserGroupInformation.java:getGroupNames(1521)) - No groups available
>> for user anonymous
>> 2016-02-17 15:41:18,966 WARN  security.UserGroupInformation
>> (UserGroupInformation.java:getGroupNames(1521)) - No groups available
>> for user anonymous
>> 2016-02-17 15:41:18,967 WARN  security.UserGroupInformation
>> (UserGroupInformation.java:getGroupNames(1521)) - No groups available
>> for user anonymous
>> 2016-02-17 15:41:18,967 WARN  security.UserGroupInformation
>> (UserGroupInformation.java:getGroupNames(1521)) - No groups available
>> for user anonymous
>> 2016-02-17 15:41:18,969 WARN  security.UserGroupInformation
>> (UserGroupInformation.java:getGroupNames(1521)) - No groups available
>> for user anonymous
>> 2016-02-17 15:41:19,096 WARN  security.UserGroupInformation
>> (UserGroupInformation.java:getGroupNames(1521)) - No groups available
>> for user anonymous
>> 2016-02-17 15:41:19,098 WARN  security.UserGroupInformation
>> (UserGroupInformation.java:getGroupNames(1521)) - No groups available
>> for user anonymous
>> 2016-02-17 15:41:19,099 WARN  security.UserGroupInformation
>> (UserGroupInformation.java:getGroupNames(1521)) - No groups available
>> for user anonymous
>> 2016-02-17 15:41:19,307 INFO  blockmanagement.DatanodeManager
>> (DatanodeManager.java:handleHeartbeat(1426)) - Skipped stale nodes for
>> recovery : 1
>> 2016-02-17 15:41:19,694 INFO  namenode.FSNamesystem
>> (FSNamesystem.java:commitBlockSynchronization(4269)) -
>>
>> commitBlockSynchronization(oldBlock=BP-1843495860-192.168.189.219-1453778090403:blk_1073753851_13131,
>> newgenerationstamp=13180, newl
>> ength=60238, newtargets=[192.168.185.249:50010, 192.168.184.85:50010],
>> closeFile=true, deleteBlock=false)
>> 2016-02-17 15:41:19,696 INFO  namenode.FSNamesystem
>> (FSNamesystem.java:commitBlockSynchronization(4434)) -
>>
>> commitBlockSynchronization(oldBlock=BP-1843495860-192.168.189.219-1453778090403:blk_1073753851_13131,
>> file=/apps/hbase/data/WALs/host7516.mydomain,16020,1455618299902-splitting/host7516.mydomain%2C16020%2C1455618299902..meta.1455690873222.meta,
>> newgenerationstamp=13180, newlength=60238,
>> newtargets=[192.168.185.249:50010, 192.168.184.85:50010]) successful
>> 2016-02-17 15:41:20,459 INFO  hdfs.StateChange
>> (FSNamesystem.java:saveAllocatedBlock(3583)) - BLOCK* allocate
>> blk_1073753897_13181{UCState=UNDER_CONSTRUCTION, truncateBlock=null,
>> primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-f8c4c3b2-
>> a7cc-40e0-9ebb-2b9c352c0d28:NORMAL:192.168.184.85:50010|RBW],
>>
>> ReplicaUC[[DISK]DS-882dca14-8651-4ffe-a102-8beebc774575:NORMAL:192.168.189.208:50010
>> |RBW],
>>
>> ReplicaUC[[DISK]DS-52f00167-e863-48d8-8f98-3b613f774b0c:NORMAL:192.168.185.249:50010
>> |RBW
>> ]]} for
>> /apps/hbase/data/data/hbase/meta/1588230740/recovered.edits/0000000000000009659.temp
>> 2016-02-17 15:41:20,474 INFO  hdfs.StateChange
>> (FSNamesystem.java:completeFile(3503)) - DIR* completeFile:
>>
>> /apps/hbase/data/data/hbase/meta/1588230740/recovered.edits/0000000000000009659.temp
>> is closed by DFSClient_NONMAPREDUCE_443617401_
>> 1
>> 2016-02-17 15:41:20,488 INFO  ipc.Server (Server.java:run(2172)) - IPC
>> Server handler 8 on 8020, call
>> org.apache.hadoop.hdfs.protocol.ClientProtocol.delete from
>> 192.168.189.219:51393 Call#8040 Retry#0
>> org.apache.hadoop.fs.PathIsNotEmptyDirectoryException:
>> `/apps/hbase/data/WALs/host7516.mydomain,16020,1455618299902-splitting
>> is non empty': Directory is not empty
>>         at
>> org.apache.hadoop.hdfs.server.namenode.FSDirDeleteOp.delete(FSDirDeleteOp.java:89)
>>         at
>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:3723)
>>         at
>> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.delete(NameNodeRpcServer.java:1017)
>>         at
>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.delete(ClientNamenodeProtocolServerSideTranslatorPB.java:611)
>>         at
>> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>>         at
>> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2151)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2147)
>>         at java.security.AccessController.doPrivileged(Native Method)
>>         at javax.security.auth.Subject.doAs(Subject.java:422)
>>         at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2145)
>> 2016-02-17 15:41:21,588 INFO  BlockStateChange
>> (UnderReplicatedBlocks.java:chooseUnderReplicatedBlocks(394)) -
>> chooseUnderReplicatedBlocks selected 8 blocks at priority level 2;
>> Total=8 Reset bookmarks? false
>> 2016-02-17 15:41:21,588 INFO  BlockStateChange
>> (BlockManager.java:computeReplicationWorkForBlocks(1527)) - BLOCK*
>> neededReplications = 76, pendingReplications = 0.
>> ...
>> 2016-02-17 15:56:15,384 INFO  namenode.FSNamesystem
>> (FSNamesystem.java:rollEditLog(5937)) - Roll Edit Log from
>> 192.168.184.72
>> 2016-02-17 15:56:15,384 INFO  namenode.FSEditLog
>> (FSEditLog.java:rollEditLog(1202)) - Rolling edit logs
>> 2016-02-17 15:56:15,384 INFO  namenode.FSEditLog
>> (FSEditLog.java:endCurrentLogSegment(1258)) - Ending log segment
>> 369158
>> 2016-02-17 15:56:15,395 INFO  namenode.FSEditLog
>> (FSEditLog.java:printStatistics(699)) - Number of transactions: 14
>> Total time for transactions(ms): 5 Number of transactions batched in
>> Syncs: 0 Number of syncs: 14 SyncTimes(ms): 26 6
>> 2016-02-17 15:56:15,421 INFO  namenode.FileJournalManager
>> (FileJournalManager.java:finalizeLogSegment(133)) - Finalizing edits
>> file /hadoop/hdfs/namenode/current/edits_inprogress_0000000000000369158
>> -> /hadoop/hdfs/namenode/current/edits_
>> 0000000000000369158-0000000000000369171
>> 2016-02-17 15:56:15,421 INFO  namenode.FSEditLog
>> (FSEditLog.java:startLogSegment(1218)) - Starting log segment at
>> 369172
>> 2016-02-17 15:56:15,764 INFO  BlockStateChange
>> (UnderReplicatedBlocks.java:chooseUnderReplicatedBlocks(394)) -
>> chooseUnderReplicatedBlocks selected 3 blocks at priority level 1; 3
>> blocks at priority level 2;  Total=6 Reset bookmarks? fals
>> e
>> 2016-02-17 15:56:15,764 INFO  BlockStateChange
>> (BlockManager.java:computeReplicationWorkForBlocks(1527)) - BLOCK*
>> neededReplications = 91, pendingReplications = 4.
>> 2016-02-17 15:56:15,764 INFO  blockmanagement.BlockManager
>> (BlockManager.java:computeReplicationWorkForBlocks(1534)) - Blocks
>> chosen but could not be replicated = 3; of which 3 have no target, 0
>> have no source, 0 are UC, 0 are abandoned,
>> 0 already have enough replicas.
>> 2016-02-17 15:56:18,764 INFO  BlockStateChange
>> (UnderReplicatedBlocks.java:chooseUnderReplicatedBlocks(394)) -
>> chooseUnderReplicatedBlocks selected 6 blocks at priority level 2;
>> Total=6 Reset bookmarks? false
>> 2016-02-17 15:56:18,765 INFO  BlockStateChange
>> (BlockManager.java:computeReplicationWorkForBlocks(1527)) - BLOCK*
>> neededReplications = 87, pendingReplications = 5.
>> 2016-02-17 15:56:18,765 INFO  blockmanagement.BlockManager
>> (BlockManager.java:computeReplicationWorkForBlocks(1534)) - Blocks
>> chosen but could not be replicated = 2; of which 0 have no target, 2
>> have no source, 0 are UC, 0 are abandoned,
>> 0 already have enough replicas.
>> ...
>> 2016-02-17 15:56:55,688 INFO  namenode.FSNamesystem
>> (FSNamesystem.java:recoverLeaseInternal(2907)) - recoverLease: [Lease.
>> Holder: DFSClient_NONMAPREDUCE_1038296970_1, pendingcreates: 1],
>> src=/apps/hbase/data/WALs/host7516.mydomain,16
>>
>> 020,1455618299902-splitting/host7516.mydomain%2C16020%2C1455618299902.default.1455690304230
>> from client DFSClient_NONMAPREDUCE_1038296970_1
>> 2016-02-17 15:56:55,688 INFO  namenode.FSNamesystem
>> (FSNamesystem.java:internalReleaseLease(4036)) - Recovering [Lease.
>> Holder: DFSClient_NONMAPREDUCE_1038296970_1, pendingcreates: 1],
>> src=/apps/hbase/data/WALs/host7516.mydomain,16020
>>
>> ,1455618299902-splitting/host7516.mydomain%2C16020%2C1455618299902.default.1455690304230
>> 2016-02-17 15:56:55,688 WARN  hdfs.StateChange
>> (FSNamesystem.java:internalReleaseLease(4157)) - DIR*
>> NameSystem.internalReleaseLease: File
>>
>> /apps/hbase/data/WALs/host7516.mydomain,16020,1455618299902-splitting/host7516.mydomain%2C160
>> 20%2C1455618299902.default.1455690304230 has not been closed. Lease
>> recovery is in progress. RecoveryId = 13183 for block
>> blk_1073753849_13127{UCState=UNDER_RECOVERY, truncateBlock=null,
>> primaryNodeIndex=2, replicas=[ReplicaUC[[DISK]DS-97
>> 87b201-fc64-450e-a20f-dcc79fb94b6f:NORMAL:192.168.184.73:50010|RBW],
>>
>> ReplicaUC[[DISK]DS-f8c4c3b2-a7cc-40e0-9ebb-2b9c352c0d28:NORMAL:192.168.184.85:50010
>> |RBW],
>> ReplicaUC[[DISK]DS-52f00167-e863-48d8-8f98-3b613f774b0c:NORMAL:
>> 192.168.185.249:500
>> 10|RBW]]}
>> 2016-02-17 15:56:58,687 INFO  namenode.FSNamesystem
>> (FSNamesystem.java:commitBlockSynchronization(4434)) -
>>
>> commitBlockSynchronization(oldBlock=BP-1843495860-192.168.189.219-1453778090403:blk_1073753849_13127,
>>
>> file=/apps/hbase/data/WALs/host7516.mydomain,16020,1455618299902-splitting/host7516.mydomain%2C16020%2C1455618299902.default.1455690304230,
>> newgenerationstamp=13183, newlength=118884436,
>> newtargets=[192.168.184.85:50010, 192.168.185.249:50010]) successful
>>
>> ----
>>
>> Thanks,
>> Hironori Ogibayashi
>>

