hbase-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From 俞忠静 <yuzhongj...@bianfeng.com>
Subject HRegionServer Aborted when map-reduce job is running
Date Mon, 16 Nov 2015 08:53:42 GMT
Hi Dear All

HRegionServer Aborted when map-reduce job is running. It seems there is heartbeat between
HRegionServer and Hadoop datanode, and datanode failed to response to HRegionServer because
it is busy in MR job. logs are as follow:

2015-11-15 03:15:59,025 WARN  [B.defaultRpcServer.handler=21,queue=0,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":10543,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.7:47704","starttimems":1447528548482,"queuetimems":1,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:16:00,081 WARN  [B.defaultRpcServer.handler=27,queue=0,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":11177,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.5:51754","starttimems":1447528548904,"queuetimems":0,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:16:03,375 INFO  [sync.3] wal.FSHLog: Slow sync cost: 209 ms, current pipeline:
[10.65.65.5:50010, 10.65.65.4:50010]
2015-11-15 03:16:08,842 INFO  [sync.2] wal.FSHLog: Slow sync cost: 1337 ms, current pipeline:
[10.65.65.5:50010, 10.65.65.4:50010]
2015-11-15 03:16:08,853 WARN  [B.defaultRpcServer.handler=16,queue=1,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":14775,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.5:51846","starttimems":1447528554078,"queuetimems":0,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:16:10,590 WARN  [B.defaultRpcServer.handler=14,queue=2,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":10906,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.4:37598","starttimems":1447528559684,"queuetimems":1064,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:16:12,734 WARN  [B.defaultRpcServer.handler=5,queue=2,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":11674,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.5:52792","starttimems":1447528561059,"queuetimems":1521,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:16:12,886 WARN  [B.defaultRpcServer.handler=26,queue=2,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":10379,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.5:51754","starttimems":1447528562507,"queuetimems":2112,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:16:15,248 WARN  [B.defaultRpcServer.handler=9,queue=0,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":26353,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.5:51733","starttimems":1447528548895,"queuetimems":0,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:16:16,604 WARN  [B.defaultRpcServer.handler=8,queue=2,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":29483,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.6:45920","starttimems":1447528547121,"queuetimems":0,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:16:18,859 INFO  [sync.4] wal.FSHLog: Slow sync cost: 210 ms, current pipeline:
[10.65.65.5:50010, 10.65.65.4:50010]
2015-11-15 03:16:20,283 WARN  [B.defaultRpcServer.handler=11,queue=2,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":24416,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.4:37562","starttimems":1447528555866,"queuetimems":0,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:16:21,107 WARN  [B.defaultRpcServer.handler=12,queue=0,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":18328,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.5:51754","starttimems":1447528562779,"queuetimems":0,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:16:21,263 WARN  [B.defaultRpcServer.handler=21,queue=0,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":18264,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.4:37585","starttimems":1447528562999,"queuetimems":0,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:16:23,295 WARN  [B.defaultRpcServer.handler=18,queue=0,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":14154,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.4:37585","starttimems":1447528569141,"queuetimems":0,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:16:25,670 WARN  [B.defaultRpcServer.handler=29,queue=2,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":37302,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.6:45892","starttimems":1447528548368,"queuetimems":0,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:16:26,079 INFO  [sync.2] wal.FSHLog: Slow sync cost: 208 ms, current pipeline:
[10.65.65.5:50010, 10.65.65.4:50010]
2015-11-15 03:16:26,567 WARN  [B.defaultRpcServer.handler=0,queue=0,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":38756,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.5:51937","starttimems":1447528547811,"queuetimems":0,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:16:26,942 INFO  [sync.4] wal.FSHLog: Slow sync cost: 206 ms, current pipeline:
[10.65.65.5:50010, 10.65.65.4:50010]
2015-11-15 03:16:27,537 WARN  [B.defaultRpcServer.handler=5,queue=2,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":13778,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.5:51754","starttimems":1447528573759,"queuetimems":765,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:16:29,119 WARN  [B.defaultRpcServer.handler=27,queue=0,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":12257,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.4:37603","starttimems":1447528576861,"queuetimems":1213,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:16:29,503 WARN  [B.defaultRpcServer.handler=2,queue=2,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":22488,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.6:45898","starttimems":1447528567014,"queuetimems":1934,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:16:30,959 WARN  [B.defaultRpcServer.handler=26,queue=2,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":11581,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.6:45920","starttimems":1447528579378,"queuetimems":0,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:16:31,175 WARN  [B.defaultRpcServer.handler=15,queue=0,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":15606,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.6:45920","starttimems":1447528575569,"queuetimems":803,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:16:36,607 INFO  [sync.4] wal.FSHLog: Slow sync cost: 263 ms, current pipeline:
[10.65.65.5:50010, 10.65.65.4:50010]
2015-11-15 03:16:36,607 INFO  [sync.3] wal.FSHLog: Slow sync cost: 269 ms, current pipeline:
[10.65.65.5:50010, 10.65.65.4:50010]
2015-11-15 03:16:36,822 INFO  [sync.0] wal.FSHLog: Slow sync cost: 210 ms, current pipeline:
[10.65.65.5:50010, 10.65.65.4:50010]
2015-11-15 03:16:38,570 WARN  [B.defaultRpcServer.handler=16,queue=1,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":10945,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.5:51754","starttimems":1447528587624,"queuetimems":0,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:16:43,072 INFO  [sync.2] wal.FSHLog: Slow sync cost: 216 ms, current pipeline:
[10.65.65.5:50010, 10.65.65.4:50010]
2015-11-15 03:16:46,719 WARN  [B.defaultRpcServer.handler=20,queue=2,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":15520,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.6:45920","starttimems":1447528591199,"queuetimems":0,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:16:48,303 WARN  [B.defaultRpcServer.handler=1,queue=1,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":10417,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.4:37585","starttimems":1447528597886,"queuetimems":0,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:16:48,691 INFO  [sync.3] wal.FSHLog: Slow sync cost: 212 ms, current pipeline:
[10.65.65.5:50010, 10.65.65.4:50010]
2015-11-15 03:16:48,905 INFO  [sync.0] wal.FSHLog: Slow sync cost: 210 ms, current pipeline:
[10.65.65.5:50010, 10.65.65.4:50010]
2015-11-15 03:16:48,905 INFO  [sync.1] wal.FSHLog: Slow sync cost: 209 ms, current pipeline:
[10.65.65.5:50010, 10.65.65.4:50010]
2015-11-15 03:16:52,628 WARN  [B.defaultRpcServer.handler=3,queue=0,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":33717,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.6:45920","starttimems":1447528578911,"queuetimems":1904,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:16:52,847 WARN  [B.defaultRpcServer.handler=7,queue=1,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":20241,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.4:37585","starttimems":1447528592605,"queuetimems":329,"class":"HRegionServer","responsesize":2015,"method":"Scan"}


…

":14798,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.4:37575","starttimems":1447528671427,"queuetimems":0,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:18:06,940 INFO  [sync.4] wal.FSHLog: Slow sync cost: 208 ms, current pipeline:
[10.65.65.5:50010, 10.65.65.4:50010]
2015-11-15 03:18:07,002 INFO  [regionserver/data05/10.65.65.5:16020-shortCompactions-1447378680473]
regionserver.HRegion: Starting compaction on Fam in region YinziOut,1446048000_2_10_0_72148315_1,1446913516971.dcc835bb61506c91c62f6033ea9ae080.
2015-11-15 03:18:07,002 INFO  [regionserver/data05/10.65.65.5:16020-shortCompactions-1447378680473]
regionserver.HStore: Starting compaction of 4 file(s) in Fam of YinziOut,1446048000_2_10_0_72148315_1,1446913516971.dcc835bb61506c91c62f6033ea9ae080.
into tmpdir=hdfs://bfcluster/hbase/data/default/YinziOut/dcc835bb61506c91c62f6033ea9ae080/.tmp,
totalSize=1.3 G
2015-11-15 03:18:08,209 WARN  [B.defaultRpcServer.handler=2,queue=2,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":13683,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.6:45898","starttimems":1447528674525,"queuetimems":0,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:18:08,463 WARN  [B.defaultRpcServer.handler=22,queue=1,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":13007,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.5:51763","starttimems":1447528675456,"queuetimems":0,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:18:09,344 WARN  [B.defaultRpcServer.handler=19,queue=1,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":11697,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.6:45898","starttimems":1447528677647,"queuetimems":0,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:18:09,406 INFO  [regionserver/data05/10.65.65.5:16020-shortCompactions-1447378680473]
hfile.CacheConfig: blockCache=LruBlockCache{blockCount=877, currentSize=97321864, freeSize=3303760760,
maxSize=3401082624, heapSize=97321864, minSize=3231028480, minFactor=0.95, multiSize=1615514240,
multiFactor=0.5, singleSize=807757120, singleFactor=0.25}, cacheDataOnRead=true, cacheDataOnWrite=false,
cacheIndexesOnWrite=false, cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false,
prefetchOnOpen=false
2015-11-15 03:18:09,672 INFO  [sync.3] wal.FSHLog: Slow sync cost: 216 ms, current pipeline:
[10.65.65.5:50010, 10.65.65.4:50010]
2015-11-15 03:18:12,661 WARN  [B.defaultRpcServer.handler=13,queue=1,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":12451,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.5:51733","starttimems":1447528680210,"queuetimems":350,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:18:14,979 WARN  [B.defaultRpcServer.handler=16,queue=1,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":15825,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.7:53250","starttimems":1447528679154,"queuetimems":0,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:18:15,231 WARN  [B.defaultRpcServer.handler=25,queue=1,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":13643,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.4:37585","starttimems":1447528681587,"queuetimems":217,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:18:17,078 WARN  [B.defaultRpcServer.handler=4,queue=1,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":15736,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.5:51763","starttimems":1447528681342,"queuetimems":0,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:18:17,445 WARN  [B.defaultRpcServer.handler=12,queue=0,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":11143,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.7:47703","starttimems":1447528686302,"queuetimems":0,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:18:17,769 WARN  [B.defaultRpcServer.handler=28,queue=1,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":11157,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.5:51827","starttimems":1447528686611,"queuetimems":360,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:18:18,638 WARN  [B.defaultRpcServer.handler=29,queue=2,port=16020] ipc.RpcServer:
(responseTooSlow): {"processingtimems":19233,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.6:45920","starttimems":1447528679405,"queuetimems":0,"class":"HRegionServer","responsesize":2015,"method":"Scan"}



s":15210,"call":"Scan(org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ScanRequest)","client":"10.65.65.6:45898","starttimems":1447528810377,"queuetimems":83,"class":"HRegionServer","responsesize":2015,"method":"Scan"}
2015-11-15 03:20:25,993 WARN  [ResponseProcessor for block BP-622459206-10.65.65.8-1430728217493:blk_1074387381_646803]
hdfs.DFSClient: DFSOutputStream ResponseProcessor exception  for block BP-622459206-10.65.65.8-1430728217493:blk_1074387381_646803
java.net.SocketTimeoutException: 65000 millis timeout while waiting for channel to be ready
for read. ch : java.nio.channels.SocketChannel[connected local=/10.65.65.5:47676 remote=/10.65.65.5:50010]
        at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
        at java.io.FilterInputStream.read(FilterInputStream.java:83)
        at java.io.FilterInputStream.read(FilterInputStream.java:83)
        at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1998)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:176)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:798)
2015-11-15 03:20:25,995 WARN  [sync.3] hdfs.DFSClient: Error while syncing
java.io.IOException: All datanodes 10.65.65.5:50010 are bad. Aborting...
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1137)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:933)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:487)
2015-11-15 03:20:25,995 WARN  [sync.4] hdfs.DFSClient: Error while syncing
java.io.IOException: All datanodes 10.65.65.5:50010 are bad. Aborting...
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1137)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:933)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:487)
2015-11-15 03:20:25,995 WARN  [sync.0] hdfs.DFSClient: Error while syncing
java.io.IOException: All datanodes 10.65.65.5:50010 are bad. Aborting...
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1137)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:933)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:487)
2015-11-15 03:20:25,995 ERROR [sync.4] wal.FSHLog: Error syncing, request close of wal
java.io.IOException: All datanodes 10.65.65.5:50010 are bad. Aborting...
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1137)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:933)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:487)
2015-11-15 03:20:25,995 WARN  [sync.2] hdfs.DFSClient: Error while syncing
java.io.IOException: All datanodes 10.65.65.5:50010 are bad. Aborting...
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1137)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:933)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:487)
2015-11-15 03:20:25,996 ERROR [sync.2] wal.FSHLog: Error syncing, request close of wal

ta05,16020,1447378673058: Failed log close in log roller
org.apache.hadoop.hbase.regionserver.wal.FailedLogCloseException: hdfs://bfcluster/hbase/WALs/data05,16020,1447378673058/data05%2C16020%2C1447378673058.default.1447528256432,
unflushedEntries=7
        at org.apache.hadoop.hbase.regionserver.wal.FSHLog.replaceWriter(FSHLog.java:988)
        at org.apache.hadoop.hbase.regionserver.wal.FSHLog.rollWriter(FSHLog.java:721)
        at org.apache.hadoop.hbase.regionserver.LogRoller.run(LogRoller.java:137)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hbase.regionserver.wal.FailedSyncBeforeLogCloseException: java.io.IOException:
All datanodes 10.65.65.5:50010 are bad. Aborting...
        at org.apache.hadoop.hbase.regionserver.wal.FSHLog$SafePointZigZagLatch.waitSafePoint(FSHLog.java:1853)
        at org.apache.hadoop.hbase.regionserver.wal.FSHLog.replaceWriter(FSHLog.java:938)
        ... 3 more
Caused by: java.io.IOException: All datanodes 10.65.65.5:50010 are bad. Aborting...
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1137)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:933)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:487)
2015-11-15 03:20:26,561 FATAL [regionserver/data05/10.65.65.5:16020.logRoller] regionserver.HRegionServer:
RegionServer abort: loaded coprocessors are: [org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint]
2015-11-15 03:20:26,685 INFO  [regionserver/data05/10.65.65.5:16020.logRoller] regionserver.HRegionServer:
Dump of metrics as JSON on abort: {
  "beans" : [ {
    "name" : "java.lang:type=Memory",
    "modelerType" : "sun.management.MemoryImpl",
    "Verbose" : false,
    "NonHeapMemoryUsage" : {
      "committed" : 144703488,
      "init" : 136773632,
      "max" : 184549376,
      "used" : 53630984
    },
    "ObjectPendingFinalizationCount" : 0,
    "HeapMemoryUsage" : {
      "committed" : 7054729216,
      "init" : 524243392,
      "max" : 8502706176,
      "used" : 4019416736
    },
    "ObjectName" : "java.lang:type=Memory"
  } ],
  "beans" : [ {
    "name" : "Hadoop:service=HBase,name=RegionServer,sub=IPC",
    "modelerType" : "RegionServer,sub=IPC",
    "tag.Context" : "regionserver",
    "tag.Hostname" : "data05",
    "queueSize" : 477,
    "numCallsInGeneralQueue" : 0,
    "numCallsInReplicationQueue" : 0,
    "numCallsInPriorityQueue" : 0,
    "numOpenConnections" : 33,
    "numActiveHandler" : 18,
    "TotalCallTime_num_ops" : 24427414,
"TotalCallTime_min" : 0,





    "Increment_num_ops" : 0,
    "Increment_min" : 0,
    "Increment_max" : 0,
    "Increment_mean" : 0.0,
    "Increment_median" : 0.0,
    "Increment_75th_percentile" : 0.0,
    "Increment_95th_percentile" : 0.0,
    "Increment_99th_percentile" : 0.0
  } ]
}
2015-11-15 03:20:26,720 INFO  [regionserver/data05/10.65.65.5:16020.logRoller] regionserver.HRegionServer:
STOPPED: Failed log close in log roller
2015-11-15 03:20:26,721 INFO  [regionserver/data05/10.65.65.5:16020.logRoller] regionserver.LogRoller:
LogRoller exiting.
2015-11-15 03:20:26,721 INFO  [regionserver/data05/10.65.65.5:16020] regionserver.SplitLogWorker:
Sending interrupt to stop the worker thread
2015-11-15 03:20:26,722 INFO  [regionserver/data05/10.65.65.5:16020] regionserver.HRegionServer:
Stopping infoServer
2015-11-15 03:20:26,722 INFO  [SplitLogWorker-data05:16020] regionserver.SplitLogWorker: SplitLogWorker
interrupted. Exiting.
2015-11-15 03:20:26,722 INFO  [SplitLogWorker-data05:16020] regionserver.SplitLogWorker: SplitLogWorker
data05,16020,1447378673058 exiting
2015-11-15 03:20:26,741 INFO  [regionserver/data05/10.65.65.5:16020] mortbay.log: Stopped
SelectChannelConnector@0.0.0.0:16030
2015-11-15 03:20:26,844 INFO  [regionserver/data05/10.65.65.5:16020] regionserver.HeapMemoryManager:
Stoping HeapMemoryTuner chore.
2015-11-15 03:20:26,845 INFO  [regionserver/data05/10.65.65.5:16020] snapshot.RegionServerSnapshotManager:
Stopping RegionServerSnapshotManager abruptly.
2015-11-15 03:20:26,845 INFO  [MemStoreFlusher.1] regionserver.MemStoreFlusher: MemStoreFlusher.1
exiting
2015-11-15 03:20:26,845 INFO  [MemStoreFlusher.0] regionserver.MemStoreFlusher: MemStoreFlusher.0
exiting
2015-11-15 03:20:26,850 INFO  [regionserver/data05/10.65.65.5:16020] flush.RegionServerFlushTableProcedureManager:
Stopping region server flush procedure manager abruptly.
2015-11-15 03:20:26,854 INFO  [StoreCloserThread-XyBusy,1445240282_1126166343,1445742317442.db8afd9ca723a8143b0430f1f46cfec6.-1]
regionserver.HStore: Closed Fam
2015-11-15 03:20:26,854 INFO  [regionserver/data05/10.65.65.5:16020] regionserver.HRegionServer:
aborting server data05,16020,1447378673058
2015-11-15 03:20:26,854 INFO  [RS_CLOSE_REGION-data05:16020-1] regionserver.HRegion: Closed
XyBusy,1445240282_1126166343,1445742317442.db8afd9ca723a8143b0430f1f46cfec6.
2015-11-15 03:20:26,854 INFO  [regionserver/data05/10.65.65.5:16020] client.ConnectionManager$HConnectionImplementation:
Closing zookeeper sessionid=0x350f723dbba027c
2015-11-15 03:20:26,855 INFO  [StoreCloserThread-BfCoin,,1446825952526.e1f4010e4b7c4c4154907a081c4be53f.-1]
regionserver.HStore: Closed Fam
2015-11-15 03:20:26,855 ERROR [RS_CLOSE_REGION-data05:16020-0] regionserver.HRegion: Memstore
size is 153464
2015-11-15 03:20:26,855 INFO  [RS_CLOSE_REGION-data05:16020-0] regionserver.HRegion: Closed
BfCoin,,1446825952526.e1f4010e4b7c4c4154907a081c4be53f.
2015-11-15 03:20:26,856 INFO  [StoreCloserThread-PropOut,,1446622866813.db6256960af9d91dee221fab98db872b.-1]
regionserver.HStore: Closed Fam
2015-11-15 03:20:26,856 ERROR [RS_CLOSE_REGION-data05:16020-0] regionserver.HRegion: Memstore
size is 233480
2015-11-15 03:20:26,856 INFO  [RS_CLOSE_REGION-data05:16020-0] regionserver.HRegion: Closed
PropOut,,1446622866813.db6256960af9d91de




2015-11-15 03:20:28,525 INFO  [RS_CLOSE_REGION-data05:16020-2] regionserver.HRegion: Closed
hbase:namespace,,1435633859565.1b08af8d431864e5b7c686f8c7fae3a4.
2015-11-15 03:20:28,606 INFO  [regionserver/data05/10.65.65.5:16020-shortCompactions-1447378680473]
regionserver.HRegion: compaction interrupted
java.io.InterruptedIOException: Aborting compaction of store Fam in region YinziOut,1446048000_2_10_0_72148315_1,1446913516971.dcc835bb61506c91c62f6033ea9ae080.
because it was interrupted.
        at org.apache.hadoop.hbase.regionserver.compactions.DefaultCompactor.compact(DefaultCompactor.java:112)
        at org.apache.hadoop.hbase.regionserver.DefaultStoreEngine$DefaultCompactionContext.compact(DefaultStoreEngine.java:112)
        at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:1212)
        at org.apache.hadoop.hbase.regionserver.HRegion.compact(HRegion.java:1798)
        at org.apache.hadoop.hbase.regionserver.CompactSplitThread$CompactionRunner.run(CompactSplitThread.java:519)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2015-11-15 03:20:28,607 INFO  [regionserver/data05/10.65.65.5:16020-shortCompactions-1447378680473]
regionserver.CompactSplitThread: Aborted compaction: Request = regionName=YinziOut,1446048000_2_10_0_72148315_1,1446913516971.dcc835bb61506c91c62f6033ea9ae080.,
storeName=Fam, fileCount=4, fileSize=1.3 G (1004.0 M, 270.2 M, 48.8 M, 20.8 M), priority=6,
time=17340344466571946; duration=2mins, 21sec
2015-11-15 03:20:28,607 INFO  [regionserver/data05/10.65.65.5:16020] regionserver.CompactSplitThread:
Waiting for Small Compaction Thread to finish...
2015-11-15 03:20:28,607 INFO  [regionserver/data05/10.65.65.5:16020] regionserver.HRegionServer:
Waiting on 7 regions to close
2015-11-15 03:20:28,608 INFO  [StoreCloserThread-YinziOut,1446048000_2_10_0_72148315_1,1446913516971.dcc835bb61506c91c62f6033ea9ae080.-1]
regionserver.HStore: Closed Fam
2015-11-15 03:20:28,608 INFO  [StoreCloserThread-hbase:meta,,1.1588230740-1] regionserver.HStore:
Closed info
2015-11-15 03:20:28,608 INFO  [RS_CLOSE_REGION-data05:16020-2] regionserver.HRegion: Closed
YinziOut,1446048000_2_10_0_72148315_1,1446913516971.dcc835bb61506c91c62f6033ea9ae080.
2015-11-15 03:20:28,609 INFO  [RS_CLOSE_META-data05:16020-0] regionserver.HRegion: Closed
hbase:meta,,1.1588230740
2015-11-15 03:20:28,609 INFO  [StoreCloserThread-XyBusy,,1445521466672.8540f2da4a89c8212d3504a5bd370092.-1]
regionserver.HStore: Closed Fam
2015-11-15 03:20:28,609 INFO  [RS_CLOSE_REGION-data05:16020-2] regionserver.HRegion: Closed
XyBusy,,1445521466672.8540f2da4a89c8212d3504a5bd370092.
2015-11-15 03:20:28,611 INFO  [StoreCloserThread-Yinzi,\x00\x00\x04[\x00\x00\x00\x00\x00\xEB\xF8\xAFV?\x07h\x0EQY\x82,1447416547629.acf35b28765dfe1eca89060f6ccbe1f8.-1]
regionserver.HStore: Closed Fam
2015-11-15 03:20:28,611 ERROR [RS_CLOSE_REGION-data05:16020-2] regionserver.HRegion: Memstore
size is 56591096
2015-11-15 03:20:28,611 INFO  [RS_CLOSE_REGION-data05:16020-2] regionserver.HRegion: Closed
Yinzi,\x00\x00\x04[\x00\x00\x00\x00\x00\xEB\xF8\xAFV?\x07h\x0EQY\x82,1447416547629.acf35b28765dfe1eca89060f6ccbe1f8.
2015-11-15 03:20:29,275 INFO  [StoreCloserThread-Yinzi,\x00\x00\x00\x01\x00\x00\x00\x00\x02\xB6\xFD\x94V@\xE9\xC1\xF0>c},1447378683791.eaa649ac827e79e8d749a9de08efa3c5.-1]
regionserver.HStore: Closed Fam



2015-11-15 03:20:32,229 ERROR [regionserver/data05/10.65.65.5:16020] wal.ProtobufLogWriter:
Got IOException while writing trailer
java.io.IOException: All datanodes 10.65.65.5:50010 are bad. Aborting...
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1137)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:933)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:487)
2015-11-15 03:20:32,243 ERROR [regionserver/data05/10.65.65.5:16020] regionserver.HRegionServer:
Shutdown / close of WAL failed: java.io.IOException: All datanodes 10.65.65.5:50010 are bad.
Aborting...
2015-11-15 03:20:32,343 INFO  [regionserver/data05/10.65.65.5:16020] regionserver.Leases:
regionserver/data05/10.65.65.5:16020 closing leases
2015-11-15 03:20:32,344 INFO  [regionserver/data05/10.65.65.5:16020] regionserver.Leases:
regionserver/data05/10.65.65.5:16020 closed leases
2015-11-15 03:20:32,344 INFO  [regionserver/data05/10.65.65.5:16020] hbase.ChoreService: Chore
service for: data05,16020,1447378673058 had [[ScheduledChore: Name: MovedRegionsCleaner for
region data05,16020,1447378673058 Period: 120000 Unit: MILLISECONDS], [ScheduledChore: Name:
data05,16020,1447378673058-MemstoreFlusherChore Period: 10000 Unit: MILLISECONDS]] on shutdown
2015-11-15 03:20:33,212 INFO  [RS_OPEN_META-data05:16020-0-MetaLogRoller] regionserver.LogRoller:
LogRoller exiting.
2015-11-15 03:20:33,220 INFO  [regionserver/data05/10.65.65.5:16020] client.ConnectionManager$HConnectionImplementation:
Closing zookeeper sessionid=0x150f723dc5c0292
2015-11-15 03:20:33,226 INFO  [regionserver/data05/10.65.65.5:16020] zookeeper.ZooKeeper:
Session: 0x150f723dc5c0292 closed
2015-11-15 03:20:33,226 INFO  [regionserver/data05/10.65.65.5:16020] ipc.RpcServer: Stopping
server on 16020
2015-11-15 03:20:33,226 INFO  [RpcServer.listener,port=16020] ipc.RpcServer: RpcServer.listener,port=16020:
stopping
2015-11-15 03:20:33,227 INFO  [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopped
2015-11-15 03:20:33,227 INFO  [RpcServer.responder] ipc.RpcServer: RpcServer.responder: stopping
2015-11-15 03:20:33,228 INFO  [regionserver/data05/10.65.65.5:16020-EventThread] zookeeper.ClientCnxn:
EventThread shut down
2015-11-15 03:20:33,274 INFO  [regionserver/data05/10.65.65.5:16020] zookeeper.ZooKeeper:
Session: 0x250f723dc6002a6 closed
2015-11-15 03:20:33,274 INFO  [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
2015-11-15 03:20:33,274 INFO  [regionserver/data05/10.65.65.5:16020] regionserver.HRegionServer:
stopping server data05,16020,1447378673058; zookeeper connection closed.
2015-11-15 03:20:33,274 INFO  [regionserver/data05/10.65.65.5:16020] regionserver.HRegionServer:
regionserver/data05/10.65.65.5:16020 exiting
2015-11-15 03:20:33,298 ERROR [main] regionserver.HRegionServerCommandLine: Region server
exiting
java.lang.RuntimeException: HRegionServer Aborted
        at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:68)
        at org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:87)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:126)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:2641)
2015-11-15 03:20:33,299 INFO  [Thread-5] regionserver.ShutdownHook: Shutdown hook starting;
hbase.shutdown.hook=true; fsShutdownHook=org.apache.hadoop.fs.FileSystem$Cache$ClientFinalizer@8216547
2015-11-15 03:20:33,299 INFO  [Thread-5] regionserver.ShutdownHook: Starting fs shutdown hook
thread.
2015-11-15 03:20:33,305 INFO  [Thread-5] regionserver.ShutdownHook: Shutdown hook finished.
Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message