hbase-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stack <st...@duboce.net>
Subject Re: why region server exited ?
Date Tue, 04 Sep 2012 04:30:13 GMT
Please say what version of hbase and pastebin the log rather than copy
it into mail.

Looking at the below it seems like we are missing actual reason server
aborted.  This is telling '....after 32254 ms, since caller
disconnected...' seemingly saying the server halted -- GCing? -- but
there is no  definitive in the below log that I can see.

Thanks,
St.Ack

On Mon, Sep 3, 2012 at 8:44 PM, ablozhou@gmail.com <ablozhou@gmail.com> wrote:
> log:
> ...
>
> 2012-09-01 20:20:25,672 WARN org.apache.hadoop.ipc.HBaseServer:
> (operationTooSlow):
> {"processingtimems":11637,"client":"10.79.48.253:54813","timeRange":[0,9223372036854775807],"starttimems":1346502014035,"responsesize":86246,"class":"HRegionServer","table":"tsdb.thrift","cacheBlocks":true,"families":{"t":["ALL"]},"row":"\\x00\\x00\\x0DPA\\xEA\\xB0\\x00\\x00\\x01\\x00\\x00'","queuetimems":115189,"method":"get","totalColumns":1,"maxVersions":1}
> 2012-09-01 20:20:25,673 WARN org.apache.hadoop.ipc.HBaseServer:
> (operationTooSlow):
> {"processingtimems":10152,"client":"10.79.48.253:54813","timeRange":[0,9223372036854775807],"starttimems":1346502010751,"responsesize":86202,"class":"HRegionServer","table":"tsdb.thrift","cacheBlocks":true,"families":{"t":["ALL"]},"row":"\\x00\\x00\\x0DPA\\xEA\\xB0\\x00\\x00\\x01\\x00\\x00%","queuetimems":119166,"method":"get","totalColumns":1,"maxVersions":1}
> 2012-09-01 20:20:38,338 DEBUG
> org.apache.hadoop.hbase.util.CompoundBloomFilterWriter: Compacted
> Bloom chunk #0 from [109306 max keys, 131072 bytes] to [13663 max
> keys, 16384 bytes]
> 2012-09-01 20:20:41,874 INFO
> org.apache.hadoop.hbase.regionserver.StoreFile: Bloom added to HFile
> (hdfs://gaia/hbase/tsdb.thrift/9ca866b99b8b6739855179e69c4dd598/.tmp/9832ee37fb82447c87c7cd7066ead94a):
> org.apache.hadoop.hbase.util.CompoundBloomFilterWriter@5f55ace7
> 2012-09-01 20:20:41,902 INFO
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader: Loaded ROW
> CompoundBloomFilter metadata for 9832ee37fb82447c87c7cd7066ead94a
> 2012-09-01 20:20:41,903 DEBUG
> org.apache.hadoop.hbase.regionserver.Store: Renaming flushed file at
> hdfs://gaia/hbase/tsdb.thrift/9ca866b99b8b6739855179e69c4dd598/.tmp/9832ee37fb82447c87c7cd7066ead94a
> to hdfs://gaia/hbase/tsdb.thrift/9ca866b99b8b6739855179e69c4dd598/t/9832ee37fb82447c87c7cd7066ead94a
> 2012-09-01 20:20:58,362 INFO
> org.apache.hadoop.hbase.regionserver.StoreFile$Reader: Loaded ROW
> CompoundBloomFilter metadata for 9832ee37fb82447c87c7cd7066ead94a
> 2012-09-01 20:20:58,383 WARN
> org.apache.hadoop.hbase.regionserver.wal.HLog: IPC Server handler 9 on
> 60020 took 2631 ms appending an edit to hlog; editcount=13680,
> len~=59.9k
> 2012-09-01 20:20:58,384 INFO
> org.apache.hadoop.hbase.regionserver.Store: Added
> hdfs://gaia/hbase/tsdb.thrift/9ca866b99b8b6739855179e69c4dd598/t/9832ee37fb82447c87c7cd7066ead94a,
> entries=530086, sequenceid=33741681, memsize=109.2m, filesize=18.9m
> 2012-09-01 20:21:02,761 INFO
> org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush
> of ~109.2m/114498576, currentsize=36.2m/38005632 for region
> tsdb.thrift,\x00\x00\x07PA\xDC\xA0\x00\x00\x01\x00\x10\x0D,1346497651829.9ca866b99b8b6739855179e69c4dd598.
> in 283731ms, sequenceid=33741681, compaction requested=true
> 2012-09-01 20:21:02,762 DEBUG
> org.apache.hadoop.hbase.regionserver.Store: Skipped compaction of t.
> Only 2 file(s) of size 33.0m have met compaction criteria.
> 2012-09-01 20:21:02,762 DEBUG
> org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush
> for tsdb.thrift,\x00\x00\x07PA\xDC\xA0\x00\x00\x01\x00\x10\x0D,1346497651829.9ca866b99b8b6739855179e69c4dd598.,
> current region memstore size 36.2m
> 2012-09-01 20:21:02,762 WARN org.apache.hadoop.ipc.HBaseServer:
> (operationTooSlow):
> {"processingtimems":10541,"client":"10.79.48.253:54813","timeRange":[0,9223372036854775807],"starttimems":1346502045188,"responsesize":87654,"class":"HRegionServer","table":"tsdb.thrift","cacheBlocks":true,"families":{"t":["ALL"]},"row":"\\x00\\x00\\x0DPA\\xEA\\xB0\\x00\\x00\\x01\\x00\\x00\\xE2","queuetimems":14844,"method":"get","totalColumns":1,"maxVersions":1}
> 2012-09-01 20:21:02,762 WARN org.apache.hadoop.ipc.HBaseServer:
> (operationTooSlow):
> {"processingtimems":17570,"client":"10.79.48.253:54813","timeRange":[0,9223372036854775807],"starttimems":1346502045190,"responsesize":87654,"class":"HRegionServer","table":"tsdb.thrift","cacheBlocks":true,"families":{"t":["ALL"]},"row":"\\x00\\x00\\x0DPA\\xEA\\xB0\\x00\\x00\\x01\\x00\\x00\\xE3","queuetimems":14840,"method":"get","totalColumns":1,"maxVersions":1}
> 2012-09-01 20:21:02,762 WARN org.apache.hadoop.ipc.HBaseServer:
> (operationTooSlow):
> {"processingtimems":13709,"client":"10.79.48.253:54813","timeRange":[0,9223372036854775807],"starttimems":1346502049052,"responsesize":87830,"class":"HRegionServer","table":"tsdb.thrift","cacheBlocks":true,"families":{"t":["ALL"]},"row":"\\x00\\x00\\x0DPA\\xEA\\xB0\\x00\\x00\\x01\\x00\\x00\\xE5","queuetimems":18699,"method":"get","totalColumns":1,"maxVersions":1}
> 2012-09-01 20:21:02,762 WARN org.apache.hadoop.ipc.HBaseServer:
> (operationTooSlow):
> {"processingtimems":10544,"client":"10.79.48.253:54813","timeRange":[0,9223372036854775807],"starttimems":1346502045186,"responsesize":87874,"class":"HRegionServer","table":"tsdb.thrift","cacheBlocks":true,"families":{"t":["ALL"]},"row":"\\x00\\x00\\x0DPA\\xEA\\xB0\\x00\\x00\\x01\\x00\\x00\\xE0","queuetimems":14844,"method":"get","totalColumns":1,"maxVersions":1}
> 2012-09-01 20:21:02,762 WARN org.apache.hadoop.ipc.HBaseServer:
> (operationTooSlow):
> {"processingtimems":13196,"client":"10.79.48.253:54813","timeRange":[0,9223372036854775807],"starttimems":1346502045187,"responsesize":87786,"class":"HRegionServer","table":"tsdb.thrift","cacheBlocks":true,"families":{"t":["ALL"]},"row":"\\x00\\x00\\x0DPA\\xEA\\xB0\\x00\\x00\\x01\\x00\\x00\\xE1","queuetimems":14844,"method":"get","totalColumns":1,"maxVersions":1}
> 2012-09-01 20:21:02,762 WARN org.apache.hadoop.ipc.HBaseServer:
> (operationTooSlow):
> {"processingtimems":10432,"client":"10.79.48.253:54813","timeRange":[0,9223372036854775807],"starttimems":1346502052329,"responsesize":87742,"class":"HRegionServer","table":"tsdb.thrift","cacheBlocks":true,"families":{"t":["ALL"]},"row":"\\x00\\x00\\x0DPA\\xEA\\xB0\\x00\\x00\\x01\\x00\\x00\\xE6","queuetimems":21975,"method":"get","totalColumns":1,"maxVersions":1}
> 2012-09-01 20:21:02,781 WARN org.apache.hadoop.ipc.HBaseServer:
> (operationTooSlow):
> {"processingtimems":17574,"client":"10.79.48.253:54813","timeRange":[0,9223372036854775807],"starttimems":1346502045190,"responsesize":87786,"class":"HRegionServer","table":"tsdb.thrift","cacheBlocks":true,"families":{"t":["ALL"]},"row":"\\x00\\x00\\x0DPA\\xEA\\xB0\\x00\\x00\\x01\\x00\\x00\\xE4","queuetimems":14839,"method":"get","totalColumns":1,"maxVersions":1}
> 2012-09-01 20:21:02,797 DEBUG
> org.apache.hadoop.hbase.regionserver.HRegion: Finished snapshotting
> tsdb.thrift,\x00\x00\x07PA\xDC\xA0\x00\x00\x01\x00\x10\x0D,1346497651829.9ca866b99b8b6739855179e69c4dd598.,
> commencing wait for mvcc, flushsize=38635920
> 2012-09-01 20:21:10,570 DEBUG
> org.apache.hadoop.hbase.io.hfile.HFileWriterV2: Initialized with
> CacheConfig:enabled [cacheDataOnRead=true] [cacheDataOnWrite=false]
> [cacheIndexesOnWrite=false] [cacheBloomsOnWrite=false]
> [cacheEvictOnClose=false] [cacheCompressed=false]
> 2012-09-01 20:21:10,570 INFO
> org.apache.hadoop.hbase.regionserver.StoreFile: Bloom filter type for
> hdfs://gaia/hbase/tsdb.thrift/9ca866b99b8b6739855179e69c4dd598/.tmp/2464ffa2d4aa4644aaace1c217ee634c:
> ROW, CompoundBloomFilterWriter
> 2012-09-01 20:21:10,571 WARN org.apache.hadoop.ipc.HBaseServer: IPC
> Server Responder, call get([B@81dd038,
> {"timeRange":[0,9223372036854775807],"totalColumns":1,"cacheBlocks":true,"families":{"t":["ALL"]},"maxVersions":1,"row":"\\x00\\x00\\x0CPA\\xEA\\xB0\\x00\\x00\\x01\\x00\\x03`"}),
> rpc version=1, client version=29, methodsFingerPrint=0 from
> 10.79.48.253:54813: output error
> 2012-09-01 20:21:10,572 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> org.apache.hadoop.hbase.ipc.CallerDisconnectedException: Aborting call
> get([B@13b33a26,
> {"timeRange":[0,9223372036854775807],"totalColumns":1,"cacheBlocks":true,"families":{"t":["ALL"]},"maxVersions":1,"row":"\\x00\\x00\\x0DPA\\xEA\\xB0\\x00\\x00\\x01\\x00\\x01&"}),
> rpc version=1, client version=29, methodsFingerPrint=0 from
> 10.79.48.253:54813 after 32254 ms, since caller disconnected
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Call.throwExceptionIfCallerDisconnected(HBaseServer.java:406)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:2960)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:2925)
>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:2942)
>         at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:3747)
>         at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:3639)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1785)
>         at sun.reflect.GeneratedMethodAccessor93.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1336)
>
>
> the client is async hbase of opentsdb.
>
> async hbase log:
>
> 012-09-01 20:20:30,768 ERROR [New I/O  worker #18] RegionClient:
> Unexpected exception from downstream on [id: 0x0c2f48d2,
> /10.79.48.253:54813 => /10.79.48.207:60020]
> 2012-09-01 20:22:28,602 ERROR [New I/O  worker #18] RegionClient:
> After decoding the last message on [id: 0x0c2f48d2,
> /10.79.48.253:54813 :> /10.79.48.207:60020], there was still some
> undecoded bytes in the channel's buffer (which are going to be lost)
>
>
> but async hbase is still alive.
>
>
> any suggestions?
>
> thanks!
>
> andy

Mime
View raw message