hbase-user mailing list archives

From leith <ele...@diffbot.com>
Subject Re: compaction thread always hitting exceptions
Date Wed, 23 Jul 2008 00:12:16 GMT
stack wrote:
> Thanks for writing this note.  I'd seen the IOE and presumed it a known 
> issue in hadoop 0.17.  I then went looking for it but couldn't find 
> one.  Turns out its a silly issue in hbase (HBASE-761).
> 
> How many regions you have loaded up in your regionserver?
> 
> You are using default heap size?  You might try upping it a little.  
> Also make sure you have upped open file descriptors limit too (See #4 
> and #6 in the FAQ, http://wiki.apache.org/hadoop/Hbase/FAQ).
> 

thanks St.Ack,

we've rerun our tests, uncommented HEAP_SIZE=1000 in our hbase-env, and have 
updated trunk to the latest. we haven't hit an OOME so far, and there's no sign of 
those compaction errors anymore (thanks!).
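for reference, the heap and file-descriptor changes look roughly like this. this is a sketch of our setup, not a recommendation: the variable name and values are assumptions based on our hbase-env.sh and FAQ #4/#6, so verify them against your own tree.

```shell
# hbase-env.sh: uncomment/raise the heap given to hbase daemons, in MB.
# HBASE_HEAPSIZE is the variable in our copy of hbase-env.sh; 1000 is what we set.
export HBASE_HEAPSIZE=1000

# per FAQ #6: raise the open-file-descriptor limit for the user running hbase,
# e.g. via /etc/security/limits.conf (user name and values are illustrative):
#   hbase  soft  nofile  32768
#   hbase  hard  nofile  32768

# after re-logging in, verify the new limit took effect:
ulimit -n
```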

we have 5 regions loaded in our regionserver.

our test hammers hbase while importing a batch of files into our hbase tables. 
what we see now is that hbase blocks updates such that nothing gets imported, even 
though everything is up and running (hdfs namenode/datanodes, hbase master/regionserver).

below are the logs we see from our hbase regionserver and from the hdfs datanode:

----------------------------------
hbase regionserver
---------------------------------
2008-07-22 16:37:10,419 INFO org.apache.hadoop.hbase.regionserver.HStore: 
Compaction size 396381, skipped 1, 315153
2008-07-22 16:37:10,672 INFO org.apache.hadoop.hbase.regionserver.HStore: 
Compaction size 258344, skipped 1, 204628
2008-07-22 16:37:10,856 INFO org.apache.hadoop.hbase.regionserver.HStore: 
Compaction size 342049, skipped 1, 271105
2008-07-22 16:37:11,080 INFO org.apache.hadoop.hbase.regionserver.HStore: 
Compaction size 237573, skipped 1, 188148
2008-07-22 16:37:11,280 INFO org.apache.hadoop.hbase.regionserver.HStore: 
Compaction size 78030028, skipped 1, 39326312
2008-07-22 16:37:33,464 INFO org.apache.hadoop.hbase.regionserver.HStore: 
Compaction size 366849, skipped 1, 291596
2008-07-22 16:37:33,928 INFO org.apache.hadoop.hbase.regionserver.HStore: 
Compaction size 984138, skipped 2, 897309
2008-07-22 16:37:34,201 INFO org.apache.hadoop.hbase.regionserver.HStore: 
Compaction size 67294, skipped 1, 51834
2008-07-22 16:37:34,643 INFO org.apache.hadoop.hbase.regionserver.HStore: 
Compaction size 45569, skipped 2, 41020
2008-07-22 16:37:35,061 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
compaction completed on region dmls,,1216768730386 in 25sec
2008-07-22 16:38:00,151 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
starting compaction on region dmls,,1216768730386
2008-07-22 16:38:00,157 INFO org.apache.hadoop.hbase.regionserver.HStore: 
Compaction size 413508, skipped 2, 396083
2008-07-22 16:38:00,502 INFO org.apache.hadoop.hbase.regionserver.HStore: 
Compaction size 269288, skipped 2, 258024
2008-07-22 16:38:00,789 INFO org.apache.hadoop.hbase.regionserver.HStore: 
Compaction size 356584, skipped 2, 341743
2008-07-22 16:38:01,032 INFO org.apache.hadoop.hbase.regionserver.HStore: 
Compaction size 247637, skipped 2, 237273
2008-07-22 16:38:01,337 INFO org.apache.hadoop.hbase.regionserver.HStore: 
Compaction size 83129653, skipped 0, 0
2008-07-22 16:38:42,495 INFO org.apache.hadoop.hbase.regionserver.HStore: 
Compaction size 382791, skipped 2, 366533
2008-07-22 16:38:42,724 INFO org.apache.hadoop.hbase.regionserver.HStore: 
Compaction size 1024454, skipped 2, 897309
2008-07-22 16:38:42,959 INFO org.apache.hadoop.hbase.regionserver.HStore: 
Compaction size 70464, skipped 2, 66950
2008-07-22 16:38:43,377 INFO org.apache.hadoop.hbase.regionserver.HStore: 
Compaction size 47891, skipped 2, 41020
2008-07-22 16:38:43,716 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
compaction completed on region dmls,,1216768730386 in 43sec
2008-07-22 16:38:53,954 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Blocking updates for 'IPC Server handler 4 on 60020' on region 
dmls,,1216768730386: Memcache size 64.0m is >= than blocking 64.0m size
2008-07-22 16:40:03,931 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Blocking updates for 'IPC Server handler 9 on 60020' on region 
dmls,,1216768730386: Memcache size 64.0m is >= than blocking 64.0m size
2008-07-22 16:41:13,943 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Blocking updates for 'IPC Server handler 1 on 60020' on region 
dmls,,1216768730386: Memcache size 64.0m is >= than blocking 64.0m size
2008-07-22 16:42:23,955 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Blocking updates for 'IPC Server handler 0 on 60020' on region 
dmls,,1216768730386: Memcache size 64.0m is >= than blocking 64.0m size
2008-07-22 16:43:33,968 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Blocking updates for 'IPC Server handler 2 on 60020' on region 
dmls,,1216768730386: Memcache size 64.0m is >= than blocking 64.0m size
2008-07-22 16:44:34,095 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Blocking updates for 'IPC Server handler 5 on 60020' on region 
dmls,,1216768730386: Memcache size 64.0m is >= than blocking 64.0m size
2008-07-22 16:45:44,109 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Blocking updates for 'IPC Server handler 3 on 60020' on region 
dmls,,1216768730386: Memcache size 64.0m is >= than blocking 64.0m size
2008-07-22 16:46:54,122 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Blocking updates for 'IPC Server handler 8 on 60020' on region 
dmls,,1216768730386: Memcache size 64.0m is >= than blocking 64.0m size
2008-07-22 16:48:04,135 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Blocking updates for 'IPC Server handler 6 on 60020' on region 
dmls,,1216768730386: Memcache size 64.0m is >= than blocking 64.0m size
2008-07-22 16:48:18,116 INFO org.apache.hadoop.hbase.regionserver.LogRoller: 
Rolling hlog. Number of entries: 26294
2008-07-22 16:48:18,133 INFO org.apache.hadoop.hbase.regionserver.HLog: New log 
writer created at /hbase/log_64.62.244.2_1216768695534_60020/hlog.dat.1216770498127
2008-07-22 16:49:14,165 INFO org.apache.hadoop.hbase.regionserver.HRegion: 
Blocking updates for 'IPC Server handler 7 on 60020' on region 
dmls,,1216768730386: Memcache size 64.0m is >= than blocking 64.0m size

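the 'Blocking updates' lines above look like the region server throttling writers once a region's memcache reaches its blocking size. if that's the bottleneck during imports, the knob is presumably the per-region memcache flush threshold in hbase-site.xml. the property name and default below are assumptions taken from the hbase-default.xml of this era, so please double-check before relying on them:

```xml
<!-- hbase-site.xml: raise the per-region memcache flush threshold so heavy
     imports hit the blocking limit less often (assumed default is 64 MB). -->
<property>
  <name>hbase.hregion.memcache.flush.size</name>
  <value>134217728</value> <!-- 128 MB; illustrative value, in bytes -->
</property>
```

the trade-off is more heap held in memcache before a flush, so this only makes sense together with a larger HBASE_HEAPSIZE.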

---------------------------------------
hdfs datanode
--------------------------------------

2008-07-22 17:10:41,828 WARN org.apache.hadoop.dfs.DataNode: 
64.62.244.2:50010:Got exception while serving blk_-355911506373371046 to 
/64.62.244.2:
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcher.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
    at sun.nio.ch.IOUtil.write(IOUtil.java:75)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
    at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:53)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:140)
    at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:144)
    at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:105)
    at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
    at java.io.DataOutputStream.write(DataOutputStream.java:90)
    at org.apache.hadoop.dfs.DataNode$BlockSender.sendChunks(DataNode.java:1774)
    at org.apache.hadoop.dfs.DataNode$BlockSender.sendBlock(DataNode.java:1813)
    at org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:1039)
    at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:968)
    at java.lang.Thread.run(Thread.java:619)

2008-07-22 17:10:41,828 ERROR org.apache.hadoop.dfs.DataNode: 
64.62.244.2:50010:DataXceiver: java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcher.write0(Native Method)
    at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
    at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
    at sun.nio.ch.IOUtil.write(IOUtil.java:75)
    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
    at org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:53)
    at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:140)
    at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:144)
    at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:105)
    at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
    at java.io.DataOutputStream.write(DataOutputStream.java:90)
    at org.apache.hadoop.dfs.DataNode$BlockSender.sendChunks(DataNode.java:1774)
    at org.apache.hadoop.dfs.DataNode$BlockSender.sendBlock(DataNode.java:1813)
    at org.apache.hadoop.dfs.DataNode$DataXceiver.readBlock(DataNode.java:1039)
    at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:968)
    at java.lang.Thread.run(Thread.java:619)

2008-07-22 17:10:41,834 INFO org.apache.hadoop.dfs.DataNode: 64.62.244.2:50010 
Served block blk_8154627018748859939 to /64.62.244.2
2008-07-22 17:10:41,843 INFO org.apache.hadoop.dfs.DataNode: 64.62.244.2:50010 
Served block blk_6062812905615890844 to /64.62.244.2
2008-07-22 17:10:41,987 INFO org.apache.hadoop.dfs.DataNode: 64.62.244.2:50010 
Served block blk_-3883168343557890264 to /64.62.244.2



> St.Ack
> 
> leith wrote:
>> looking at our region logs, we've noticed that the compaction thread 
>> constantly runs into exceptions. the entire log is filled with 
>> something like this:
>>
>> ----------------------------------
>> 2008-07-22 12:29:52,759 WARN 
>> org.apache.hadoop.hbase.regionserver.HStore: Exception closing reader 
>> for 242866774/new
>> java.io.IOException: Stream closed
>>    at org.apache.hadoop.dfs.DFSClient$DFSInputStream.close(DFSClient.java:1319)
>>    at java.io.FilterInputStream.close(FilterInputStream.java:155)
>>    at org.apache.hadoop.io.SequenceFile$Reader.close(SequenceFile.java:1581)
>>    at org.apache.hadoop.io.MapFile$Reader.close(MapFile.java:577)
>>    at org.apache.hadoop.hbase.regionserver.HStore.closeCompactionReaders(HStore.java:917)
>>    at org.apache.hadoop.hbase.regionserver.HStore.compactHStoreFiles(HStore.java:910)
>>    at org.apache.hadoop.hbase.regionserver.HStore.compact(HStore.java:787)
>>    at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:887)
>>    at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:847)
>>    at org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:84)
>> -------------------------------------
>>
>> the regionserver is taking up a good amount of memory on our 
>> system, and nothing is happening except, we assume, compaction/split 
>> processes.
>>
>> these only seem to be warnings, but there are so many of them that it 
>> would be nice to get a second opinion on this.
>>
>> we've also gotten an 'out of memory' exception a few times from the 
>> compaction thread, and those actually ended up killing the thread, 
>> resulting in the region server shutting itself down.
>>
>> thanks,
>>
>> /leith
>>
> 
> 
