hbase-user mailing list archives

From: Jean-Daniel Cryans <jdcry...@apache.org>
Subject: Re: memory issue when inserting large data into existed large table
Date: Fri, 05 Mar 2010 18:30:29 GMT
The reason a lot of heap is still used after a job is that the memstores are
probably all filled and, given the number of regions you have per
region server, that can be a lot... though the upper limit is 40% of all
available heap (1.5GB in your case). Also, we ship HBase with the CMS
garbage collector by default, which uses more memory in order to
shorten major GC pauses.
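
To put numbers on it: 40% of a 1.5GB heap is roughly 600MB that the
memstores alone can hold on to, and the block cache defaults to about
another 20% (roughly 300MB, which matches the Max=298.725MB you can see in
the LruBlockCache line of your log). Your metrics dump also shows
storefileIndexSize=100 (that figure is in MB) of store file indexes kept
on heap, so around 1GB of the 1.5GB is spoken for before a compaction or
the GC overhead even comes into the picture.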

I would recommend:

- Use more heap, 4GB is good
- Compress your table. See http://wiki.apache.org/hadoop/UsingLzoCompression
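
For the heap, it's just a matter of bumping HBASE_HEAPSIZE in
conf/hbase-env.sh on the region servers; the value is in MB, so roughly:

  # 4GB heap per regionserver; CMS stays the default collector
  export HBASE_HEAPSIZE=4000

For compression, once LZO is installed per that wiki page, you can switch
the existing table's families over from the HBase shell, something along
these lines (table and family names taken from the logs further down):

  hbase> disable 'hbt2table2'
  hbase> alter 'hbt2table2', {NAME => 'queries', COMPRESSION => 'LZO'}
  hbase> enable 'hbt2table2'

New store files are written compressed as memstores flush, and the
existing files get rewritten compressed as they go through compactions.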

J-D

On Fri, Mar 5, 2010 at 12:45 AM, javier liang <javier.liang@gmail.com> wrote:
> Hi,
>
> How do you insert a large amount of data into an existing table?
>
>  We're using Hadoop 0.20.1 and HBase 0.20.3. We have a large table in HBase,
> and we want to insert a large amount of new data into it. We wrote a
> Map/Reduce program like the SampleUploader example of HBase to do this.
>
> The problem is that when running this Map/Reduce program, we see through the
> web UI of the HBase master that "usedHeap" keeps increasing, and after the
> Map/Reduce program finishes, the used heap on every regionserver doesn't
> decrease. Sometimes, for some regionserver, when the inserted data is large
> enough, "usedHeap" reaches "maxHeap", which causes an OutOfMemoryException
> and shuts down the regionserver.
>
> Our HBase cluster has 18 regionservers, each currently with 1.5G maxHeap.
> The Hadoop cluster has 36 nodes; every time the Map/Reduce program runs, it
> spawns 70 reduce tasks. Every reduce task uses BatchUpdate to insert data,
> the same as SampleUploader does.
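
(For context, a reducer along those lines -- the old mapred API writing
BatchUpdates through TableOutputFormat -- looks roughly like the minimal
sketch below. The class name and the column qualifier are made up; the
"queries" family is the one that shows up in the logs further down.)

  import java.io.IOException;
  import java.util.Iterator;

  import org.apache.hadoop.hbase.io.BatchUpdate;
  import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
  import org.apache.hadoop.hbase.mapred.TableReduce;
  import org.apache.hadoop.hbase.util.Bytes;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapred.MapReduceBase;
  import org.apache.hadoop.mapred.OutputCollector;
  import org.apache.hadoop.mapred.Reporter;

  // Builds one BatchUpdate per row key; TableOutputFormat commits it to HBase.
  public class UploadReduce extends MapReduceBase
      implements TableReduce<Text, Text> {
    public void reduce(Text key, Iterator<Text> values,
        OutputCollector<ImmutableBytesWritable, BatchUpdate> output,
        Reporter reporter) throws IOException {
      BatchUpdate bu = new BatchUpdate(key.toString());
      while (values.hasNext()) {
        // column is "family:qualifier"; all values land in the same row
        bu.put("queries:data", Bytes.toBytes(values.next().toString()));
      }
      output.collect(new ImmutableBytesWritable(bu.getRow()), bu);
    }
  }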
>
> I'm wondering what is using up the heap and why that memory is never freed.
>
> From the regionserver logs, we see that a lot of compaction work is done as
> we keep inserting rows into the existing table. Are the compactions causing
> the problem?
>
> What's worse, sometimes we fail to restart HBase after many regionservers
> shut down, because no regionserver can compact a big region. When they try,
> usedHeap bursts up to maxHeap, an OutOfMemoryException appears in the log,
> and the regionserver shuts down again. Two pieces of log about this are
> listed below.
>
> Can compacting one region really cost more than 1GB of memory?
>
> Following is the regionserver log from a failed HBase start; it looks like
> it failed while compacting a big region (with about 1.5 million columns):
> ..........
> 2010-03-05 07:09:43,010 DEBUG
> org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction
> requested for region hbt2table2,all_2010/01/18,1267031918513/1022488138
> because: Region has too many store files
> 2010-03-05 07:09:43,011 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> Starting compaction on region hbt2table2,all_2010/01/18,1267031918513
> 2010-03-05 07:09:43,063 DEBUG org.apache.hadoop.hbase.regionserver.Store:
> Major compaction triggered on store queries; time since last major
> compaction 171705405ms
> 2010-03-05 07:09:43,078 DEBUG org.apache.hadoop.hbase.regionserver.Store:
> Started compaction of 7 file(s)  into
> /user/ccenterq/hbase/hbt2table2/compaction.dir/1022488138, seqid=1851247058
> 2010-03-05 07:09:44,041 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN:
> word_in_query,u=http\x25253a\x25252f\x25252fwww.youtube.com\x25252fwatch\x25253fv\x25253dzqo3fwqeyz0
> a 2010/02/04,1267692884527
> 2010-03-05 07:09:44,041 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: MSG_REGION_OPEN:
> word_in_query,www.audacity.com a 2010/02/09,1267692162765
> ......
> 2010-03-05 07:09:51,620 INFO org.apache.hadoop.hbase.regionserver.HRegion:
> region hbt2table2,all_2010/02/10,1267094503597/1803361418 available;
> sequence id is 1374188600
> 2010-03-05 07:10:46,900 DEBUG
> org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes:
> Total=1.402832MB (1470976), Free=297.32217MB (311764896), Max=298.725MB
> (313235872), Counts: Blocks=1, Access=17397, Hit=35, Miss=17362,
> Evictions=0, Evicted=0, Ratios: Hit Ratio=0.20118411630392075%, Miss
> Ratio=99.79881644248962%, Evicted/Run=NaN
> 2010-03-05 07:11:26,897 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics:
> request=0.0, regions=359, stores=359, storefiles=558,
> storefileIndexSize=100, memstoreSize=0, compactionQueueSize=0,
> usedHeap=1493, maxHeap=1493, blockCacheSize=1470976,
> blockCacheFree=311764896, blockCacheCount=1, blockCacheHitRatio=0,
> fsReadLatency=0, fsWriteLatency=0, fsSyncLatency=0
> 2010-03-05 07:11:29,474 FATAL
> org.apache.hadoop.hbase.regionserver.HRegionServer: Set stop flag in
> regionserver/10.76.16.90:60020.compactor
> java.lang.OutOfMemoryError: Java heap space
>    at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:39)
>    at java.nio.ByteBuffer.allocate(ByteBuffer.java:312)
>    at
> org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:1019)
>    at
> org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:971)
>    at
> org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.next(HFile.java:1163)
>    at
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:58)
>    at
> org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:79)
>    at
> org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:189)
>    at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:893)
>    at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:756)
>    at
> org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:783)
>    at
> org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:736)
>    at
> org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:93)
> 2010-03-05 07:11:29,485 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
> server on 60020
> 2010-03-05 07:11:29,500 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 10 on 60020: exiting
> .........
>
> Thanks in advance for your time!
>
