hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ted Yu (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2506) Too easy to OOME a RS
Date Sun, 28 Nov 2010 03:30:38 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12964493#action_12964493
] 

Ted Yu commented on HBASE-2506:
-------------------------------

The original intention of my suggestion was to avoid region server over-sleeping due to long
GC pause.
>From our staging cluster, I found there was a threshold for number of mappers per node
(we run map/reduce along side region servers) above which some region server(s) would sleep
due to long GC pause.
I think the load balancer should do the following in order to alleviate GC pause:
. keep track of most frequently accessed regions so that they can spread/move to more region
servers
. consider memory pressure when making region move decision


> Too easy to OOME a RS
> ---------------------
>
>                 Key: HBASE-2506
>                 URL: https://issues.apache.org/jira/browse/HBASE-2506
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Jean-Daniel Cryans
>            Priority: Blocker
>             Fix For: 0.92.0
>
>
> Testing a cluster with 1GB heap, I found that we are letting the region servers kill
themselves too easily when scanning using pre-fetching. To reproduce, get 10-20M rows using
PE and run a count in the shell using CACHE => 30000 or any other very high number. For
good measure, here's the stack trace:
> {code}
> 2010-04-30 13:20:23,241 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: OutOfMemoryError,
aborting.
> java.lang.OutOfMemoryError: Java heap space
>         at java.util.Arrays.copyOf(Arrays.java:2786)
>         at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
>         at java.io.DataOutputStream.write(DataOutputStream.java:90)
>         at org.apache.hadoop.hbase.client.Result.writeArray(Result.java:478)
>         at org.apache.hadoop.hbase.io.HbaseObjectWritable.writeObject(HbaseObjectWritable.java:312)
>         at org.apache.hadoop.hbase.io.HbaseObjectWritable.write(HbaseObjectWritable.java:229)
>         at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:941)
> 2010-04-30 13:20:23,241 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump
of metrics: request=0.0, regions=29, stores=29, storefiles=44, storefileIndexSize=6, memstoreSize=255,
>  compactionQueueSize=0, usedHeap=926, maxHeap=987, blockCacheSize=1700064, blockCacheFree=205393696,
blockCacheCount=0, blockCacheHitRatio=0
> {code}
> I guess the same could happen with largish write buffers. We need something better than
OOME.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message