hbase-user mailing list archives

From Wayne <wav...@gmail.com>
Subject JVM OOM
Date Wed, 05 Jan 2011 16:10:18 GMT
I am still struggling with the JVM. We just had a hard OOM crash of a region
server after running for only 36 hours. Any help would be greatly
appreciated. Do we need to restart nodes every 24 hours under load? GC
pauses are something we are trying to plan for, but full-out OOM crashes are
a new problem.

The message below seems to be where it starts going bad. It is followed by
no fewer than 63 Concurrent Mode Failure errors over a 16-minute period.

*GC locker: Trying a full collection because scavenge failed*

Lastly, here is the end of the log (after the 63 CMF errors).

Heap
 par new generation   total 1887488K, used 303212K [0x00000005fae00000, 0x000000067ae00000, 0x000000067ae00000)
  eden space 1677824K,  18% used [0x00000005fae00000, 0x000000060d61b078, 0x0000000661480000)
  from space 209664K,   0% used [0x000000066e140000, 0x000000066e140000, 0x000000067ae00000)
  to   space 209664K,   0% used [0x0000000661480000, 0x0000000661480000, 0x000000066e140000)
 concurrent mark-sweep generation total 6291456K, used 2440155K [0x000000067ae00000, 0x00000007fae00000, 0x00000007fae00000)
 concurrent-mark-sweep perm gen total 31704K, used 18999K [0x00000007fae00000, 0x00000007fccf6000, 0x0000000800000000)
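
To make the heap summary above easier to read, here is a minimal sketch that turns the printed totals into occupancy percentages. It uses only the numbers from the dump; the dictionary layout and the function name occupancy_percent are made up for illustration.

```python
# Figures copied from the heap summary above, in KB: (total, used).
HEAP_SUMMARY_KB = {
    "par new generation": (1887488, 303212),
    "eden space": (1677824, 303212),  # all young-gen usage is in eden (survivors are 0%)
    "concurrent mark-sweep generation": (6291456, 2440155),
    "concurrent-mark-sweep perm gen": (31704, 18999),
}


def occupancy_percent(total_kb, used_kb):
    """Return used space as a percentage of the total."""
    return 100.0 * used_kb / total_kb


if __name__ == "__main__":
    for name, (total_kb, used_kb) in HEAP_SUMMARY_KB.items():
        print(f"{name}: {occupancy_percent(total_kb, used_kb):.1f}% used")
```

By this arithmetic the CMS (old) generation is a bit under 40% full at the point the summary was printed, which is below the 60% initiating threshold configured further down. That is only a snapshot at exit and says nothing about how full it was during the concurrent mode failures.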

Here again are our custom settings in case there are some suggestions out
there. Are we making it worse with these settings? What should we try next?

        -XX:+UseCMSInitiatingOccupancyOnly
        -XX:CMSInitiatingOccupancyFraction=60
        -XX:+CMSParallelRemarkEnabled
        -XX:SurvivorRatio=8
        -XX:NewRatio=3
        -XX:MaxTenuringThreshold=1
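
As a sanity check on how these flags carve up the heap, the sketch below recomputes the generation sizes from NewRatio, SurvivorRatio and CMSInitiatingOccupancyFraction. The total heap size is an assumption inferred from the heap summary (eden + two survivor spaces + CMS generation = 8388608K); it is not stated in the flags listed here. The function names are hypothetical, and the JVM rounds sizes to its own alignment, so the real figures differ slightly from this arithmetic.

```python
def generation_sizes_kb(heap_kb, new_ratio, survivor_ratio):
    """Approximate generation sizes implied by NewRatio and SurvivorRatio.

    NewRatio=N      -> old : young  = N : 1
    SurvivorRatio=S -> eden : one survivor space = S : 1 (two survivor spaces)
    """
    young_kb = heap_kb // (new_ratio + 1)
    old_kb = heap_kb - young_kb
    survivor_kb = young_kb // (survivor_ratio + 2)
    eden_kb = young_kb - 2 * survivor_kb
    return {
        "young_kb": young_kb,
        "old_kb": old_kb,
        "eden_kb": eden_kb,
        "survivor_kb": survivor_kb,
    }


def cms_trigger_kb(old_kb, occupancy_fraction_percent):
    """Old-gen usage at which a CMS cycle starts when
    UseCMSInitiatingOccupancyOnly is set."""
    return old_kb * occupancy_fraction_percent // 100


if __name__ == "__main__":
    # Assumption: total heap inferred from the heap summary, in KB.
    heap_kb = 1677824 + 2 * 209664 + 6291456  # 8388608K, i.e. 8 GB
    sizes = generation_sizes_kb(heap_kb, new_ratio=3, survivor_ratio=8)
    print(sizes)
    print("CMS starts at about", cms_trigger_kb(sizes["old_kb"], 60), "KB of old gen")
```

The computed young and old sizes match the dump exactly (2097152K and 6291456K), and the survivor and eden sizes land within about 100K of the reported 209664K and 1677824K. With MaxTenuringThreshold=1, objects that survive a single young collection are promoted, so the roughly 40% of the old generation above the 60% trigger is the headroom CMS has to finish a cycle before promotions fill the space.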


Thanks!
