hadoop-common-user mailing list archives

From "Jinsong Hu" <jinsong...@hotmail.com>
Subject gc setting for hadoop
Date Mon, 22 Nov 2010 17:59:03 GMT
Hi there,
  I have been searching for good GC settings for the Hadoop namenode and 
datanode, and I currently use the following setting for both:

 -XX:NewSize=18m -XX:MaxNewSize=18m -XX:+HeapDumpOnOutOfMemoryError
 -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70
 -XX:+CMSClassUnloadingEnabled -XX:+DisableExplicitGC -XX:+UseCompressedOops
 -XX:+DoEscapeAnalysis -XX:+AggressiveOpts -verbose:gc -XX:+PrintGCDetails
 -XX:+PrintGCTimeStamps -Xmx3G -Dcom.sun.management.jmxremote.port=8004
 -Xloggc:/usr/lib/hadoop/logs/gc-namenode.log
 -XX:ParallelGCThreads=8 -XX:PermSize=256m -XX
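For reference, flags like these would typically be applied through conf/hadoop-env.sh; the variable names below (HADOOP_NAMENODE_OPTS, HADOOP_DATANODE_OPTS) are the standard per-daemon option variables in Hadoop of this era, but the exact layout here is a sketch, not part of the original message (the datanode log filename is an assumption):

```shell
# conf/hadoop-env.sh -- a minimal sketch, assuming a Hadoop 0.20-style install.
# GC_OPTS mirrors the flag list quoted above (minus the truncated trailing flag).
GC_OPTS="-XX:NewSize=18m -XX:MaxNewSize=18m \
  -XX:+HeapDumpOnOutOfMemoryError \
  -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 \
  -XX:+CMSClassUnloadingEnabled -XX:+DisableExplicitGC -XX:+UseCompressedOops \
  -XX:+DoEscapeAnalysis -XX:+AggressiveOpts \
  -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
  -Xmx3G -XX:ParallelGCThreads=8 -XX:PermSize=256m"

# Per-daemon options; each daemon gets its own gc log file
# (gc-datanode.log is an assumed name, by analogy with gc-namenode.log).
export HADOOP_NAMENODE_OPTS="$GC_OPTS \
  -Dcom.sun.management.jmxremote.port=8004 \
  -Xloggc:/usr/lib/hadoop/logs/gc-namenode.log $HADOOP_NAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="$GC_OPTS \
  -Xloggc:/usr/lib/hadoop/logs/gc-datanode.log $HADOOP_DATANODE_OPTS"
```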

The problem, however, is that I see frequent CMS GCs, about one every 100 
seconds, and roughly 2/3 of the GC log entries look like this:

 CMS: abort preclean due to time 795.999: [CMS-concurrent-abortable-preclean: 0.188/5.082 secs] [Times: user=0.15 sys=0.00, real=5.08 secs]
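Some context on that line (not from the original message): the abortable-preclean phase gives up after -XX:CMSMaxAbortablePrecleanTime milliseconds, which defaults to 5000 ms in HotSpot, matching the real=5.08 secs above. To check how often these events actually occur, a small awk sketch can compute the interval between consecutive abort messages in a -XX:+PrintGCTimeStamps log; the parsing pattern is based on the excerpt above:

```shell
# Sketch: print the interval in seconds between consecutive
# "CMS: abort preclean due to time <seconds>:" lines, read from stdin.
cms_intervals() {
    awk '
    /abort preclean due to time/ {
        # The timestamp is the field after "time", with a trailing colon.
        for (i = 1; i <= NF; i++)
            if ($i == "time") { t = $(i + 1); sub(/:$/, "", t) }
        if (prev != "") printf "interval: %.1f s\n", t - prev
        prev = t
    }'
}
```

Usage would be something like `cms_intervals < /usr/lib/hadoop/logs/gc-namenode.log`; intervals of ~100 s would confirm the frequency described above.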

I searched the internet, and some people say that is OK. However, I would 
like to see if anybody else has a good GC setting for Hadoop that produces 
less frequent GCs and works well.

The advantage of the above setting is that it controls memory growth well, 
and there are no sudden full GCs; the default setting without this tuning 
produces a very bad GC profile. If anybody else can share their GC tuning 
experience, that would be really appreciated.

