hadoop-mapreduce-user mailing list archives

From MONTMORY Alain <alain.montm...@thalesgroup.com>
Subject How to control JVM size used with Capacity Scheduler
Date Mon, 12 Nov 2012 18:30:50 GMT
Hello everybody,

We are using CapacityScheduler with Hadoop 0.20.2 (CDH3u3) on a cluster of 20 nodes, each
configured as follows:

ONE-NODE=16 core, 24 GB memory (AMD 4274), JVM HotSpot  1.7.0_05

We run scientific jobs as Map/Reduce tasks (using Cascading 1.2). We use CapacityScheduler
to prevent memory-hungry jobs from bringing the nodes down through excessive swap usage...

So we set up memory slots in the JobConf (one slot = 1.5 GB; Job1 = 1 slot, Job2 = 3 slots, etc.)
according to each program's needs.
In JobConf.xml we set JVM options in java.opts, such as -Xmx, to control (or try to control) the
memory actually used by the whole JVM process.
By default, on this kind of machine (see ONE-NODE above), the JVM starts 13 garbage collector
threads with 80 MB of memory per thread, and so reserves 13 x 80 MB = 1040 MB per JVM. We know
there are options to control both the number of GC threads and the memory per GC thread, and we
use them...
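
The figure of 13 GC threads on a 16-core node is consistent with HotSpot's default heuristic for ParallelGCThreads (every core up to 8, then 5/8 of each additional core). The following is a minimal Python sketch of that arithmetic. The 80 MB per-thread figure is taken from the description above and is treated as an assumption, not a measured value, and the function names are purely illustrative.

```python
def default_parallel_gc_threads(ncpus: int) -> int:
    """Approximate HotSpot's default ParallelGCThreads for a given core count.

    Assumed heuristic: all cores up to 8, then 5/8 of each core beyond 8.
    """
    if ncpus <= 8:
        return ncpus
    return 8 + (ncpus - 8) * 5 // 8


def gc_reservation_mb(gc_threads: int, mb_per_thread: int = 80) -> int:
    """Memory set aside for GC threads, using the per-thread figure from the post."""
    return gc_threads * mb_per_thread


threads = default_parallel_gc_threads(16)
print(threads, gc_reservation_mb(threads))  # prints "13 1040"
```

Under this model, lowering -XX:ParallelGCThreads to 2 would bring the reservation down to 160 MB, which is why the option looks like a natural first lever.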

Despite using the -XX:ParallelGCThreads=X and -XX:HeapSizePerGCThread=YY JVM options to
try to control the full JVM process size, some jobs fail because their Map and/or Reduce tasks
are killed by the Capacity Scheduler with messages like:
Task Tree [pid=xx, tipId=xx] is running beyond memory limit Current Usage=xxx. Limit=xxxx.
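
The limit in that message applies to the whole process tree of the task, not only to the Java heap, so the -Xmx value has to leave room for everything else the JVM allocates. Below is a rough Python sketch of such a budget check. It assumes the 1.5 GB slot described above and a deliberately simplified footprint model (heap + PermGen + thread stacks + the per-GC-thread reservation). A real JVM also uses memory for the code cache, direct buffers and other native allocations, so the class, its fields and all sample numbers are hypothetical placeholders.

```python
from dataclasses import dataclass

SLOT_MB = 1536  # one slot = 1.5 GB, as described in the post


@dataclass
class JvmSettings:
    """Hypothetical container for the JVM options discussed in the post."""
    xmx_mb: int            # -Xmx
    max_perm_mb: int       # -XX:MaxPermSize
    thread_count: int      # assumed number of Java threads in the task
    xss_kb: int            # -Xss
    gc_threads: int        # -XX:ParallelGCThreads
    gc_mb_per_thread: int  # per-GC-thread memory, as described in the post

    def estimated_footprint_mb(self) -> float:
        """Simplified estimate; ignores code cache and native allocations."""
        stacks_mb = self.thread_count * self.xss_kb / 1024
        gc_mb = self.gc_threads * self.gc_mb_per_thread
        return self.xmx_mb + self.max_perm_mb + stacks_mb + gc_mb


def fits_in_slots(settings: JvmSettings, slots: int) -> bool:
    """True if the estimated footprint stays within the job's slot allowance."""
    return settings.estimated_footprint_mb() <= slots * SLOT_MB


# Placeholder values: a 1 GB heap with the default 13 GC threads vs. 2 GC threads.
default_gc = JvmSettings(1024, 64, 50, 512, 13, 80)
tuned_gc = JvmSettings(1024, 64, 50, 512, 2, 80)
print(fits_in_slots(default_gc, 1), fits_in_slots(tuned_gc, 1))
```

With these placeholder values the default GC settings alone push a 1 GB heap past a single slot, while the tuned settings fit. Tasks that are still killed after such tuning suggest that memory outside this simple model is involved.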

Has anyone already dealt with this problem using CapacityScheduler, and does anyone know which
JVM options should be used to control the JVM process size? (We tried MaxPermSize and Xss with
no tangible result.)

Thank you for your response



