hadoop-hdfs-user mailing list archives

From Patrick Boenzli <patrick.boen...@soom-it.ch>
Subject Hadoop YARN 2.2.0 Streaming Memory Limitation
Date Mon, 24 Feb 2014 09:27:27 GMT
hello hadoop-users!

We are currently facing a frustrating hadoop streaming memory problem. our setup:

our compute nodes have about 7 GB of RAM
hadoop streaming starts a bash script which uses about 4 GB of RAM
therefore it is only possible to start one and only one task per node
out of the box each hadoop instance starts about 7 hadoop containers with default hadoop settings.
each hadoop task forks a bash script that needs about 4 GB of RAM. the first fork works, all
following forks fail because they run out of memory. so what we are looking for is a way to
limit the number of containers to only one per node. this is what we found on the internet:

yarn.scheduler.maximum-allocation-mb and mapreduce.map.memory.mb are set to values such that
there is at most one container per node. this means mapreduce.map.memory.mb must be more than
half of the maximum memory (otherwise there will be multiple containers).
done right, this gives us one container per node. but it produces a new problem: since our
java process is now using at least half of the max memory, our child (bash) process we fork
will inherit the parent memory footprint and since the memory used by our parent was more
than half of total memory, we run out of memory again. if we lower the map memory, hadoop
will allocate 2 containers per node, which will run out of memory too.
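
the container arithmetic described above can be sketched roughly as follows. this is only an illustration, not how the YARN scheduler is actually implemented. it assumes about 7 GB (7168 MB) of container memory per node, which in YARN is normally governed by yarn.nodemanager.resource.memory-mb (a property not mentioned above), and it assumes requests are rounded up to a multiple of yarn.scheduler.minimum-allocation-mb, taken here as 1024 MB. the function name containers_per_node is made up for this example.

```python
def containers_per_node(node_mb, request_mb, min_alloc_mb=1024):
    """Estimate how many map containers fit on one node.

    Assumption: the scheduler rounds each request up to a multiple of
    min_alloc_mb and packs containers until node memory is exhausted.
    """
    # Round the request up to the next multiple of the minimum allocation
    # (integer ceiling division, avoids floating point).
    granted_mb = -(-request_mb // min_alloc_mb) * min_alloc_mb
    return node_mb // granted_mb


node_mb = 7 * 1024  # about 7 GB per node, as in our setup (assumed value)

# Try a few values of mapreduce.map.memory.mb.
for request_mb in (1024, 3072, 3585, 4096):
    print(request_mb, "MB per map ->",
          containers_per_node(node_mb, request_mb), "container(s) per node")
```

under these assumptions, any request above half of the node memory (here more than 3584 MB) yields a single container, while 3072 MB still yields two.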

since this problem is a blocker in our current project, we are evaluating, as a last resort,
adapting the hadoop source code to solve this issue. any ideas on this are very much welcome.

we would be very happy for any help offered! 

PS: We also asked this question on stackoverflow three days ago (http://stackoverflow.com/questions/21933937/hadoop-2-2-0-streaming-memory-limitation).
no answer yet. If we get answers in either of the forums, we will sync them.