hadoop-hdfs-user mailing list archives

From Varun Vasudev <vvasu...@apache.org>
Subject Re: YarnChild and Container running beyond physical memory limits
Date Sun, 01 May 2016 08:46:14 GMT
Hi Joseph,

YarnChild is a wrapper around the MR task process that actually carries out the work on the
machine. From YarnChild.java -
/**
 * The main() for MapReduce task processes.
 */

In the snippets you provided, YARN's memory monitor killed the map tasks because they exceeded
their allocated memory -
Container [pid=30518,containerID=container_1460573911020_0002_01_000033] is running beyond
physical memory limits. Current usage: 6.6 GB of 2.9 GB physical memory used; 17.6 GB of 11.7
GB virtual memory used. Killing container.
And
Container [pid=10124,containerID=container_1460478789757_0001_01_000020] is running beyond
physical memory limits. Current usage: 5.4 GB of 5 GB physical memory used; 8.4 GB of 20 GB
virtual memory used. Killing container.
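For context on where those two limits come from (assuming the standard MRv2/YARN setup): the
physical limit is the container size requested for the map task (mapreduce.map.memory.mb), and
the virtual limit is that value multiplied by yarn.nodemanager.vmem-pmem-ratio. Your second
excerpt shows 5 GB physical and 20 GB virtual, so the ratio looks like it is set to about 4 on
your cluster (the stock default is 2.1); the first excerpt (2.9 GB physical, 11.7 GB virtual) is
consistent with that same ratio.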
> and it's always due to some other unrelated external process chewing up RAM.
This should not be the case. The way YARN determines memory usage is by walking down the process
tree of the container. We don’t look at memory being used by external processes.
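To make that concrete, here is a rough sketch of the kind of check the NodeManager performs on
Linux. This is an illustration only, written from memory, not the actual YARN code (the real
logic lives in the NodeManager's procfs-based process-tree monitor): walk /proc, collect every
descendant of the container's root process, sum their resident set sizes, and compare the total
against the container's limit.

import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Illustration only -- not the actual YARN source. Sums resident memory over a
// process and all of its descendants by reading /proc/<pid>/stat, which is
// roughly what the NodeManager's procfs-based monitor does for each container.
public class ContainerTreeRss {

  // Returns {ppid, rssPages} parsed from /proc/<pid>/stat.
  static long[] statOf(long pid) throws IOException {
    String stat = new String(Files.readAllBytes(Paths.get("/proc/" + pid + "/stat")));
    // comm (field 2) is wrapped in parentheses and may contain spaces.
    String[] f = stat.substring(stat.lastIndexOf(')') + 2).trim().split("\\s+");
    return new long[] { Long.parseLong(f[1]),    // ppid (field 4 of stat)
                        Long.parseLong(f[21]) }; // rss  (field 24, in pages)
  }

  // Sums RSS (in bytes) of rootPid and every descendant currently in /proc.
  static long treeRssBytes(long rootPid) throws IOException {
    Map<Long, Long> ppidOf = new HashMap<>();
    Map<Long, Long> rssOf = new HashMap<>();
    try (DirectoryStream<Path> proc = Files.newDirectoryStream(Paths.get("/proc"), "[0-9]*")) {
      for (Path p : proc) {
        long pid = Long.parseLong(p.getFileName().toString());
        try {
          long[] s = statOf(pid);
          ppidOf.put(pid, s[0]);
          rssOf.put(pid, s[1]);
        } catch (IOException exited) { /* process went away while scanning */ }
      }
    }
    long pageSize = 4096; // assumption: typical Linux page size
    long total = 0;
    for (long pid : ppidOf.keySet()) {
      // Walk up the parent chain; count this pid if rootPid is an ancestor (or the pid itself).
      for (long cur = pid; cur > 1; cur = ppidOf.getOrDefault(cur, 0L)) {
        if (cur == rootPid) { total += rssOf.get(pid) * pageSize; break; }
      }
    }
    return total;
  }

  public static void main(String[] args) throws IOException {
    long rootPid = Long.parseLong(args[0]); // e.g. the pid from the "Container [pid=...]" message
    System.out.printf("process tree of %d uses %.1f GB physical memory%n",
        rootPid, treeRssBytes(rootPid) / (1024.0 * 1024 * 1024));
  }
}

If that sum goes over the physical limit (or the summed virtual size goes over the virtual
limit), the container is killed with the kind of message you pasted, so a process outside the
container's own tree shouldn't be able to trigger it.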
I would recommend increasing the amount of memory allocated for your map tasks until the job
finishes (to find the upper limit your map tasks need), and going through your map code to see
where memory usage could spike.
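If it helps, the usual way to raise the per-map-task memory is from the job driver (the property
names below are the standard MRv2 ones; the numbers are placeholders to tune for your job, keeping
the JVM heap at roughly 75-80% of the container size so there is headroom for non-heap usage):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class SubmitWithBiggerMaps {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Container size YARN allocates for each map task, in MB (placeholder value).
    conf.setInt("mapreduce.map.memory.mb", 6144);
    // Heap for the YarnChild JVM that runs the map task; keep it below the
    // container size so metaspace, thread stacks and native buffers fit too.
    conf.set("mapreduce.map.java.opts", "-Xmx4915m");

    Job job = Job.getInstance(conf, "my-job");
    // ... set mapper, reducer, input/output paths as usual ...
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

If your driver goes through ToolRunner/GenericOptionsParser, the same thing can be passed on the
command line, e.g. -D mapreduce.map.memory.mb=6144 -D mapreduce.map.java.opts=-Xmx4915m.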
-Varun

From:  Joseph Naegele <jnaegele@grierforensics.com>
Date:  Thursday, April 14, 2016 at 5:10 AM
To:  <user@hadoop.apache.org>
Subject:  YarnChild and Container running beyond physical memory limits

Hi!


Can anyone tell me what exactly YarnChild is, and how I can control the number of child JVMs
running in each container? In this case I'm concerned with the map phase of my MR job. I'm
having issues with my containers running beyond *physical* memory limits and I'm trying to
determine the cause.


Is each child JVM just an individual map task? If so, why do I see a variable number of them?
I don't know whether each of these JVMs is a clone of the original YarnChild process, what they
are doing, or why each of them is using so much memory (1 GB).


Here is a sample excerpt of my MR job when YARN kills a container: https://gist.githubusercontent.com/naegelejd/ad3a58192a2df79775d80e3eac0ae49c/raw/808f998b1987c77ba1fe7fb41abab62ae07c5e02/job.log

Here's the same process tree reorganized and ordered by ancestry: https://gist.githubusercontent.com/naegelejd/37afb27a6cf16ce918daeaeaf7450cdc/raw/b8809ce023840799f2cbbee28e49930671198ead/job.clean.log


If I increase the amount of memory per container, in turn lowering the total number of containers,
I see these errors less often, as expected, BUT when I do see them, there are NO child JVM
processes and the kill is always due to some other, unrelated external process chewing up RAM.
Here is an example of that: https://gist.githubusercontent.com/naegelejd/32d63b0f9b9c148d1c1c7c0de3c2c317/raw/934a93a7afe09c7cd62a50edc08ce902b9e71aac/job.log.
You can see that the [redacted] process is the culprit in that case.


I can share my mapred/yarn configuration if it's helpful.


If anyone has any ideas I'd greatly appreciate them!


Thanks,

Joe

