hadoop-hdfs-user mailing list archives

From Varun Vasudev <vvasu...@hortonworks.com>
Subject Re: Using YARN with native applications
Date Wed, 27 May 2015 13:53:34 GMT
For CPU isolation, you have to use Cgroups with the LinuxContainerExecutor. We don’t enforce
CPU limits with the DefaultContainerExecutor.


From: Kevin
Reply-To: user@hadoop.apache.org
Date: Wednesday, May 27, 2015 at 7:06 PM
To: user@hadoop.apache.org
Subject: Re: Using YARN with native applications

Thanks for the tip. In trunk, it looks like the NodeManager's monitor thread doesn't check
whether the process tree's CPU usage exceeds the container's CPU limit. Is this monitored elsewhere?

I have my eyes on https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java#L476

On Wed, May 27, 2015 at 9:06 AM Varun Vasudev <vvasudev@hortonworks.com> wrote:
You should also look at ProcfsBasedProcessTree if you want to know how exactly the memory
usage is being calculated.
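
As a rough illustration of the procfs approach (this is not Hadoop's code), the sketch below sums memory over a process tree from /proc/<pid>/stat lines. It assumes the Linux stat layout documented in proc(5), where field 4 is the parent pid, field 23 is the virtual size in bytes and field 24 is the resident set size in pages. The names ProcStat, parse_stat_line and tree_usage, and the 4096-byte default page size, are hypothetical choices for this example.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple, Tuple


class ProcStat(NamedTuple):
    pid: int
    ppid: int
    vsize_bytes: int
    rss_pages: int


def parse_stat_line(line: str) -> ProcStat:
    """Parse one /proc/<pid>/stat line (layout per proc(5))."""
    # The command name sits in parentheses and may itself contain spaces
    # or parentheses, so split on the last ')' rather than on whitespace.
    head, _, tail = line.rpartition(")")
    pid = int(head.split("(", 1)[0])
    fields = tail.split()
    # fields[0] is the state (field 3 in proc(5)), so field N is fields[N - 3].
    return ProcStat(pid, int(fields[1]), int(fields[20]), int(fields[21]))


def tree_usage(stats: Iterable[ProcStat], root_pid: int,
               page_size: int = 4096) -> Tuple[int, int]:
    """Return (virtual_bytes, physical_bytes) for root_pid and its descendants."""
    by_pid = {s.pid: s for s in stats}
    children = defaultdict(list)
    for s in by_pid.values():
        children[s.ppid].append(s.pid)

    vmem = pmem = 0
    seen = set()
    stack = [root_pid]
    while stack:
        pid = stack.pop()
        if pid in seen or pid not in by_pid:
            continue
        seen.add(pid)
        vmem += by_pid[pid].vsize_bytes
        pmem += by_pid[pid].rss_pages * page_size
        stack.extend(children[pid])
    return vmem, pmem
```

Because the walk starts at the container's root process, a native binary launched from a shell script is counted the same way as a JVM would be.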


From: Kevin
Reply-To: user@hadoop.apache.org
Date: Wednesday, May 27, 2015 at 6:22 PM
To: user@hadoop.apache.org
Subject: Re: Using YARN with native applications

Varun, thank you for helping me understand this. You pointed out a couple of new things to
me. I finally found that monitoring thread in the code (ContainersMonitorImpl.java). I can
now see and gain a better understanding of how YARN checks a container's resources.

On Wed, May 27, 2015 at 1:23 AM Varun Vasudev <vvasudev@hortonworks.com> wrote:
YARN should kill the container. I’m not sure what JVM you’re referring to, but the NodeManager
writes and then spawns a shell script that will invoke your shell script, which in turn (presumably)
will invoke your C++ application. A monitoring thread then looks at the memory usage of the
process tree and compares it to the limits for the container.
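
The comparison step can be sketched roughly as follows. This is a simplified illustration, not the NodeManager's implementation: the names ContainerLimits and check_container are hypothetical, the ratio of 2.0 in the usage note is chosen only to reproduce the 2 GB limit from the question, and the sketch assumes the virtual memory limit is the physical limit multiplied by yarn.nodemanager.vmem-pmem-ratio. The real monitoring thread may apply additional rules before killing a container.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContainerLimits:
    pmem_bytes: int          # physical memory allocated to the container
    vmem_pmem_ratio: float   # assumed to mirror yarn.nodemanager.vmem-pmem-ratio

    @property
    def vmem_bytes(self) -> int:
        # Assumption: virtual limit = physical limit * ratio.
        return int(self.pmem_bytes * self.vmem_pmem_ratio)


def check_container(limits: ContainerLimits, tree_vmem: int, tree_pmem: int,
                    pmem_check: bool = True,
                    vmem_check: bool = True) -> Optional[str]:
    """Return a reason to kill the container, or None if it is within limits.

    tree_vmem and tree_pmem are totals for the whole process tree, so the
    result does not depend on whether the processes are JVMs or native code.
    """
    if vmem_check and tree_vmem > limits.vmem_bytes:
        return "virtual memory limit exceeded"
    if pmem_check and tree_pmem > limits.pmem_bytes:
        return "physical memory limit exceeded"
    return None
```

For example, with `ContainerLimits(pmem_bytes=1 << 30, vmem_pmem_ratio=2.0)` the virtual limit is 2 GiB, so a process tree reporting 3 GiB of virtual memory would be flagged, while the same tree would pass if `vmem_check` were False and its physical usage stayed under 1 GiB.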


From: Kevin
Reply-To: user@hadoop.apache.org
Date: Tuesday, May 26, 2015 at 7:22 AM
To: user@hadoop.apache.org
Subject: Re: Using YARN with native applications

Thanks for the reply, Varun. So if I use the DefaultContainerExecutor and run a C++ application
via a shell script inside a container whose virtual memory limit is, for example, 2 GB, and
that application does a malloc for 3 GB, YARN will kill the container? I always just thought
that YARN kept its eye on the JVM it spins up for the container (under the DefaultContainerExecutor).


On Mon, May 25, 2015 at 4:17 AM, Varun Vasudev <vvasudev@hortonworks.com> wrote:
Hi Kevin,

By default, the NodeManager monitors physical and virtual memory usage of containers. Containers
that exceed either limit are killed. Admins can disable the checks by setting yarn.nodemanager.pmem-check-enabled
and/or yarn.nodemanager.vmem-check-enabled to false. The virtual memory limit for a container
is determined using the config variable yarn.nodemanager.vmem-pmem-ratio (default value is

In the case of vcores:

  1.  If you’re using Cgroups under LinuxContainerExecutor, by default, if there is spare
CPU available on the node, your container will be allowed to use it. Admins can restrict containers
to use only the CPU allocated to them by setting yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage
to true. This setting is only applicable when using Cgroups under LinuxContainerExecutor.
  2.  If you aren’t using Cgroups under LinuxContainerExecutor, there is no limit on
the amount of CPU that containers can use.
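
The two vcores cases above can be summarized as a small decision function. This is only a restatement of the rules in this message, not code from YARN; the function name cpu_enforcement, its parameters and the returned labels are hypothetical.

```python
def cpu_enforcement(executor: str, cgroups_enabled: bool,
                    strict_resource_usage: bool = False) -> str:
    """Describe how container CPU usage is limited for a given setup.

    strict_resource_usage stands in for the setting
    yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage.
    """
    if executor != "LinuxContainerExecutor" or not cgroups_enabled:
        # Case 2: without Cgroups under LinuxContainerExecutor, CPU is not limited.
        return "unlimited"
    if strict_resource_usage:
        # Case 1 with strict mode: containers get only their allocated CPU.
        return "capped-at-allocation"
    # Case 1 default: containers may also use spare CPU on the node.
    return "may-use-spare-cpu"
```

Under these rules, a native application running under the DefaultContainerExecutor falls into the "unlimited" case regardless of the strict setting.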


From: Kevin
Reply-To: user@hadoop.apache.org
Date: Friday, May 22, 2015 at 3:30 AM
To: user@hadoop.apache.org
Subject: Using YARN with native applications


I have been using the distributed shell application and Oozie to run native C++ applications
in the cluster. Is YARN able to see the resources these native applications use? For example,
if I use Oozie's shell action, the NodeManager hosts the mapper container and allocates a
certain amount of memory and vcores (as configured). What happens if my C++ application uses
more memory or vcores than the NodeManager allocated?

I was looking in the Hadoop code and couldn't find my way to an answer, although it seems
the LinuxContainerExecutor may be the answer to my question since it uses cgroups.

I'm interested to know how YARN reacts to non-Java applications running inside of it.

