From: Kevin
To: "user@hadoop.apache.org"
Date: Wed, 27 May 2015 14:10:48 +0000
Subject: Re: Using YARN with native applications

Ah, okay. That makes sense. Thanks for all your help, Varun.
-Kevin

On Wed, May 27, 2015 at 9:53 AM Varun Vasudev wrote:

> For CPU isolation, you have to use Cgroups with the LinuxContainerExecutor. We don't enforce CPU limits with the DefaultContainerExecutor.
>
> -Varun
>
> From: Kevin
> Reply-To: "user@hadoop.apache.org"
> Date: Wednesday, May 27, 2015 at 7:06 PM
> To: "user@hadoop.apache.org"
> Subject: Re: Using YARN with native applications
>
> Thanks for the tip. In the trunk it looks like the NodeManager's monitor thread doesn't care if the process tree's CPU usage overflows the container's CPU limit. Is this monitored elsewhere?
>
> I have my eyes on https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/containermanager/monitor/ContainersMonitorImpl.java#L476
>
> On Wed, May 27, 2015 at 9:06 AM Varun Vasudev wrote:
>
>> You should also look at ProcfsBasedProcessTree if you want to know how exactly the memory usage is being calculated.
>>
>> -Varun
>>
>> From: Kevin
>> Reply-To: "user@hadoop.apache.org"
>> Date: Wednesday, May 27, 2015 at 6:22 PM
>> To: "user@hadoop.apache.org"
>> Subject: Re: Using YARN with native applications
>>
>> Varun, thank you for helping me understand this. You pointed out a couple of new things to me. I finally found that monitoring thread in the code (ContainersMonitorImpl.java). I can now see and gain a better understanding of how YARN checks a container's resources.
>>
>> On Wed, May 27, 2015 at 1:23 AM Varun Vasudev wrote:
>>
>>> YARN should kill the container. I'm not sure what JVM you're referring to, but the NodeManager writes and then spawns a shell script that will invoke your shell script, which in turn (presumably) will invoke your C++ application. A monitoring thread then looks at the memory usage of the process tree and compares it to the limits for the container.
>>>
>>> -Varun
>>>
>>> From: Kevin
>>> Reply-To: "user@hadoop.apache.org"
>>> Date: Tuesday, May 26, 2015 at 7:22 AM
>>> To: "user@hadoop.apache.org"
>>> Subject: Re: Using YARN with native applications
>>>
>>> Thanks for the reply, Varun. So if I use the DefaultContainerExecutor and run a C++ application via a shell script inside a container whose virtual memory limit is, for example, 2 GB, and that application does a malloc for 3 GB, YARN will kill the container? I always just thought that YARN kept its eye on the JVM it spins up for the container (under the DefaultContainerExecutor).
>>>
>>> -Kevin
>>>
>>> On Mon, May 25, 2015 at 4:17 AM, Varun Vasudev wrote:
>>>
>>>> Hi Kevin,
>>>>
>>>> By default, the NodeManager monitors physical and virtual memory usage of containers. Containers that exceed either limit are killed. Admins can disable the checks by setting yarn.nodemanager.pmem-check-enabled and/or yarn.nodemanager.vmem-check-enabled to false. The virtual memory limit for a container is determined using the config variable yarn.nodemanager.vmem-pmem-ratio (default value is 2.1).
>>>>
>>>> In the case of vcores:
>>>>
>>>> 1. If you're using Cgroups under LinuxContainerExecutor, by default, if there is spare CPU available on the node, your container will be allowed to use it. Admins can restrict containers to only the CPU allocated to them by setting yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage to true. This setting is only applicable when using Cgroups under LinuxContainerExecutor.
>>>> 2. If you aren't using Cgroups under LinuxContainerExecutor, there is no limit on the amount of CPU that containers can use.
>>>>
>>>> -Varun
>>>>
>>>> From: Kevin
>>>> Reply-To: "user@hadoop.apache.org"
>>>> Date: Friday, May 22, 2015 at 3:30 AM
>>>> To: "user@hadoop.apache.org"
>>>> Subject: Using YARN with native applications
>>>>
>>>> Hello,
>>>>
>>>> I have been using the distributed shell application and Oozie to run native C++ applications in the cluster. Is YARN able to see the resources these native applications use? For example, if I use Oozie's shell action, the NodeManager hosts the mapper container and allocates a certain amount of memory and vcores (as configured). What happens if my C++ application uses more memory or vcores than the NodeManager allocated?
>>>>
>>>> I was looking in the Hadoop code and I couldn't find my way to an answer. It seems, though, that the LinuxContainerExecutor may be the answer to my question since it uses cgroups.
>>>>
>>>> I'm interested to know how YARN reacts to non-Java applications running inside of it.
>>>>
>>>> Thanks,
>>>> Kevin
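
For reference, the checks discussed in this thread correspond to NodeManager settings in yarn-site.xml. Below is a minimal sketch using only the property names mentioned above; the values shown are illustrative defaults, not recommendations:

    <!-- Illustrative yarn-site.xml fragment; values are examples only. -->
    <property>
      <name>yarn.nodemanager.pmem-check-enabled</name>
      <value>true</value>  <!-- kill containers whose process tree exceeds physical memory -->
    </property>
    <property>
      <name>yarn.nodemanager.vmem-check-enabled</name>
      <value>true</value>  <!-- kill containers whose process tree exceeds virtual memory -->
    </property>
    <property>
      <name>yarn.nodemanager.vmem-pmem-ratio</name>
      <value>2.1</value>   <!-- virtual memory limit = ratio x physical memory allocation -->
    </property>
    <property>
      <name>yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage</name>
      <value>true</value>  <!-- only applies when using Cgroups with LinuxContainerExecutor -->
    </property>

With the memory checks enabled, a native process whose process tree grows past the container's memory limits is killed regardless of the executor in use; CPU is only capped by the strict cgroups setting, and only under the LinuxContainerExecutor.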