Subject: Re: YARN container killed as running beyond memory limits
From: Gaurav Gupta
To: user@hadoop.apache.org
Date: Mon, 22 Jun 2015 15:51:40 -0700

You can also change the default value of yarn.nodemanager.vmem-pmem-ratio.
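For example, in yarn-site.xml on the NodeManager nodes (the value 4 below is only illustrative; pick a ratio that fits your app, and restart the NodeManagers so it takes effect):

  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>4</value>
  </property>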

On Sat, Jun 20, 2015 at 12:39 AM, Drake 민영근 <drake.min@nexr.com> wrote:

> Hi,
>
> You should disable the vmem check. See this:
> http://blog.cloudera.com/blog/2014/04/apache-hadoop-yarn-avoiding-6-time-consuming-gotchas/
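>
> Concretely, that's the yarn.nodemanager.vmem-check-enabled property, e.g. in yarn-site.xml (a minimal sketch; the check is on by default):
>
>   <property>
>     <name>yarn.nodemanager.vmem-check-enabled</name>
>     <value>false</value>
>   </property>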
>
> Thanks.
>
> On Wednesday, June 17, 2015, Naganarasimha G R (Naga) <garlanaganarasimha@huawei.com> wrote:
>
>> Hi,
>> From the logs it's pretty clear it's due to
>> "Current usage: 576.2 MB of 2 GB physical memory used; 4.2 GB of 4.2 GB virtual memory used. Killing container."
>> Please increase the value of yarn.nodemanager.vmem-pmem-ratio from the default of 2.1 to something like 4 or 8, based on your app and system. (The virtual memory limit is the container's physical allocation times this ratio: 2 GB x 2.1 = 4.2 GB, which matches the log.)
>>
>> + Naga
>> ------------------------------
>> From: Arbi Akhina [arbi.akhina@gmail.com]
>> Sent: Wednesday, June 17, 2015 17:19
>> To: user@hadoop.apache.org
>> Subject: YARN container killed as running beyond memory limits
>>
>> Hi, I have a YARN application that submits containers. In the ApplicationMaster logs I see that the container is killed. Here are the logs:
>>
>> Jun 17, 2015 1:31:27 PM com.heavenize.modules.RMCallbackHandler onContainersCompleted
>> INFO: container 'container_1434471275225_0007_01_000002' status is ContainerStatus: [ContainerId: container_1434471275225_0007_01_000002, State: COMPLETE, Diagnostics: Container [pid=4069,containerID=container_1434471275225_0007_01_000002] is running beyond virtual memory limits. Current usage: 576.2 MB of 2 GB physical memory used; 4.2 GB of 4.2 GB virtual memory used. Killing container.
>> Dump of the process-tree for container_1434471275225_0007_01_000002 :
>> |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
>> |- 4094 4093 4069 4069 (java) 2932 94 2916065280 122804 /usr/lib/jvm/java-7-openjdk-amd64/bin/java -Xms512m -Xmx2048m -XX:MaxPermSize=250m -XX:+UseConcMarkSweepGC -Dosmoze.path=/tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/container_1434471275225_0007_01_000002/Osmoze -Dspring.profiles.active=webServer -jar /tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/container_1434471275225_0007_01_000002/heavenize-modules.jar
>> |- 4093 4073 4069 4069 (sh) 0 0 4550656 164 /bin/sh /tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/container_1434471275225_0007_01_000002/startup.sh
>> |- 4073 4069 4069 4069 (java) 249 34 1577267200 24239 /usr/lib/jvm/java-7-openjdk-amd64/bin/java com.heavenize.yarn.task.ModulesManager -containerId container_1434471275225_0007_01_000002 -port 5369 -exe hdfs://hadoop-server/user/hadoop/heavenize/heavenize-modules.jar -conf hdfs://hadoop-server/user/hadoop/heavenize/config.zip
>> |- 4069 1884 4069 4069 (bash) 0 0 12730368 304 /bin/bash -c /usr/lib/jvm/java-7-openjdk-amd64/bin/java com.heavenize.yarn.task.ModulesManager -containerId container_1434471275225_0007_01_000002 -port 5369 -exe hdfs://hadoop-server/user/hadoop/heavenize/heavenize-modules.jar -conf hdfs://hadoop-server/user/hadoop/heavenize/config.zip 1> /usr/local/hadoop/logs/userlogs/application_1434471275225_0007/container_1434471275225_0007_01_000002/stdout 2> /usr/local/hadoop/logs/userlogs/application_1434471275225_0007/container_1434471275225_0007_01_000002/stderr
>>
>> I don't see any memory excess; any idea where this error comes from?
>> There are no errors in the container, it just stops logging as a result of being killed.
>>
>
> --
> Drake 민영근 Ph.D
> kt NexR