Date: Wed, 17 Jun 2015 13:49:49 +0200
Subject: YARN container killed as running beyond memory limits
From: Arbi Akhina
To: user@hadoop.apache.org

Hi, I have a YARN application that submits containers. In the ApplicationMaster logs I see that a container is killed. Here are the logs:

Jun 17, 2015 1:31:27 PM com.heavenize.modules.RMCallbackHandler onContainersCompleted
INFO: container 'container_1434471275225_0007_01_000002' status is ContainerStatus: [ContainerId: container_1434471275225_0007_01_000002, State: COMPLETE, Diagnostics: Container [pid=4069,containerID=container_1434471275225_0007_01_000002] is running beyond virtual memory limits. Current usage: 576.2 MB of 2 GB physical memory used; 4.2 GB of 4.2 GB virtual memory used. Killing container.
Dump of the process-tree for container_1434471275225_0007_01_000002:
	|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
	|- 4094 4093 4069 4069 (java) 2932 94 2916065280 122804 /usr/lib/jvm/java-7-openjdk-amd64/bin/java -Xms512m -Xmx2048m -XX:MaxPermSize=250m -XX:+UseConcMarkSweepGC -Dosmoze.path=/tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/container_1434471275225_0007_01_000002/Osmoze -Dspring.profiles.active=webServer -jar /tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/container_1434471275225_0007_01_000002/heavenize-modules.jar
	|- 4093 4073 4069 4069 (sh) 0 0 4550656 164 /bin/sh /tmp/hadoop-hadoop/nm-local-dir/usercache/hadoop/appcache/container_1434471275225_0007_01_000002/startup.sh
	|- 4073 4069 4069 4069 (java) 249 34 1577267200 24239 /usr/lib/jvm/java-7-openjdk-amd64/bin/java com.heavenize.yarn.task.ModulesManager -containerId container_1434471275225_0007_01_000002 -port 5369 -exe hdfs://hadoop-server/user/hadoop/heavenize/heavenize-modules.jar -conf hdfs://hadoop-server/user/hadoop/heavenize/config.zip
	|- 4069 1884 4069 4069 (bash) 0 0 12730368 304 /bin/bash -c /usr/lib/jvm/java-7-openjdk-amd64/bin/java com.heavenize.yarn.task.ModulesManager -containerId container_1434471275225_0007_01_000002 -port 5369 -exe hdfs://hadoop-server/user/hadoop/heavenize/heavenize-modules.jar -conf hdfs://hadoop-server/user/hadoop/heavenize/config.zip 1> /usr/local/hadoop/logs/userlogs/application_1434471275225_0007/container_1434471275225_0007_01_000002/stdout 2> /usr/local/hadoop/logs/userlogs/application_1434471275225_0007/container_1434471275225_0007_01_000002/stderr

I don't see any memory excess; any idea where this error comes from? There are no errors in the container itself, it just stops logging as a result of being killed.
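[Editor's note, not part of the original message: the "4.2 GB of 4.2 GB virtual memory" figure in the diagnostics matches YARN's default virtual-to-physical memory ratio of 2.1 applied to the 2 GB container allocation (2 GB x 2.1 = 4.2 GB). The NodeManager settings that govern this check are sketched below in a minimal yarn-site.xml fragment; the values shown are the stock defaults, not taken from the poster's cluster.]

```xml
<!-- Sketch of the NodeManager memory-enforcement settings (default values).
     Adjust on your own cluster only after understanding the trade-offs. -->
<configuration>
  <!-- Whether containers are killed for exceeding virtual memory. -->
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>true</value>
  </property>
  <!-- Whether containers are killed for exceeding physical memory. -->
  <property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>true</value>
  </property>
  <!-- Virtual memory allowed per unit of physical memory:
       a 2 GB container may use up to 2 * 2.1 = 4.2 GB of vmem. -->
  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>
  </property>
</configuration>
```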