Date: Fri, 22 Mar 2013 15:56:54 +0530
Subject: Application Master getting killed randomly reporting excess usage of memory
From: Krishna Kishore Bonagiri
To: user@hadoop.apache.org

Hi,

I am running a date command using the Distributed Shell example in a loop of 500 iterations. Every run succeeded except one, which failed with the following error:

2013-03-22 04:33:25,280 INFO [main] distributedshell.Client (Client.java:monitorApplication(605)) - Got application report from ASM for, appId=222, clientToken=null, appDiagnostics=Application application_1363938200742_0222 failed 1 times due to AM Container for appattempt_1363938200742_0222_000001 exited with exitCode: 143 due to: Container [pid=21141,containerID=container_1363938200742_0222_01_000001] is running beyond virtual memory limits. Current usage: 47.3 Mb of 128 Mb physical memory used; 611.6 Mb of 268.8 Mb virtual memory used. Killing container.
Dump of the process-tree for container_1363938200742_0222_01_000001 :
        |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
        |- 21147 21141 21141 21141 (java) 244 12 532643840 11802 /home_/dsadm/yarn/jdk//bin/java -Xmx128m org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster --container_memory 10 --num_containers 2 --priority 0 --shell_command date
        |- 21141 8433 21141 21141 (bash) 0 0 108642304 298 /bin/bash -c /home_/dsadm/yarn/jdk//bin/java -Xmx128m org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster --container_memory 10 --num_containers 2 --priority 0 --shell_command date 1>/tmp/logs/application_1363938200742_0222/container_1363938200742_0222_01_000001/AppMaster.stdout 2>/tmp/logs/application_1363938200742_0222/container_1363938200742_0222_01_000001/AppMaster.stderr

Does anyone know whether this is a known issue? I am using the latest version of Hadoop, i.e. hadoop-2.0.3-alpha.

Thanks,
Kishore
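
As a sanity check on the figures in the diagnostic above, here is a minimal Python sketch (not part of the original report) that re-adds the two rows of the process-tree dump. It assumes a 4096-byte page size, which the log does not state. The 2.1 multiplier is simply 268.8 / 128 taken from the message; it matches what I understand to be the default of yarn.nodemanager.vmem-pmem-ratio, but treat that property name as an assumption here. The names PROCESS_TREE, tree_usage_mb and vmem_limit_mb are made up for illustration.

```python
PAGE_SIZE = 4096        # assumed page size in bytes; not stated in the log
MB = 1024 * 1024

# (pid, command, vmem_bytes, rss_pages), copied from the process-tree dump
PROCESS_TREE = [
    (21147, "java", 532643840, 11802),
    (21141, "bash", 108642304, 298),
]


def tree_usage_mb(tree, page_size=PAGE_SIZE):
    """Return (physical_mb, virtual_mb) summed over every process in the tree."""
    vmem_bytes = sum(vmem for _, _, vmem, _ in tree)
    rss_bytes = sum(pages for _, _, _, pages in tree) * page_size
    return rss_bytes / MB, vmem_bytes / MB


def vmem_limit_mb(pmem_limit_mb, ratio):
    """Virtual memory limit derived from the physical limit and a ratio."""
    return pmem_limit_mb * ratio


rss_mb, vmem_mb = tree_usage_mb(PROCESS_TREE)
limit_mb = vmem_limit_mb(128, 2.1)  # 2.1 inferred from 268.8 / 128 in the message

print(f"physical: {rss_mb:.1f} MB, virtual: {vmem_mb:.1f} MB, limit: {limit_mb:.1f} MB")
# prints: physical: 47.3 MB, virtual: 611.6 MB, limit: 268.8 MB
```

The totals reproduce the 47.3 Mb and 611.6 Mb reported in the message. They also show that the java process alone (532643840 bytes, about 508 Mb) is already above the 268.8 Mb limit despite -Xmx128m, since -Xmx bounds only the Java heap and not the virtual address space of the process.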