From: Wangda Tan
Date: Thu, 27 Mar 2014 08:34:42 +0800
Subject: Re: Getting error message from AM container launch
To: user@hadoop.apache.org

Hi John,
Typically, this happens when something in your program sets "nice" as the AM launch command. You can check the actual script that YARN used to launch the AM.
To do that, set "yarn.nodemanager.delete.debug-delay-sec" in yarn-site.xml on all NMs to a larger value (for example 600, i.e. 10 minutes), so that the NMs do not remove a container's temporary directory as soon as the container finishes. You need to restart the NMs after changing the setting.
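As a rough illustration of the setting described above, here is a minimal sketch that adds or updates the property in a yarn-site.xml document. It assumes the usual Hadoop layout of `<configuration>` containing `<property>` elements with `<name>` and `<value>` children; the helper name `set_property` and the empty example document are hypothetical, not part of Hadoop.

```python
import xml.etree.ElementTree as ET


def set_property(site_xml: str, name: str, value: str) -> str:
    """Return site_xml with property `name` set to `value`, adding it if missing."""
    root = ET.fromstring(site_xml)
    for prop in root.findall("property"):
        # Tolerate whitespace around the property name in hand-edited files.
        if (prop.findtext("name") or "").strip() == name:
            node = prop.find("value")
            if node is None:
                node = ET.SubElement(prop, "value")
            node.text = value
            break
    else:
        # No existing entry was found, so append a new <property> element.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical, empty yarn-site.xml content used only for illustration.
EXAMPLE_SITE = "<configuration></configuration>"
UPDATED_SITE = set_property(
    EXAMPLE_SITE, "yarn.nodemanager.delete.debug-delay-sec", "600"
)
```

Note that this parser drops XML comments, so on a real cluster it may be simpler to edit the file by hand. Either way the NMs still need a restart to pick up the change.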
After that, re-run your program. The script should then be at <host-of-AM>:/ephemeral02/hadoop/yarn/local/usercache/SYSTEM/appcache/<app-id>/<container-id>/launch_container.sh.
You can check in that script whether the launch command is correct.
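Once the script has been copied off the node, a quick check along the following lines may help spot where "nice" comes from and how large the exported environment is. This is only a sketch: it assumes the script consists of `export` lines followed by the launch command, the function name `summarize_launch_script` is hypothetical, and the sample content is made up for illustration.

```python
import re


def summarize_launch_script(script_text: str) -> dict:
    """Collect a few size and content facts about a launch script's text."""
    lines = script_text.splitlines()
    exports = [ln for ln in lines if ln.lstrip().startswith("export ")]
    # Any line that mentions "nice" as a whole word is worth reading by hand.
    nice_lines = [ln for ln in lines if re.search(r"\bnice\b", ln)]
    return {
        "export_count": len(exports),
        "export_bytes": sum(len(ln.encode("utf-8")) for ln in exports),
        "longest_line_bytes": max(
            (len(ln.encode("utf-8")) for ln in lines), default=0
        ),
        "nice_lines": nice_lines,
    }


# Hypothetical script content; a real launch_container.sh will differ.
SAMPLE = '''#!/bin/bash
export JAVA_HOME="/usr/lib/jvm/java"
export CLASSPATH="/opt/app/lib/*"
exec /bin/bash -c "nice -n 5 /opt/app/bin/worker --mode am"
'''
SUMMARY = summarize_launch_script(SAMPLE)
```

A large `export_bytes` or `longest_line_bytes` value would point at the environment or a single oversized argument rather than at the command line as a whole.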
--
Regards,
Wangda Tan


On Thu, Mar 27, 2014 at 7:12 AM, Azuryy <azuryyyu@gmail.com> wrote:
You used 'nice' in your app?

Sent from my iPhone5s

On March 27, 2014, at 6:55, John Lilley <john.lilley@redpoint.net> wrote:

On further examination, the command lines appear to be 369 characters long. I’ve read about similar issues showing up when the environment exceeds 132KB, but we aren’t putting anything significant in the environment.

John
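Since error=7 corresponds to E2BIG, which the operating system reports when the argument and environment strings passed to a new process are too large, one way to test the assumption above is to measure both together. The sketch below is an approximation only: it counts string bytes plus terminators and ignores pointer overhead, and the function names are hypothetical. On Linux a single argument or environment string is, as far as I know, also capped separately (128 KiB on systems with 4 KiB pages), so the longest string is reported as well.

```python
import os


def exec_payload_bytes(argv, env):
    """Approximate bytes needed for the argv and environment strings."""
    # Each string is passed NUL-terminated, hence the +1.
    arg_bytes = sum(len(a.encode("utf-8")) + 1 for a in argv)
    env_bytes = sum(len(f"{k}={v}".encode("utf-8")) + 1 for k, v in env.items())
    return arg_bytes + env_bytes


def longest_string_bytes(argv, env):
    """Size of the largest single argument or KEY=VALUE environment string."""
    strings = list(argv) + [f"{k}={v}" for k, v in env.items()]
    return max((len(s.encode("utf-8")) for s in strings), default=0)


def arg_max():
    """Return the system's ARG_MAX, or None where it cannot be queried."""
    try:
        return os.sysconf("SC_ARG_MAX")
    except (ValueError, OSError, AttributeError):
        return None
```

Calling `exec_payload_bytes(command, dict(os.environ))` from inside the process that builds the container launch context, and comparing the result with `arg_max()`, should show whether the environment is as small as expected.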


From: John Lilley [mailto:john.lilley@redpoint.net]
Sent: Wednesday, March 26, 2014 4:41 PM
To: user@hadoop.apache.org
Subject: RE: Getting error message from AM container launch

We do have a fairly long container command-line. Not huge, around 200 characters.

John

From: John Lilley [mailto:john.lilley@redpoint.net]
Sent: Wednesday, March 26, 2014 4:38 PM
To: user@hadoop.apache.org
Subject: Getting error message from AM container launch

Running a non-MapReduce YARN application, one of the containers launched by the AM is failing with an error message I’ve never seen. Any ideas? I’m not sure who exactly is running “nice” or why its argument list would be too long.

Thanks

john

Container for appattempt_1395755163053_0030_000001 exited with exitCode: 0 due to: Exception from container-launch:
java.io.IOException: Cannot run program ""nice"" (in directory ""/ephemeral02/hadoop/yarn/local/usercache/SYSTEM/appcache/application_1395755163053_0030/container_1395755163053_0030_01_000001""): java.io.IOException: error=7, Argument list too long
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:407)
        at org.apache.hadoop.util.Shell.run(Shell.java:379)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: java.io.IOException: error=7, Argument list too long
        at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
        at java.lang.ProcessImpl.start(ProcessImpl.java:65)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
        ... 11 more
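For reference, the "error=7" in the trace above is the operating system's errno value, not a YARN-specific code. The sketch below looks up its symbolic name and offers a way to provoke the same failure outside Hadoop by starting a trivial child process with an oversized environment variable. The function names and the `PADDING` variable are hypothetical, and whether a given size actually fails depends on the platform's limits.

```python
import errno
import os
import subprocess
import sys


def describe_errno(code):
    """Return (symbolic name, OS message) for an errno value."""
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


def launch_with_padding(n_bytes):
    """Start a trivial child with one environment variable of n_bytes.

    Returns None if the child could be started, or the errno reported
    when the operating system refused to start it.
    """
    env = {"PADDING": "x" * n_bytes}  # hypothetical variable name
    try:
        subprocess.run([sys.executable, "-c", "pass"], env=env)
    except OSError as exc:
        return exc.errno
    return None
```

On a typical Linux host I would expect `launch_with_padding(256 * 1024)` to return `errno.E2BIG`, which is the same condition the NM hit when it tried to run the launch command.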





--
Regards,
Wangda