hadoop-mapreduce-user mailing list archives

From Wangda Tan <wheele...@gmail.com>
Subject Re: Getting error message from AM container launch
Date Thu, 27 Mar 2014 02:42:41 GMT
Glad to hear that :)
--
Regards,
Wangda Tan


On Thu, Mar 27, 2014 at 10:36 AM, John Lilley <john.lilley@redpoint.net> wrote:

>  Wangda Tan,
>
>
>
> Thanks for your reply!  We did actually figure out where the problem was
> coming from, but this is a very helpful technique to know.
>
>
>
> John
>
>
>
>
>
> *From:* Wangda Tan [mailto:wheeleast@gmail.com]
> *Sent:* Wednesday, March 26, 2014 6:35 PM
> *To:* user@hadoop.apache.org
> *Subject:* Re: Getting error message from AM container launch
>
>
>
> Hi John,
>
> Typically, this happens when something in your program puts "nice" into the
> AM launch command. You can check the actual script that YARN used to launch
> the AM.
>
> You need to set "yarn.nodemanager.delete.debug-delay-sec" in yarn-site.xml on
> all NMs to a larger value (for example 600, i.e. 10 minutes), so that the NMs
> do not remove a container's temporary directory as soon as the container
> finishes. You need to restart the NMs after changing the setting.
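
As a rough illustration of the setting described above, the following Python sketch adds or updates the property in the text of a yarn-site.xml file. The helper name set_property and the sample configuration are hypothetical, made up for this example; only the property name and the value 600 come from the message. It works on a string in memory and does not touch a real file or restart any NodeManager.

```python
import xml.etree.ElementTree as ET


def set_property(site_xml: str, name: str, value: str) -> str:
    """Return site_xml with <property> `name` set to `value` (hypothetical helper)."""
    root = ET.fromstring(site_xml)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            # Property already present: overwrite its value.
            val = prop.find("value")
            if val is None:
                val = ET.SubElement(prop, "value")
            val.text = value
            break
    else:
        # Property absent: append a new <property> element.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Illustrative input only; a real yarn-site.xml will contain other properties.
SAMPLE = (
    "<configuration>"
    "<property><name>yarn.nodemanager.local-dirs</name>"
    "<value>/ephemeral02/hadoop/yarn/local</value></property>"
    "</configuration>"
)

updated = set_property(SAMPLE, "yarn.nodemanager.delete.debug-delay-sec", "600")
print(updated)
```

The same change can of course be made by hand in an editor; the point is only that the property goes inside the top-level configuration element like any other.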
>
> After that, re-run your program. The script should then be at
> <host-of-AM>:/ephemeral02/hadoop/yarn/local/usercache/SYSTEM/appcache/<app-id>/<container-id>/launch_container.sh.
>
> You can then verify in the script whether the launch command is correct.
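
Once the container directory is retained, the launch script can be inspected by hand or with a small script. The sketch below is a minimal Python example that assumes launch_container.sh consists of "export" lines followed by an "exec" line carrying the real launch command. The function name and the sample script text are invented for illustration and do not come from the thread.

```python
def summarize_launch_script(text: str) -> dict:
    """Summarize a launch script: environment exports, line sizes, exec lines."""
    lines = text.splitlines()
    exports = [line for line in lines if line.startswith("export ")]
    commands = [line for line in lines if line.lstrip().startswith("exec ")]
    return {
        "export_count": len(exports),
        # Bytes taken by the export lines, a rough proxy for environment size.
        "export_bytes": sum(len(line.encode("utf-8")) for line in exports),
        "longest_line": max((len(line) for line in lines), default=0),
        "commands": commands,
    }


# Hypothetical script content; a real launch_container.sh will differ.
SAMPLE_SCRIPT = """#!/bin/bash
export JAVA_HOME="/usr/lib/jvm/java"
export CLASSPATH="$PWD:$PWD/*"
exec /bin/bash -c "my_app_master --flag 1>stdout 2>stderr"
"""

summary = summarize_launch_script(SAMPLE_SCRIPT)
for command in summary["commands"]:
    print(command)
print("exports:", summary["export_count"], "bytes:", summary["export_bytes"])
```

To use it on a real container, read the retained file and pass its text in, for example summarize_launch_script(open(path).read()) with path pointing at the script location given above.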
>
> --
>
> Regards,
>
> Wangda Tan
>
>
>
> On Thu, Mar 27, 2014 at 7:12 AM, Azuryy <azuryyyu@gmail.com> wrote:
>
> Did you use 'nice' in your app?
>
>
>
> Sent from my iPhone5s
>
>
> On March 27, 2014, at 6:55, John Lilley <john.lilley@redpoint.net> wrote:
>
>  On further examination they appear to be 369 characters long.  I’ve read
> about similar issues showing when the environment exceeds 132KB, but we
> aren’t putting anything significant in the environment.
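
error=7 is E2BIG, which exec returns when the arguments plus the environment handed to the new process are too large. One way to check the suspicion above is to add up those bytes and compare them with the system limit. The sketch below assumes a Unix-like host where os.sysconf("SC_ARG_MAX") is available; exec_payload_bytes and arg_max are hypothetical helpers, the sample argv is made up, and the total is only an approximation because the kernel also counts pointer overhead.

```python
import os


def exec_payload_bytes(argv, env) -> int:
    """Approximate bytes exec must copy: each string plus its NUL terminator."""
    arg_bytes = sum(len(os.fsencode(a)) + 1 for a in argv)
    # Each environment entry is stored as "KEY=VALUE\0".
    env_bytes = sum(
        len(os.fsencode(k)) + len(os.fsencode(v)) + 2 for k, v in env.items()
    )
    return arg_bytes + env_bytes


def arg_max():
    """Return the system limit, or None where it cannot be queried."""
    try:
        return os.sysconf("SC_ARG_MAX")
    except (AttributeError, ValueError, OSError):
        return None


# Hypothetical command line; substitute the one found in the launch script.
sample_argv = ["nice", "-n", "0", "bash", "launch_container.sh"]
used = exec_payload_bytes(sample_argv, dict(os.environ))
print("approximate exec payload:", used, "bytes; ARG_MAX:", arg_max())
```

Running this as the NodeManager user, with the NodeManager's environment, gives a number closer to what the failing launch actually saw than running it from an interactive shell.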
>
> John
>
>
>
>
>
> *From:* John Lilley [mailto:john.lilley@redpoint.net]
>
> *Sent:* Wednesday, March 26, 2014 4:41 PM
> *To:* user@hadoop.apache.org
> *Subject:* RE: Getting error message from AM container launch
>
>
>
> We do have a fairly long container command-line.  Not huge, around 200
> characters.
>
> John
>
>
>
> *From:* John Lilley [mailto:john.lilley@redpoint.net]
>
> *Sent:* Wednesday, March 26, 2014 4:38 PM
> *To:* user@hadoop.apache.org
> *Subject:* Getting error message from AM container launch
>
>
>
> Running a non-MapReduce YARN application, one of the containers launched
> by the AM is failing with an error message I’ve never seen.  Any ideas?
> I’m not sure who exactly is running “nice” or why its argument list would
> be too long.
>
> Thanks
>
> john
>
>
>
> Container for appattempt_1395755163053_0030_000001 exited with exitCode: 0 due to: Exception from container-launch:
>
> java.io.IOException: Cannot run program "nice" (in directory "/ephemeral02/hadoop/yarn/local/usercache/SYSTEM/appcache/application_1395755163053_0030/container_1395755163053_0030_01_000001"): java.io.IOException: error=7, Argument list too long
>         at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
>         at org.apache.hadoop.util.Shell.runCommand(Shell.java:407)
>         at org.apache.hadoop.util.Shell.run(Shell.java:379)
>         at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
>         at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.IOException: java.io.IOException: error=7, Argument list too long
>         at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
>         at java.lang.ProcessImpl.start(ProcessImpl.java:65)
>         at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
>         ... 11 more
>
>
>
>
>
>
>
> --
>
> Regards,
>
> Wangda
>
