hadoop-user mailing list archives

From John Lilley <john.lil...@redpoint.net>
Subject RE: Getting error message from AM container launch
Date Thu, 27 Mar 2014 02:36:59 GMT
Wangda Tan,

Thanks for your reply!  We did actually figure out where the problem was coming from, but
this is a very helpful technique to know.


From: Wangda Tan [mailto:wheeleast@gmail.com]
Sent: Wednesday, March 26, 2014 6:35 PM
To: user@hadoop.apache.org
Subject: Re: Getting error message from AM container launch

Hi John,
Typically, this is caused by something in your program setting "nice" as the AM launch command.
You can check the actual script that YARN used to launch the AM.
To do so, set "yarn.nodemanager.delete.debug-delay-sec" in yarn-site.xml on all NMs to a larger
value (such as 600, i.e. 10 minutes) so that the NMs don't remove a container's temporary
directory as soon as the container finishes. You need to restart the NMs after changing the setting.
After that, re-run your program; the script should be at <host-of-AM>:/ephemeral02/hadoop/yarn/local/usercache/SYSTEM/appcache/<app-id>/<container-id>/launch_container.sh.
You can then check in that script whether the launch command is correct.
Wangda Tan
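
For anyone following the steps above, here is a minimal Python sketch for inspecting a preserved launch_container.sh. It assumes the script consists mainly of `export NAME=value` lines followed by an `exec` line; that layout is an assumption and may differ between Hadoop versions. The function name `summarize_launch_script` and the sample script contents are hypothetical and only for illustration.

```python
import re

# Assumed layout: "export NAME=..." lines, then an "exec ..." line.
EXPORT_RE = re.compile(r'^export\s+([A-Za-z_][A-Za-z0-9_]*)=(.*)$')

def summarize_launch_script(text):
    """Return the exec lines and per-variable export sizes found in the script."""
    exports = {}
    exec_lines = []
    for line in text.splitlines():
        stripped = line.strip()
        match = EXPORT_RE.match(stripped)
        if match:
            # Size of the raw right-hand side, quotes included (an approximation).
            exports[match.group(1)] = len(match.group(2).encode("utf-8"))
        elif stripped.startswith("exec "):
            exec_lines.append(stripped)
    return {
        "exec_lines": exec_lines,
        "export_bytes": exports,
        "total_export_bytes": sum(exports.values()),
    }

if __name__ == "__main__":
    # Hypothetical sample; a real script would be read from the container directory.
    sample = "\n".join([
        "#!/bin/bash",
        'export JAVA_HOME="/usr/java/default"',
        'export CLASSPATH="a.jar:b.jar"',
        'exec /bin/bash -c "my_app_master 1>stdout 2>stderr"',
    ])
    summary = summarize_launch_script(sample)
    print(summary["exec_lines"])
    # Largest exports first, to spot an unexpectedly large variable.
    for name, size in sorted(summary["export_bytes"].items(), key=lambda kv: -kv[1]):
        print(name, size)
```

Listing the exports by size makes it easier to see whether one variable (a long classpath, for example) dominates what is passed to the launched process.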

On Thu, Mar 27, 2014 at 7:12 AM, Azuryy <azuryyyu@gmail.com> wrote:
Did you use 'nice' in your app?

Sent from my iPhone5s

On March 27, 2014, at 6:55, John Lilley <john.lilley@redpoint.net> wrote:
On further examination, they appear to be 369 characters long.  I’ve read about similar issues
showing up when the environment exceeds 132KB, but we aren’t putting anything significant in
the environment.
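
As a rough way to check whether an environment is anywhere near a limit like the one mentioned above, the following Python sketch sums the bytes that each NAME=value entry would occupy when passed to a new process. It assumes each entry costs its encoded length plus one terminating NUL byte and ignores per-entry pointer overhead, so treat the result as an estimate. The helper names `env_block_size` and `largest_entries` are hypothetical.

```python
import os

def _entry_size(name, value):
    """Bytes for one NAME=value string plus its trailing NUL (an estimate)."""
    return len(f"{name}={value}".encode("utf-8", "surrogateescape")) + 1

def env_block_size(env):
    """Approximate total bytes used by all NAME=value strings in `env`."""
    return sum(_entry_size(k, v) for k, v in env.items())

def largest_entries(env, count=5):
    """Return the `count` largest entries as (name, size_in_bytes), biggest first."""
    sizes = [(k, _entry_size(k, v)) for k, v in env.items()]
    return sorted(sizes, key=lambda item: item[1], reverse=True)[:count]

if __name__ == "__main__":
    env = dict(os.environ)
    print("environment bytes:", env_block_size(env))
    for name, size in largest_entries(env):
        print(f"{size:8d}  {name}")
```

Note that this measures the environment of the process running the sketch, not that of the container; to measure the container's environment, the same functions could be applied to the variables exported in its launch script.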

From: John Lilley [mailto:john.lilley@redpoint.net]
Sent: Wednesday, March 26, 2014 4:41 PM
To: user@hadoop.apache.org
Subject: RE: Getting error message from AM container launch

We do have a fairly long container command-line.  Not huge, around 200 characters.

From: John Lilley [mailto:john.lilley@redpoint.net]
Sent: Wednesday, March 26, 2014 4:38 PM
To: user@hadoop.apache.org
Subject: Getting error message from AM container launch

Running a non-MapReduce YARN application, one of the containers launched by the AM is failing
with an error message I’ve never seen.  Any ideas?  I’m not sure who exactly is running
“nice” or why its argument list would be too long.

Container for appattempt_1395755163053_0030_000001 exited with  exitCode: 0 due to: Exception
from container-launch:
java.io.IOException: Cannot run program "nice" (in directory "/ephemeral02/hadoop/yarn/local/usercache/SYSTEM/appcache/application_1395755163053_0030/container_1395755163053_0030_01_000001"):
java.io.IOException: error=7, Argument list too long
                at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
                at org.apache.hadoop.util.Shell.runCommand(Shell.java:407)
                at org.apache.hadoop.util.Shell.run(Shell.java:379)
                at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:589)
                at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
                at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
                at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
                at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
                at java.util.concurrent.FutureTask.run(FutureTask.java:138)
                at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
                at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
                at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.IOException: java.io.IOException: error=7, Argument list too long
                at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
                at java.lang.ProcessImpl.start(ProcessImpl.java:65)
                at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
                ... 11 more
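
The "error=7" in the trace above appears to be the operating system's errno value passed through by the JVM. The Python sketch below looks up the symbolic name and message for a numeric errno and, where the platform exposes it, prints the combined limit for arguments plus environment. On Linux, 7 corresponds to E2BIG; other platforms may differ. The helper name `describe_errno` is hypothetical.

```python
import errno
import os

def describe_errno(code):
    """Return (symbolic_name, message) for a numeric errno value."""
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)

if __name__ == "__main__":
    print(describe_errno(7))
    # SC_ARG_MAX: combined limit for arguments plus environment, where available.
    if hasattr(os, "sysconf") and "SC_ARG_MAX" in os.sysconf_names:
        print("ARG_MAX:", os.sysconf("SC_ARG_MAX"))
```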
