hadoop-mapreduce-user mailing list archives

From Krishna Kishore Bonagiri <write2kish...@gmail.com>
Subject Re: Node manager crashing when running an app requiring 100 containers on hadoop-2.1.0-beta RC0
Date Thu, 25 Jul 2013 12:21:30 GMT
Hi Devaraj,

I used to run this application with the same number of containers
successfully on the previous version, i.e. hadoop-2.0.4-alpha. Is it failing
with the new version because YARN itself is creating more threads than
previous versions did?

Thanks,
Kishore


On Thu, Jul 25, 2013 at 4:24 PM, Devaraj k <devaraj.k@huawei.com> wrote:

> Hi Kishore,
>
> It seems that the system doesn't have enough resources to launch a new
> thread.
>
> Could you check whether the system can afford to launch the configured
> containers, and try increasing the native memory available in the system by
> reducing the number of running processes in the system?
>
> Thanks
> Devaraj k
>
> *From:* Krishna Kishore Bonagiri [mailto:write2kishore@gmail.com]
> *Sent:* 25 July 2013 16:09
> *To:* user@hadoop.apache.org
> *Subject:* Node manager crashing when running an app requiring 100
> containers on hadoop-2.1.0-beta RC0
>
> Hi,
>
> I am running an application against hadoop-2.1.0-beta RC, and my app
> requires 117 containers. All the containers were allocated, but while
> starting them, at around the 99th container, the node manager went
> down with the following kind of error in its log. I could also
> reproduce this error by running a "sleep 200; date" command using the
> Distributed Shell example, in which case I got the error at around the
> 66th container.
>
> 2013-07-25 06:07:17,743 FATAL org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[process reaper,5,main] threw an Error.  Shutting down now...
> java.lang.OutOfMemoryError: Failed to create a thread: retVal -1073741830, errno 11
>         at java.lang.Thread.startImpl(Native Method)
>         at java.lang.Thread.start(Thread.java:887)
>         at java.lang.ProcessInputStream.<init>(UNIXProcess.java:472)
>         at java.lang.UNIXProcess$1$1$1.run(UNIXProcess.java:157)
>         at java.security.AccessController.doPrivileged(AccessController.java:202)
>         at java.lang.UNIXProcess$1$1.run(UNIXProcess.java:137)
> 2013-07-25 06:07:17,745 INFO org.apache.hadoop.util.ExitUtil: Halt with status -1 Message: HaltException
>
> Thanks,
> Kishore
>
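
A note for readers hitting the same trace: on Linux, errno 11 is EAGAIN, which thread creation returns when a resource limit or a shortage of system resources prevents a new thread from starting. One thing that may be worth checking, in addition to free memory, is how many threads the NodeManager process holds compared with the per-user "Max processes" limit. The Python sketch below assumes a Linux host with /proc mounted; the function names (max_processes_soft_limit, thread_count, report) are made up for illustration and are not part of Hadoop.

```python
import re
from pathlib import Path


def max_processes_soft_limit(limits_text):
    """Return the soft 'Max processes' limit from /proc/<pid>/limits text.

    Returns None when the limit is 'unlimited' or the row is missing.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max processes"):
            # Remaining columns: soft limit, hard limit, units
            fields = line[len("Max processes"):].split()
            if fields and fields[0].isdigit():
                return int(fields[0])
            return None
    return None


def thread_count(status_text):
    """Return the 'Threads:' value from /proc/<pid>/status text, or None."""
    match = re.search(r"^Threads:\s+(\d+)", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def report(pid):
    """Print the thread count and soft process limit for a Linux PID."""
    proc = Path("/proc") / str(pid)
    limit = max_processes_soft_limit((proc / "limits").read_text())
    threads = thread_count((proc / "status").read_text())
    print(f"pid {pid}: threads={threads}, max processes (soft)={limit}")
```

Calling report() with the NodeManager's PID while containers are starting shows whether the thread count is growing toward the limit. The limit applies to all processes and threads of the same user, so container processes launched under that user count against it as well; this is only one possible cause of the error above.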
