hadoop-mapreduce-user mailing list archives

From Krishna Kishore Bonagiri <write2kish...@gmail.com>
Subject Re: Node manager crashing when running an app requiring 100 containers on hadoop-2.1.0-beta RC0
Date Thu, 01 Aug 2013 04:46:50 GMT
Hi Arun,
 I was running on a single-node cluster, so all of my 100+ containers were
on a single node. The problem went away when I increased YARN_HEAP_SIZE to
2GB.
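
For readers who hit the same error: assuming Linux, errno 11 is EAGAIN,
which for thread creation usually means the process hit a thread or process
limit, or could not get memory for another thread. The heap setting named
above is probably the YARN_HEAPSIZE variable (in MB) from yarn-env.sh; that
mapping is an assumption, since the message spells it YARN_HEAP_SIZE. Below
is a minimal Java sketch, not part of the original thread, for checking how
many threads a JVM currently has and what heap limit it runs with. The class
name ThreadDiagnostics and its methods are hypothetical, and the sketch only
reports on the JVM it runs in.

```java
import java.lang.management.ManagementFactory;

// Hypothetical helper (not from the original thread): reports thread and
// heap figures for the JVM in which it runs.
public class ThreadDiagnostics {

    // Live threads (daemon and non-daemon) in this JVM right now.
    public static int liveThreadCount() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    // Highest live-thread count since JVM start (or since the peak was reset).
    public static int peakThreadCount() {
        return ManagementFactory.getThreadMXBean().getPeakThreadCount();
    }

    // Maximum heap the JVM will attempt to use, in MiB.
    public static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("live threads: " + liveThreadCount());
        System.out.println("peak threads: " + peakThreadCount());
        System.out.println("max heap (MiB): " + maxHeapMiB());
    }
}
```

To look at a running NodeManager rather than a fresh JVM, the same
ThreadMXBean figures would have to be read over JMX or from a thread dump;
the sketch above does not do that.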

Thanks,
Kishore


On Thu, Aug 1, 2013 at 5:01 AM, Arun C Murthy <acm@hortonworks.com> wrote:

> How many containers are you running per node?
>
> On Jul 25, 2013, at 5:21 AM, Krishna Kishore Bonagiri <
> write2kishore@gmail.com> wrote:
>
> Hi Devaraj,
>
>  I used to run this application with the same number of containers
> successfully on the previous version, i.e. hadoop-2.0.4-alpha. Could it be
> failing with the new version because YARN itself now creates more threads
> than the previous versions did?
>
> Thanks,
> Kishore
>
>
> On Thu, Jul 25, 2013 at 4:24 PM, Devaraj k <devaraj.k@huawei.com> wrote:
>
>> Hi Kishore,
>>
>> It seems the system doesn’t have enough resources to launch a new
>> thread.
>>
>> Could you check whether the system can afford to launch the configured
>> number of containers, and try increasing the native memory available by
>> reducing the number of processes running on the system?
>>
>> Thanks
>> Devaraj k
>>
>> *From:* Krishna Kishore Bonagiri [mailto:write2kishore@gmail.com]
>> *Sent:* 25 July 2013 16:09
>> *To:* user@hadoop.apache.org
>> *Subject:* Node manager crashing when running an app requiring 100
>> containers on hadoop-2.1.0-beta RC0
>>
>> Hi,
>>
>> I am running an application against hadoop-2.1.0-beta RC0, and my app
>> requires 117 containers. All the containers were allocated, but while
>> starting them, at around the 99th container, the node manager went down
>> with the following error in its log. I could also reproduce this error by
>> running a "sleep 200; date" command with the Distributed Shell example,
>> in which case the error appeared at around the 66th container.
>>
>> 2013-07-25 06:07:17,743 FATAL org.apache.hadoop.yarn.YarnUncaughtExceptionHandler: Thread Thread[process reaper,5,main] threw an Error.  Shutting down now...
>> java.lang.OutOfMemoryError: Failed to create a thread: retVal -1073741830, errno 11
>>         at java.lang.Thread.startImpl(Native Method)
>>         at java.lang.Thread.start(Thread.java:887)
>>         at java.lang.ProcessInputStream.<init>(UNIXProcess.java:472)
>>         at java.lang.UNIXProcess$1$1$1.run(UNIXProcess.java:157)
>>         at java.security.AccessController.doPrivileged(AccessController.java:202)
>>         at java.lang.UNIXProcess$1$1.run(UNIXProcess.java:137)
>> 2013-07-25 06:07:17,745 INFO org.apache.hadoop.util.ExitUtil: Halt with status -1 Message: HaltException
>>
>> Thanks,
>> Kishore
>>
>
>
> --
> Arun C. Murthy
> Hortonworks Inc.
> http://hortonworks.com/
>
>
>
