hadoop-hdfs-user mailing list archives

From Krishna Kishore Bonagiri <write2kish...@gmail.com>
Subject Re: 100% CPU consumption by Resource Manager process
Date Tue, 19 Aug 2014 05:16:16 GMT
Thanks Wangda, I think I reduced this value when I was trying to reduce the
container allocation time.
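
For reference, a sketch of the corresponding entry in yarn-site.xml (property
name as quoted in this thread; 1000 ms is the value Wangda suggests, not
something verified as optimal here):

```xml
<!-- NodeManager -> ResourceManager heartbeat interval, in milliseconds.
     50 ms is aggressive; a larger value such as 1000 reduces RM load. -->
<property>
  <name>yarn.resourcemanager.nodemanagers.heartbeat-interval-ms</name>
  <value>1000</value>
</property>
```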

-Kishore


On Tue, Aug 19, 2014 at 7:39 AM, Wangda Tan <wheeleast@gmail.com> wrote:

> Hi Krishna,
>
> 4) What's the "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms" in
> your configuration?
> 50
>
> I think this config is problematic; too small a heartbeat interval will
> cause the NM to contact the RM too often. I would suggest setting this
> value larger, e.g. 1000.
>
> Thanks,
> Wangda
>
>
>
> On Wed, Aug 13, 2014 at 4:42 PM, Krishna Kishore Bonagiri <
> write2kishore@gmail.com> wrote:
>
>> Hi Wangda,
>>   Thanks for the reply. Here are the details; please see if you can
>> suggest anything.
>>
>> 1) Number of nodes and running app in the cluster
>> 2 nodes, and I am running my own application that keeps asking for
>> containers. It
>> a) runs something on the containers,
>> b) releases the containers,
>> c) asks for more containers with an incremented priority value, and
>> repeats the same process.
>>
>> 2) What's the version of your Hadoop?
>> apache hadoop-2.4.0
>>
>> 3) Have you set
>> "yarn.scheduler.capacity.schedule-asynchronously.enable"=true?
>> No
>>
>> 4) What's the "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms"
>> in your configuration?
>> 50
>>
>>
>>
>>
>> On Tue, Aug 12, 2014 at 12:44 PM, Wangda Tan <wheeleast@gmail.com> wrote:
>>
>>> Hi Krishna,
>>> To get more understanding about the problem, could you please share
>>> following information:
>>> 1) Number of nodes and running app in the cluster
>>> 2) What's the version of your Hadoop?
>>> 3) Have you set
>>> "yarn.scheduler.capacity.schedule-asynchronously.enable"=true?
>>> 4) What's the "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms"
>>> in your configuration?
>>>
>>> Thanks,
>>> Wangda Tan
>>>
>>>
>>>
>>> On Sun, Aug 10, 2014 at 11:29 PM, Krishna Kishore Bonagiri <
>>> write2kishore@gmail.com> wrote:
>>>
>>>> Hi,
>>>>   My YARN ResourceManager is consuming 100% CPU while running an
>>>> application that runs for about 10 hours and requests as many as 27000
>>>> containers. The CPU consumption was very low at the start of my
>>>> application and gradually rose above 100%. Is this a known issue, or
>>>> are we doing something wrong?
>>>>
>>>> Every dump shows the Event Processor thread running
>>>> LeafQueue::assignContainers(), specifically the for loop below from
>>>> LeafQueue.java; it seems to be looping through some priority list.
>>>>
>>>>     // Try to assign containers to applications in order
>>>>     for (FiCaSchedulerApp application : activeApplications) {
>>>> ...
>>>>         // Schedule in priority order
>>>>         for (Priority priority : application.getPriorities()) {
>>>>
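
A hedged, standalone sketch (not the actual YARN code) of why this inner loop
can grow over a long run: if each round's new, incremented priority stays in
the app's priority set, the walk over application.getPriorities() gets one
entry longer per round, so total work is quadratic in the number of rounds.
The class and method names here are hypothetical:

```java
import java.util.TreeSet;

public class PriorityGrowthSketch {
    // Counts how many inner-loop iterations accumulate over `rounds`
    // scheduling rounds when every round adds one new priority that is
    // never removed (a stand-in for application.getPriorities()).
    static long totalInnerIterations(int rounds) {
        TreeSet<Integer> priorities = new TreeSet<>();
        long total = 0;
        for (int round = 1; round <= rounds; round++) {
            priorities.add(round);       // new, incremented priority
            total += priorities.size();  // inner loop walks all of them
        }
        return total;
    }

    public static void main(String[] args) {
        // 27000 rounds -> 1 + 2 + ... + 27000 iterations in total
        System.out.println(totalInnerIterations(27000)); // prints 364513500
    }
}
```

With a fixed (non-incrementing) priority the same loop would do roughly
constant work per round, which would match the CPU climbing gradually rather
than being high from the start.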
>>>> 3XMTHREADINFO      "ResourceManager Event Processor"
>>>> J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
>>>> java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
>>>> 3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
>>>> 3XMTHREADINFO1            (native thread ID:0x4B64, native
>>>> priority:0x5, native policy:UNKNOWN)
>>>> 3XMTHREADINFO2            (native stack address range
>>>> from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
>>>> 3XMCPUTIME               *CPU usage total: 42334.614623696 secs*
>>>> 3XMHEAPALLOC             Heap bytes allocated since last GC cycle=20456
>>>> (0x4FE8)
>>>> 3XMTHREADINFO3           Java callstack:
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:850(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
>>>> entry count: 1)
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
>>>> entry count: 1)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>>>> entry count: 2)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>>>> entry count: 1)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
>>>> entry count: 1)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
>>>> Code))
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
>>>> Code))
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>>>> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>>>>
>>>> 3XMTHREADINFO      "ResourceManager Event Processor"
>>>> J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
>>>> java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
>>>> 3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
>>>> 3XMTHREADINFO1            (native thread ID:0x4B64, native
>>>> priority:0x5, native policy:UNKNOWN)
>>>> 3XMTHREADINFO2            (native stack address range
>>>> from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
>>>> 3XMCPUTIME               CPU usage total: 42379.604203548 secs
>>>> 3XMHEAPALLOC             Heap bytes allocated since last GC cycle=57280
>>>> (0xDFC0)
>>>> 3XMTHREADINFO3           Java callstack:
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:841(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
>>>> entry count: 1)
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
>>>> entry count: 1)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>>>> entry count: 2)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>>>> entry count: 1)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
>>>> entry count: 1)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
>>>> Code))
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
>>>> Code))
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>>>> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>>>>
>>>> 3XMTHREADINFO      "ResourceManager Event Processor"
>>>> J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
>>>> java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
>>>> 3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
>>>> 3XMTHREADINFO1            (native thread ID:0x4B64, native
>>>> priority:0x5, native policy:UNKNOWN)
>>>> 3XMTHREADINFO2            (native stack address range
>>>> from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
>>>> 3XMCPUTIME               CPU usage total: 42996.394528764 secs
>>>> 3XMHEAPALLOC             Heap bytes allocated since last GC
>>>> cycle=475576 (0x741B8)
>>>> 3XMTHREADINFO3           Java callstack:
>>>> 4XESTACKTRACE                at
>>>> java/util/TreeMap.successor(TreeMap.java:2001(Compiled Code))
>>>> 4XESTACKTRACE                at
>>>> java/util/TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1127(Compiled
>>>> Code))
>>>> 4XESTACKTRACE                at
>>>> java/util/TreeMap$KeyIterator.next(TreeMap.java:1180(Compiled Code))
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:838(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
>>>> entry count: 1)
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
>>>> entry count: 1)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>>>> entry count: 2)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>>>> entry count: 1)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
>>>> Code))
>>>> 5XESTACKTRACE                   (entered lock:
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
>>>> entry count: 1)
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
>>>> Code))
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
>>>> Code))
>>>> 4XESTACKTRACE                at
>>>> org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>>>> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>>>>
>>>> Thanks,
>>>> Kishore
>>>>
>>>
>>>
>>
>
