hadoop-hdfs-user mailing list archives

From Krishna Kishore Bonagiri <write2kish...@gmail.com>
Subject Re: 100% CPU consumption by Resource Manager process
Date Wed, 13 Aug 2014 08:42:48 GMT
Hi Wangda,
  Thanks for the reply. Here are the details; please let me know if you
can suggest anything.

1) Number of nodes and running app in the cluster
2 nodes. I am running my own application, which keeps asking for
containers in a loop:
a) run something on the containers,
b) release the containers,
c) ask for more containers with an incremented priority value, and repeat
the same process.

2) What's the version of your Hadoop?
apache hadoop-2.4.0

3) Have you set
"yarn.scheduler.capacity.schedule-asynchronously.enable"=true?
No

4) What's the "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms" in
your configuration?
50
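
To make the pattern in answers 1) and 4) concrete, here is a minimal
sketch in plain Java. It is not Hadoop code. The class and method names are
hypothetical, and it rests on two assumptions that I have not confirmed
against the 2.4.0 source: that each new priority value stays in the
application's sorted priority set after its containers are released, and
that every node heartbeat triggers one scheduling pass over that set.

```java
import java.util.TreeSet;

// Hypothetical model of the request pattern; NOT the scheduler's real code.
class PriorityGrowthSketch {

    // Runs `rounds` request/release cycles, each at a new priority value, then
    // returns how many priorities a single scheduling pass would iterate over.
    // removeWhenDone=false models the ASSUMPTION that finished priorities are
    // never dropped from the set.
    static int prioritiesVisitedPerPass(int rounds, boolean removeWhenDone) {
        TreeSet<Integer> priorities = new TreeSet<>();
        for (int priority = 0; priority < rounds; priority++) {
            priorities.add(priority);            // ask at an incremented priority
            if (removeWhenDone) {
                priorities.remove(priority);     // forget it after release
            }
        }
        int visited = 0;
        for (Integer ignored : priorities) {     // stands in for the priority loop
            visited++;
        }
        return visited;
    }

    // Heartbeats per second across the cluster for a given heartbeat interval.
    static double heartbeatsPerSecond(int nodes, int intervalMs) {
        return nodes * 1000.0 / intervalMs;
    }

    public static void main(String[] args) {
        int rounds = 27000;  // ASSUMPTION: one distinct priority per container
        double passes = heartbeatsPerSecond(2, 50);  // 2 nodes, 50 ms interval
        int kept = prioritiesVisitedPerPass(rounds, false);
        int dropped = prioritiesVisitedPerPass(rounds, true);
        System.out.println("Passes per second (one per heartbeat, assumed): " + passes);
        System.out.println("Priorities visited per pass, never removed: " + kept);
        System.out.println("Priorities visited per pass, removed when done: " + dropped);
        System.out.println("Priority visits per second, never removed: "
                + (long) (passes * kept));
    }
}
```

If both assumptions hold, the work per pass grows with every new priority
value, which would match CPU usage that starts low and climbs over the
10-hour run.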




On Tue, Aug 12, 2014 at 12:44 PM, Wangda Tan <wheeleast@gmail.com> wrote:

> Hi Krishna,
> To get more understanding about the problem, could you please share
> following information:
> 1) Number of nodes and running app in the cluster
> 2) What's the version of your Hadoop?
> 3) Have you set
> "yarn.scheduler.capacity.schedule-asynchronously.enable"=true?
> 4) What's the "yarn.resourcemanager.nodemanagers.heartbeat-interval-ms" in
> your configuration?
>
> Thanks,
> Wangda Tan
>
>
>
> On Sun, Aug 10, 2014 at 11:29 PM, Krishna Kishore Bonagiri <
> write2kishore@gmail.com> wrote:
>
>> Hi,
>>   My YARN resource manager is consuming 100% CPU while I am running an
>> application that has been running for about 10 hours and has requested as
>> many as 27000 containers. The CPU consumption was very low at the start of
>> my application, and it gradually rose to over 100%. Is this a known issue,
>> or are we doing something wrong?
>>
>> Every dump of the Event Processor thread shows it running
>> LeafQueue::assignContainers(), specifically the for loop below from
>> LeafQueue.java, and it seems to be looping through some priority list.
>>
>>     // Try to assign containers to applications in order
>>     for (FiCaSchedulerApp application : activeApplications) {
>> ...
>>         // Schedule in priority order
>>         for (Priority priority : application.getPriorities()) {
>>
>> 3XMTHREADINFO      "ResourceManager Event Processor"
>> J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
>> java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
>> 3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
>> 3XMTHREADINFO1            (native thread ID:0x4B64, native priority:0x5,
>> native policy:UNKNOWN)
>> 3XMTHREADINFO2            (native stack address range
>> from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
>> 3XMCPUTIME               *CPU usage total: 42334.614623696 secs*
>> 3XMHEAPALLOC             Heap bytes allocated since last GC cycle=20456
>> (0x4FE8)
>> 3XMTHREADINFO3           Java callstack:
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:850(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
>> entry count: 1)
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
>> entry count: 1)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>> entry count: 2)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>> entry count: 1)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
>> entry count: 1)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
>> Code))
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
>> Code))
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>>
>> 3XMTHREADINFO      "ResourceManager Event Processor"
>> J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
>> java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
>> 3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
>> 3XMTHREADINFO1            (native thread ID:0x4B64, native priority:0x5,
>> native policy:UNKNOWN)
>> 3XMTHREADINFO2            (native stack address range
>> from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
>> 3XMCPUTIME               CPU usage total: 42379.604203548 secs
>> 3XMHEAPALLOC             Heap bytes allocated since last GC cycle=57280
>> (0xDFC0)
>> 3XMTHREADINFO3           Java callstack:
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:841(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
>> entry count: 1)
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
>> entry count: 1)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>> entry count: 2)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>> entry count: 1)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
>> entry count: 1)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
>> Code))
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
>> Code))
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>>
>> 3XMTHREADINFO      "ResourceManager Event Processor"
>> J9VMThread:0x0000000001D08600, j9thread_t:0x00007F032D2FAA00,
>> java/lang/Thread:0x000000008341D9A0, state:CW, prio=5
>> 3XMJAVALTHREAD            (java/lang/Thread getId:0x1E, isDaemon:false)
>> 3XMTHREADINFO1            (native thread ID:0x4B64, native priority:0x5,
>> native policy:UNKNOWN)
>> 3XMTHREADINFO2            (native stack address range
>> from:0x00007F0313DF8000, to:0x00007F0313E39000, size:0x41000)
>> 3XMCPUTIME               CPU usage total: 42996.394528764 secs
>> 3XMHEAPALLOC             Heap bytes allocated since last GC cycle=475576
>> (0x741B8)
>> 3XMTHREADINFO3           Java callstack:
>> 4XESTACKTRACE                at
>> java/util/TreeMap.successor(TreeMap.java:2001(Compiled Code))
>> 4XESTACKTRACE                at
>> java/util/TreeMap$PrivateEntryIterator.nextEntry(TreeMap.java:1127(Compiled
>> Code))
>> 4XESTACKTRACE                at
>> java/util/TreeMap$KeyIterator.next(TreeMap.java:1180(Compiled Code))
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.assignContainers(LeafQueue.java:838(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/common/fica/FiCaSchedulerApp@0x000000008360DFE0,
>> entry count: 1)
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue@0x00000000833B9280,
>> entry count: 1)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainersToChildQueues(ParentQueue.java:655(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>> entry count: 2)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue.assignContainers(ParentQueue.java:569(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/ParentQueue@0x0000000083360A80,
>> entry count: 1)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.allocateContainersToNode(CapacityScheduler.java:831(Compiled
>> Code))
>> 5XESTACKTRACE                   (entered lock:
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler@0x00000000834037C8,
>> entry count: 1)
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:878(Compiled
>> Code))
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacityScheduler.handle(CapacityScheduler.java:100(Compiled
>> Code))
>> 4XESTACKTRACE                at
>> org/apache/hadoop/yarn/server/resourcemanager/ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:591)
>> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:853)
>>
>> Thanks,
>> Kishore
>>
>
>
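
The three dumps quoted above each carry a "CPU usage total" figure for the
Event Processor thread, and the difference between consecutive figures shows
how much CPU time the thread burned between dumps. Here is a minimal sketch
in plain Java that pulls those figures out of javacore text and computes the
differences. The class and method names are hypothetical. The dumps as
quoted do not include wall-clock timestamps, so this gives CPU seconds
consumed, not a CPU percentage.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading 3XMCPUTIME lines; not part of any IBM tool.
class CpuTimeDeltaSketch {

    // Matches e.g. "CPU usage total: 42334.614623696 secs" anywhere in the text.
    private static final Pattern CPU_LINE =
            Pattern.compile("CPU usage total:\\s*([0-9]+(?:\\.[0-9]+)?)\\s*secs");

    // Returns every CPU total found in the text, in order of appearance.
    static List<Double> cpuTotals(String dumpText) {
        List<Double> totals = new ArrayList<>();
        Matcher matcher = CPU_LINE.matcher(dumpText);
        while (matcher.find()) {
            totals.add(Double.parseDouble(matcher.group(1)));
        }
        return totals;
    }

    // Returns the difference between each total and the one before it.
    static List<Double> deltas(List<Double> totals) {
        List<Double> result = new ArrayList<>();
        for (int i = 1; i < totals.size(); i++) {
            result.add(totals.get(i) - totals.get(i - 1));
        }
        return result;
    }

    public static void main(String[] args) {
        // The three 3XMCPUTIME lines from the dumps in this thread.
        String dumps = String.join("\n",
                "3XMCPUTIME               *CPU usage total: 42334.614623696 secs*",
                "3XMCPUTIME               CPU usage total: 42379.604203548 secs",
                "3XMCPUTIME               CPU usage total: 42996.394528764 secs");
        for (double delta : deltas(cpuTotals(dumps))) {
            System.out.printf("CPU seconds since previous dump: %.3f%n", delta);
        }
    }
}
```

Dividing each difference by the wall-clock time between the two dumps would
give the thread's CPU utilization over that interval.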
