hadoop-hdfs-user mailing list archives

From Wangda Tan <wheele...@gmail.com>
Subject Re: YARN CapacityScheduler stuck trying to fulfill reservation
Date Fri, 15 Apr 2016 00:39:20 GMT
It seems you hit MAPREDUCE-6302.

Patching it yourself or waiting for the 2.7.3 release should solve your problem.
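As a side note on the numbers in your log: the <memory:5000> reservation for a 3000 MB map request comes from the scheduler normalizing every request up to a multiple of yarn.scheduler.minimum-allocation-mb (2500 here). A quick sketch of that arithmetic, using the values from your config and log (round_up is just an illustrative helper, not a Hadoop API):

```python
def round_up(request_mb, min_alloc_mb):
    """Round a container request up to the next multiple of the minimum allocation,
    mirroring how YARN normalizes resource requests."""
    return ((request_mb + min_alloc_mb - 1) // min_alloc_mb) * min_alloc_mb

min_alloc = 2500                     # yarn.scheduler.minimum-allocation-mb
map_request = 3000                   # mapreduce.map.memory.mb
container = round_up(map_request, min_alloc)   # 5000, matching <memory:5000> in the log

node_mb = 6144                       # yarn.nodemanager.resource.memory-mb
free_after_one = node_mb - container # 1144 MB left on a node after one map container
print(container, free_after_one)
```

So one 5000 MB container already saturates a 6144 MB node, which is why the reserved container can only wait for a running one to finish; with the MAPREDUCE-6302 bug, that wait can become the loop you're seeing. Lowering the minimum allocation (or the map/reduce memory requests) would reduce the rounding waste, independent of the fix.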

On Wed, Apr 13, 2016 at 11:27 AM, Joseph Naegele <
jnaegele@grierforensics.com> wrote:

> I'm using Hadoop 2.7.1.
>
> I'm running an MR job on 9 nodes. Everything was working smoothly until it
> reached (map 99%, reduce 10%). Here are the relevant lines from my
> ResourceManager logs:
>
> 2016-04-13 14:19:07,930 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
> Trying to fulfill reservation for application
> application_1460557956992_0002 on node: ip-10-0-3-14:36536
> 2016-04-13 14:19:07,930 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue:
> Reserved container  application=application_1460557956992_0002
> resource=<memory:5000, vCores:1> queue=default: capacity=1.0,
> absoluteCapacity=1.0, usedResources=<memory:47500, vCores:10>,
> usedCapacity=0.8590133, absoluteUsedCapacity=0.8590133, numApps=1,
> numContainers=10 usedCapacity=0.8590133 absoluteUsedCapacity=0.8590133
> used=<memory:47500, vCores:10> cluster=<memory:55296, vCores:27>
> 2016-04-13 14:19:07,930 INFO
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
> Skipping scheduling since node ip-10-0-3-14:36536 is reserved by
> application appattempt_1460557956992_0002_000001
>
> Those three lines have repeated for the past hour and the MR job has not
> progressed. The node in question (ip-10-0-3-14) is running the
> ApplicationMaster. From what I can tell, I'm at capacity and the scheduler
> is stuck, unable to allocate the next container it needs, although my
> understanding is pretty limited.
>
> Here are the resource sections of my mapred-site.xml and yarn-site.xml:
>
> yarn-site.xml:
> <property>
>   <name>yarn.nodemanager.resource.memory-mb</name>
>   <value>6144</value>
> </property>
> <property>
>   <name>yarn.nodemanager.resource.cpu-vcores</name>
>   <value>3</value>
> </property>
> <property>
>   <name>yarn.scheduler.minimum-allocation-mb</name>
>   <value>2500</value>
> </property>
> <property>
>   <name>yarn.scheduler.minimum-allocation-vcores</name>
>   <value>1</value>
> </property>
>
> mapred-site.xml:
> <property>
>   <name>mapreduce.map.memory.mb</name>
>   <value>3000</value>
> </property>
> <property>
>   <name>mapreduce.map.cpu.vcores</name>
>   <value>1</value>
> </property>
> <property>
>   <name>mapreduce.reduce.memory.mb</name>
>   <value>3000</value>
> </property>
> <property>
>   <name>mapreduce.reduce.cpu.vcores</name>
>   <value>2</value>
> </property>
> <property>
>   <name>mapreduce.map.java.opts</name>
>   <value>-Xmx896m</value>
> </property>
> <property>
>   <name>mapreduce.reduce.java.opts</name>
>   <value>-Xmx1536m</value>
> </property>
>
> Any ideas as to what's going on, or how to prevent this?
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@hadoop.apache.org
> For additional commands, e-mail: user-help@hadoop.apache.org
>
>
