hadoop-common-user mailing list archives

From Arun C Murthy <...@hortonworks.com>
Subject Re: Yarn job stuck with no application master being assigned
Date Sat, 22 Jun 2013 22:46:42 GMT
Siddhi,

On Jun 21, 2013, at 6:07 PM, Siddhi Mehta <smehtauser@gmail.com> wrote:

> That solved the problem. Thanks Sandy!!
> 
> What is the optimal setting for yarn.scheduler.capacity.maximum-am-resource-percent in terms of the node manager?
> What are the consequences of setting it to a higher value?

This means that more AMs will be active concurrently.

One thing to remember: in terms of getting *real* work done, an AM is (currently) essentially pure overhead, in the sense that it does no actual data processing. That is true of the MR AM; an AM *may* of course choose to do some actual work itself - it really depends on the implementation.

With that context: on a very small cluster, a higher value for yarn.scheduler.capacity.maximum-am-resource-percent means more containers end up running AMs, and overall utilization may drop. That is the trade-off to be aware of.
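
For reference, this setting lives in capacity-scheduler.xml; the 0.5 below is simply the value Sandy suggested earlier in the thread (the default is 0.1), so treat it as an illustration rather than a recommendation:

<property>
  <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
  <value>0.5</value>
  <description>Maximum fraction of cluster resources that may be used to run application masters.</description>
</property>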


> Also, I noticed that by default the application master needs 1.5GB. Are there any side effects we will face if I lower that to 1GB?

I have tried AMs with as low as 200MB for small jobs. It really depends on how many tasks you want your job to manage.
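
For example, lowering the MR AM container to 1GB is done via yarn.app.mapreduce.am.resource.mb in mapred-site.xml; if you do, keep the AM heap (yarn.app.mapreduce.am.command-opts) comfortably below the container size. A sketch, where the -Xmx value is only illustrative:

<property>
  <name>yarn.app.mapreduce.am.resource.mb</name>
  <value>1024</value>
</property>
<property>
  <name>yarn.app.mapreduce.am.command-opts</name>
  <value>-Xmx768m</value>
</property>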

Arun

> 
> Siddhi
> 
> 
> On Fri, Jun 21, 2013 at 4:28 PM, Sandy Ryza <sandy.ryza@cloudera.com> wrote:
> Hi Siddhi,
> 
> Moving this question to the CDH list.
> 
> Does setting yarn.scheduler.capacity.maximum-am-resource-percent to .5 help?
> 
> Have you tried using the Fair Scheduler?
> 
> -Sandy
> 
> 
> On Fri, Jun 21, 2013 at 4:21 PM, Siddhi Mehta <smehtauser@gmail.com> wrote:
> Hey All,
> 
> I am running a Hadoop 2.0 (cdh4.2.1) cluster on a single node with 1 NodeManager.
> 
> We have a map-only job that launches a Pig job on the cluster (similar to what Oozie does).
> 
> We are seeing that the map-only job launches the Pig script, but the Pig job is stuck in the ACCEPTED state with no trackingUI assigned.
> 
> I don't see any errors in the nodemanager logs or the resource manager logs as such.
> 
> 
> On the nodemanager I see these logs:
> 2013-06-21 15:05:13,084 INFO  capacity.ParentQueue - assignedContainer queue=root usedCapacity=0.4 absoluteUsedCapacity=0.4 used=memory: 2048 cluster=memory: 5120
> 
> 2013-06-21 15:05:38,898 INFO  capacity.CapacityScheduler - Application Submission: appattempt_1371850881510_0003_000001, user: smehta queue: default: capacity=1.0, absoluteCapacity=1.0, usedResources=2048MB, usedCapacity=0.4, absoluteUsedCapacity=0.4, numApps=2, numContainers=2, currently active: 2
> 
> This suggests that the cluster has capacity, but still no application master is assigned to it.
> What am I missing? Any help is appreciated.
> 
> I keep seeing these logs on the node manager:
> 2013-06-21 16:19:37,675 INFO  monitor.ContainersMonitorImpl - Memory usage of ProcessTree 12484 for container-id container_1371850881510_0002_01_000002: 157.1mb of 1.0gb physical memory used; 590.1mb of 2.1gb virtual memory used
> 2013-06-21 16:19:37,696 INFO  monitor.ContainersMonitorImpl - Memory usage of ProcessTree 12009 for container-id container_1371850881510_0002_01_000001: 181.0mb of 1.0gb physical memory used; 1.4gb of 2.1gb virtual memory used
> 2013-06-21 16:19:37,946 INFO  nodemanager.NodeStatusUpdaterImpl - Sending out status for container: container_id {, app_attempt_id {, application_id {, id: 2, cluster_timestamp: 1371850881510, }, attemptId: 1, }, id: 1, }, state: C_RUNNING, diagnostics: "", exit_status: -1000, 
> 2013-06-21 16:19:37,946 INFO  nodemanager.NodeStatusUpdaterImpl - Sending out status for container: container_id {, app_attempt_id {, application_id {, id: 2, cluster_timestamp: 1371850881510, }, attemptId: 1, }, id: 2, }, state: C_RUNNING, diagnostics: "", exit_status: -1000, 
> 2013-06-21 16:19:38,948 INFO  nodemanager.NodeStatusUpdaterImpl - Sending out status for container: container_id {, app_attempt_id {, application_id {, id: 2, cluster_timestamp: 1371850881510, }, attemptId: 1, }, id: 1, }, state: C_RUNNING, diagnostics: "", exit_status: -1000, 
> 2013-06-21 16:19:38,948 INFO  nodemanager.NodeStatusUpdaterImpl - Sending out status for container: container_id {, app_attempt_id {, application_id {, id: 2, cluster_timestamp: 1371850881510, }, attemptId: 1, }, id: 2, }, state: C_RUNNING, diagnostics: "", exit_status: -1000, 
> 2013-06-21 16:19:39,950 INFO  nodemanager.NodeStatusUpdaterImpl - Sending out status for container: container_id {, app_attempt_id {, application_id {, id: 2, cluster_timestamp: 1371850881510, }, attemptId: 1, }, id: 1, }, state: C_RUNNING, diagnostics: "", exit_status: -1000, 
> 2013-06-21 16:19:39,950 INFO  nodemanager.NodeStatusUpdaterImpl - Sending out status for container: container_id {, app_attempt_id {, application_id {, id: 2, cluster_timestamp: 1371850881510, }, attemptId: 1, }, id: 2, }, state: C_RUNNING, diagnostics: "", exit_status: -1000, 
> 
> Here are my memory configurations
> 
> <property>
> <name>yarn.nodemanager.resource.memory-mb</name>
> <value>5120</value>
> <source>yarn-site.xml</source>
> </property>
> 
> <property>
> <name>mapreduce.map.memory.mb</name>
> <value>512</value>
> <source>mapred-site.xml</source>
> </property>
> 
> <property>
> <name>mapreduce.reduce.memory.mb</name>
> <value>512</value>
> <source>mapred-site.xml</source>
> </property>
> 
> <property>
> <name>mapred.child.java.opts</name>
> <value>
> -Xmx512m -Djava.net.preferIPv4Stack=true -XX:+UseCompressedOops -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/home/sfdc/logs/hadoop/userlogs/@taskid@/
> </value>
> <source>mapred-site.xml</source>
> </property>
> 
> <property>
> <name>yarn.app.mapreduce.am.resource.mb</name>
> <value>1024</value>
> <source>mapred-site.xml</source>
> </property>
> 
> Regards,
> Siddhi
> 
> 

--
Arun C. Murthy
Hortonworks Inc.
http://hortonworks.com/


