hadoop-mapreduce-user mailing list archives

From Siddhi Mehta <smehtau...@gmail.com>
Subject Re: Yarn job stuck with no application master being assigned
Date Sat, 22 Jun 2013 01:07:40 GMT
That solved the problem. Thanks Sandy!!

What is the optimal setting for
yarn.scheduler.capacity.maximum-am-resource-percent
in terms of the NodeManager's capacity?
What are the consequences of setting it to a higher value?
Also, I noticed that by default the application master needs 1.5GB. Are there
any side effects if I lower that to 1GB?
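For the archives, my current understanding of why the default blocked us
(someone correct me if I'm off): maximum-am-resource-percent defaults to 0.1,
so on our 5120MB cluster only 0.1 * 5120 = 512MB is budgeted for application
masters. Each of our AMs requests 1024MB (yarn.app.mapreduce.am.resource.mb),
and as far as I can tell the scheduler always admits at least one AM per
queue, so the launcher job's AM ran but the Pig job's AM stayed in ACCEPTED.
At 0.5 the AM budget becomes 2560MB, which fits both 1024MB AMs. For
reference, the change we made (0.5 is just Sandy's suggestion, not a tuned
number):

<property>
<!-- in capacity-scheduler.xml; fraction of cluster memory usable by AMs.
The default of 0.1 allowed only 512MB of AMs on our 5120MB cluster. -->
<name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
<value>0.5</value>
</property>

And if we do lower the AM container to 1GB, I assume we'd also want to keep
its heap below the container size, e.g. setting
yarn.app.mapreduce.am.command-opts to -Xmx768m, so the container isn't killed
for exceeding its physical memory limit.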

Siddhi
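
P.S. On the Fair Scheduler question: we haven't tried it yet. If I read the
docs correctly, switching is just a matter of pointing
yarn.resourcemanager.scheduler.class at the FairScheduler in yarn-site.xml
(untested on our end):

<property>
<!-- untested here: swaps the CapacityScheduler for the FairScheduler -->
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>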


On Fri, Jun 21, 2013 at 4:28 PM, Sandy Ryza <sandy.ryza@cloudera.com> wrote:

> Hi Siddhi,
>
> Moving this question to the CDH list.
>
> Does setting yarn.scheduler.capacity.maximum-am-resource-percent to .5
> help?
>
> Have you tried using the Fair Scheduler?
>
> -Sandy
>
>
> On Fri, Jun 21, 2013 at 4:21 PM, Siddhi Mehta <smehtauser@gmail.com> wrote:
>
>> Hey All,
>>
>> I am running a Hadoop 2.0 (CDH4.2.1) cluster on a single node with 1
>> NodeManager.
>>
>> We have a map-only job that launches a Pig job on the cluster (similar to
>> what Oozie does).
>>
>> We are seeing that the map-only job launches the Pig script, but the Pig
>> job is stuck in the ACCEPTED state with no tracking UI assigned.
>>
>> I don't see any errors in the NodeManager or ResourceManager logs as such.
>>
>>
>> On the ResourceManager I see these scheduler logs:
>> 2013-06-21 15:05:13,084 INFO  capacity.ParentQueue - assignedContainer
>> queue=root usedCapacity=0.4 absoluteUsedCapacity=0.4 used=memory: 2048
>> cluster=memory: 5120
>>
>> 2013-06-21 15:05:38,898 INFO  capacity.CapacityScheduler - Application
>> Submission: appattempt_1371850881510_0003_000001, user: smehta queue:
>> default: capacity=1.0, absoluteCapacity=1.0, usedResources=2048MB,
>> usedCapacity=0.4, absoluteUsedCapacity=0.4, numApps=2, numContainers=2,
>> currently active: 2
>>
>> This suggests the cluster has capacity, yet no application master is being
>> assigned. What am I missing? Any help is appreciated.
>>
>> I keep seeing these logs on the NodeManager:
>> 2013-06-21 16:19:37,675 INFO  monitor.ContainersMonitorImpl - Memory
>> usage of ProcessTree 12484 for container-id
>> container_1371850881510_0002_01_000002: 157.1mb of 1.0gb physical memory
>> used; 590.1mb of 2.1gb virtual memory used
>> 2013-06-21 16:19:37,696 INFO  monitor.ContainersMonitorImpl - Memory
>> usage of ProcessTree 12009 for container-id
>> container_1371850881510_0002_01_000001: 181.0mb of 1.0gb physical memory
>> used; 1.4gb of 2.1gb virtual memory used
>> 2013-06-21 16:19:37,946 INFO  nodemanager.NodeStatusUpdaterImpl - Sending
>> out status for container: container_id {, app_attempt_id {, application_id
>> {, id: 2, cluster_timestamp: 1371850881510, }, attemptId: 1, }, id: 1, },
>> state: C_RUNNING, diagnostics: "", exit_status: -1000,
>> 2013-06-21 16:19:37,946 INFO  nodemanager.NodeStatusUpdaterImpl - Sending
>> out status for container: container_id {, app_attempt_id {, application_id
>> {, id: 2, cluster_timestamp: 1371850881510, }, attemptId: 1, }, id: 2, },
>> state: C_RUNNING, diagnostics: "", exit_status: -1000,
>> [snip: the same two C_RUNNING status lines repeat every second for both
>> containers]
>>
>> Here are my memory configurations:
>>
>> <property>
>> <name>yarn.nodemanager.resource.memory-mb</name>
>> <value>5120</value>
>> <source>yarn-site.xml</source>
>> </property>
>>
>> <property>
>> <name>mapreduce.map.memory.mb</name>
>> <value>512</value>
>> <source>mapred-site.xml</source>
>> </property>
>>
>> <property>
>> <name>mapreduce.reduce.memory.mb</name>
>> <value>512</value>
>> <source>mapred-site.xml</source>
>> </property>
>>
>> <property>
>> <name>mapred.child.java.opts</name>
>> <value>
>> -Xmx512m -Djava.net.preferIPv4Stack=true -XX:+UseCompressedOops
>> -XX:+HeapDumpOnOutOfMemoryError
>> -XX:HeapDumpPath=/home/sfdc/logs/hadoop/userlogs/@taskid@/
>> </value>
>> <source>mapred-site.xml</source>
>> </property>
>>
>> <property>
>> <name>yarn.app.mapreduce.am.resource.mb</name>
>> <value>1024</value>
>> <source>mapred-site.xml</source>
>> </property>
>>
>> Regards,
>> Siddhi
>>
>
>
