hadoop-yarn-issues mailing list archives

From "Sunil G (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (YARN-5545) App submit failure on queue with label when default queue partition capacity is zero
Date Wed, 26 Oct 2016 02:58:58 GMT

    [ https://issues.apache.org/jira/browse/YARN-5545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15606182#comment-15606182 ]

Sunil G edited comment on YARN-5545 at 10/26/16 2:58 AM:
---------------------------------------------------------

Extremely sorry for the comment. I posted it on the wrong Jira. Please disregard the comment below.

.....
Currently we invoke activateApplications while recovering each application.
Yes, as of now nodes are registered later in the flow. But the scheduler should not
have to consider such timing cases from the RMAppManager/RM end. That said, it's important
to separate two issues here
......


was (Author: sunilg):
Currently we invoke {{activateApplications}} while recovering each application.
Yes, as of now nodes are registered later in the flow. But the scheduler should not
have to consider such timing cases from the RMAppManager/RM end. That said, it's important
to separate two issues here:
- The recovery call flow for each app in the scheduler should not invoke {{activateApplications}}
every time.
- {{activateApplications}} itself could be improved by considering AM headroom. But that
could be done in another ticket, as this one focuses on fixing the recovery call flow.

To address issue 1, we could invoke {{activateApplications}} only once, after recovering all
apps. This removes the timing dependency on the RM end for recovery. With this change,
even if the RM recovery model changes, the scheduler would complete its recovery
flow without causing any performance issue or waiting for resourceTrackerService to register nodes.
Thanks [~leftnoteasy].
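
A rough sketch of the "activate once after recovering all apps" idea, for illustration only.
This is a toy model, not the actual scheduler code: RecoverySketch, ToyQueue, recoverEach and
recoverThenActivateOnce are hypothetical names, and the activateApplications stand-in here
skips the AM resource limit checks that the real method performs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: none of these classes are YARN's real scheduler classes.
public class RecoverySketch {

    /** Hypothetical stand-in for a leaf queue tracking pending and active apps. */
    static class ToyQueue {
        final List<String> pending = new ArrayList<>();
        final List<String> active = new ArrayList<>();
        int activateCalls = 0; // counts how often the activation pass runs

        // Stand-in for activateApplications: moves every pending app to active.
        // Assumption: AM resource limit checks are omitted in this sketch.
        void activateApplications() {
            activateCalls++;
            active.addAll(pending);
            pending.clear();
        }

        // Recovery flow that runs an activation pass after every recovered app.
        void recoverEach(List<String> apps) {
            for (String app : apps) {
                pending.add(app);
                activateApplications();
            }
        }

        // Proposed flow: recover all apps first, then run one activation pass.
        void recoverThenActivateOnce(List<String> apps) {
            pending.addAll(apps);
            activateApplications();
        }
    }

    public static void main(String[] args) {
        List<String> apps = Arrays.asList("app_1", "app_2", "app_3");

        ToyQueue perApp = new ToyQueue();
        perApp.recoverEach(apps);

        ToyQueue once = new ToyQueue();
        once.recoverThenActivateOnce(apps);

        System.out.println("per-app activation passes: " + perApp.activateCalls);
        System.out.println("single activation passes: " + once.activateCalls);
    }
}
```

In this toy model both flows end with the same set of active apps; only the number of
activation passes differs (one per recovered app versus one in total).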

Thoughts?

> App submit failure on queue with label when default queue partition capacity is zero
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-5545
>                 URL: https://issues.apache.org/jira/browse/YARN-5545
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bibin A Chundatt
>            Assignee: Bibin A Chundatt
>         Attachments: YARN-5545.0001.patch, YARN-5545.0002.patch, YARN-5545.0003.patch,
YARN-5545.004.patch, capacity-scheduler.xml
>
>
> Configure capacity scheduler 
> yarn.scheduler.capacity.root.default.capacity=0
> yarn.scheduler.capacity.root.queue1.accessible-node-labels.labelx.capacity=50
> yarn.scheduler.capacity.root.default.accessible-node-labels.labelx.capacity=50
> Submit application as below
> ./yarn jar ../share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-3.0.0-alpha2-SNAPSHOT-tests.jar
sleep -Dmapreduce.job.node-label-expression=labelx -Dmapreduce.job.queuename=default -m 1
-r 1 -mt 10000000 -rt 1
> {noformat}
> 2016-08-21 18:21:31,375 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/root/.staging/job_1471670113386_0001
> java.io.IOException: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit
application_1471670113386_0001 to YARN : org.apache.hadoop.security.AccessControlException:
Queue root.default already has 0 applications, cannot accept submission of application: application_1471670113386_0001
> 	at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:316)
> 	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:255)
> 	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1344)
> 	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1341)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1790)
> 	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1341)
> 	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1362)
> 	at org.apache.hadoop.mapreduce.SleepJob.run(SleepJob.java:273)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> 	at org.apache.hadoop.mapreduce.SleepJob.main(SleepJob.java:194)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:497)
> 	at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
> 	at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
> 	at org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:136)
> 	at org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:144)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:497)
> 	at org.apache.hadoop.util.RunJar.run(RunJar.java:239)
> 	at org.apache.hadoop.util.RunJar.main(RunJar.java:153)
> Caused by: org.apache.hadoop.yarn.exceptions.YarnException: Failed to submit application_1471670113386_0001
to YARN : org.apache.hadoop.security.AccessControlException: Queue root.default already has
0 applications, cannot accept submission of application: application_1471670113386_0001
> 	at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(YarnClientImpl.java:286)
> 	at org.apache.hadoop.mapred.ResourceMgrDelegate.submitApplication(ResourceMgrDelegate.java:296)
> 	at org.apache.hadoop.mapred.YARNRunner.submitJob(YARNRunner.java:301)
> 	... 25 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

