hadoop-yarn-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-8990) Fix fair scheduler race condition in app submit and queue cleanup
Date Fri, 09 Nov 2018 00:27:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-8990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16680656#comment-16680656 ]

Hudson commented on YARN-8990:

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #15393 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/15393/])
YARN-8990. Fix fair scheduler race condition in app submit and queue (haibochen: rev 524a7523c427b55273133078898ae3535897bada)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/QueueManager.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FairScheduler.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fair/FSLeafQueue.java

> Fix fair scheduler race condition in app submit and queue cleanup
> -----------------------------------------------------------------
>                 Key: YARN-8990
>                 URL: https://issues.apache.org/jira/browse/YARN-8990
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>    Affects Versions: 3.2.0
>            Reporter: Wilfred Spiegelenburg
>            Assignee: Wilfred Spiegelenburg
>            Priority: Blocker
>             Fix For: 3.3.0
>         Attachments: YARN-8990.001.patch, YARN-8990.002.patch
> With the introduction of dynamic queue deletion in YARN-8191, a race condition was
introduced that can cause a queue to be removed while an application submit is in progress.
> The issue occurs in {{FairScheduler.addApplication()}} when an application is submitted
to a dynamic queue that is empty or does not exist yet. If the {{AllocationFileLoaderService}}
kicks off an update while the application submit is being processed, the queue
cleanup will run first. The application submit first creates the queue and gets a reference
back to the queue.
> Other checks are performed, and as the last action before getting ready to generate an
AppAttempt the queue is updated to show the submitted application ID.
> The time between the queue creation and the queue update to show the submit is long enough
for the queue to be removed. The application, however, is lost and will never get any resources.
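
The window described above is a check-then-act race. The following is a minimal Java sketch of that pattern, not the code from the YARN-8990 patch. All names in it (QueueRaceSketch, LeafQueue, QueueRegistry, submitRacy, submitAtomic, removeEmptyQueues) are hypothetical, and the cleanup is injected through a callback so the interleaving is deterministic instead of depending on thread timing. It assumes that one way to close the window is to create the queue and record the application ID under the same lock that the cleanup takes.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch; this is not the FairScheduler or QueueManager code. */
public class QueueRaceSketch {

  /** A dynamic leaf queue that records the application IDs assigned to it. */
  static final class LeafQueue {
    final String name;
    final Set<String> apps = new HashSet<>();

    LeafQueue(String name) {
      this.name = name;
    }

    boolean isEmpty() {
      return apps.isEmpty();
    }
  }

  /** Holds the queues; every access to the map goes through this object's lock. */
  static final class QueueRegistry {
    private final Map<String, LeafQueue> queues = new HashMap<>();

    synchronized LeafQueue getOrCreate(String name) {
      return queues.computeIfAbsent(name, LeafQueue::new);
    }

    /** Stands in for the periodic cleanup: drops every queue without apps. */
    synchronized void removeEmptyQueues() {
      queues.values().removeIf(LeafQueue::isEmpty);
    }

    synchronized boolean contains(String name) {
      return queues.containsKey(name);
    }

    /**
     * Racy submit: queue creation and app registration are separate steps.
     * The callback runs in the gap between them, where the lock is not held.
     * Returns true if the queue holding the app is still registered.
     */
    boolean submitRacy(String queueName, String appId, Runnable inTheGap) {
      LeafQueue q = getOrCreate(queueName);
      inTheGap.run(); // the "other checks" window from the description
      synchronized (this) {
        q.apps.add(appId); // may update a queue that was already removed
        return queues.get(queueName) == q;
      }
    }

    /** Atomic submit: creation and registration happen under one lock. */
    synchronized boolean submitAtomic(String queueName, String appId) {
      LeafQueue q = queues.computeIfAbsent(queueName, LeafQueue::new);
      q.apps.add(appId);
      return queues.get(queueName) == q;
    }
  }

  public static void main(String[] args) {
    // Cleanup fires between queue creation and app registration.
    QueueRegistry racy = new QueueRegistry();
    boolean reachable =
        racy.submitRacy("root.dyn", "app_1", racy::removeEmptyQueues);
    System.out.println("racy: queue still registered = " + reachable);

    // Cleanup can only run before or after the whole submit.
    QueueRegistry safe = new QueueRegistry();
    boolean registered = safe.submitAtomic("root.dyn", "app_1");
    safe.removeEmptyQueues();
    System.out.println("atomic: queue still registered = "
        + (registered && safe.contains("root.dyn")));
  }
}
```

In the racy path the application ID ends up on a queue object that the registry no longer knows about, which matches the "lost application" symptom in the description. In the atomic path the queue is never observed as empty by the cleanup once the submit has started.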

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org
