hadoop-yarn-issues mailing list archives

From "Rohith Sharma K S (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4479) Retrospect app-priority in pendingOrderingPolicy during recovering applications
Date Mon, 21 Dec 2015 10:20:46 GMT

    [ https://issues.apache.org/jira/browse/YARN-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15066282#comment-15066282 ]

Rohith Sharma K S commented on YARN-4479:

The scenario that causes the issue is:
# Submit app-1 and app-2 with priority 5. Both applications are activated and RUNNING.
# Submit app-3 with priority 6. This application stays in the pending state because of the AM limit.
# Restart the RM. app-1 is activated (by design, the AM limit is not considered for the first
application), while app-2 and app-3 are placed in pendingOrderingPolicy.
# The AMs for app-1 and app-2 re-register, so their state is now RUNNING. But app-2 and app-3 are
still in the pending applications.
# A NodeManager re-registers with the RM. As a result, one application is supposed to get activated.
Here, app-3 always gets activated because its priority is higher, but app-2 should be activated
first since it was running before the RM restart.
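
The ordering problem in the last step can be sketched in plain Java. This is only an illustration under stated assumptions: `App`, `PRIORITY_ONLY`, `RECOVERED_THEN_PRIORITY` and `firstToActivate` are hypothetical names, not YARN classes, and the second comparator is one possible way to express "applications that were running before the restart go first", not necessarily what the attached patch does.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class PendingOrderSketch {
    // Hypothetical stand-in for a pending application; not a YARN class.
    static final class App {
        final String name;
        final int priority;                 // larger value = higher priority
        final boolean runningBeforeRestart; // true if the app was RUNNING before the RM restart

        App(String name, int priority, boolean runningBeforeRestart) {
            this.name = name;
            this.priority = priority;
            this.runningBeforeRestart = runningBeforeRestart;
        }
    }

    // Orders pending apps by priority only, highest first (the behavior described above).
    static final Comparator<App> PRIORITY_ONLY =
        (a, b) -> Integer.compare(b.priority, a.priority);

    // Assumed alternative: apps that ran before the restart come first.
    static final Comparator<App> RUNNING_FIRST =
        (a, b) -> Boolean.compare(b.runningBeforeRestart, a.runningBeforeRestart);

    // Ties between apps with the same recovery status fall back to priority.
    static final Comparator<App> RECOVERED_THEN_PRIORITY =
        RUNNING_FIRST.thenComparing(PRIORITY_ONLY);

    // Returns the name of the app that would be activated next under the given ordering.
    static String firstToActivate(List<App> pending, Comparator<App> order) {
        List<App> sorted = new ArrayList<>(pending);
        sorted.sort(order);
        return sorted.get(0).name;
    }

    // Pending applications after the RM restart in the scenario: app-2 and app-3.
    static List<App> pendingAfterRestart() {
        List<App> pending = new ArrayList<>();
        pending.add(new App("app-2", 5, true));   // was RUNNING before the restart
        pending.add(new App("app-3", 6, false));  // was still pending before the restart
        return pending;
    }

    public static void main(String[] args) {
        List<App> pending = pendingAfterRestart();
        System.out.println(firstToActivate(pending, PRIORITY_ONLY));           // app-3
        System.out.println(firstToActivate(pending, RECOVERED_THEN_PRIORITY)); // app-2
    }
}
```

With priority as the only key, app-3 is picked; once the pre-restart running state is taken into account ahead of priority, app-2 is picked, which matches the expected behavior in the scenario.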

> Retrospect app-priority in pendingOrderingPolicy during recovering applications
> -------------------------------------------------------------------------------
>                 Key: YARN-4479
>                 URL: https://issues.apache.org/jira/browse/YARN-4479
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api, resourcemanager
>            Reporter: Rohith Sharma K S
>            Assignee: Rohith Sharma K S
>         Attachments: 0001-YARN-4479.patch
> Currently, the same ordering policy is used for pending applications and active applications.
> When priority is configured for applications, the high-priority application gets activated
> first during recovery. It is possible that a low-priority job was submitted earlier and was
> already in the running state.
>
> This causes starvation of the low-priority job after recovery.

This message was sent by Atlassian JIRA