hadoop-yarn-issues mailing list archives

From "Varun Saxena (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-6157) Inconsistencies in verifying Max Applications
Date Thu, 09 Feb 2017 08:36:41 GMT

    [ https://issues.apache.org/jira/browse/YARN-6157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15859205#comment-15859205 ]

Varun Saxena commented on YARN-6157:

[~Naganarasimha], the check for system max apps is made in CapacityScheduler#addApplication,
whereas the flow for recovered apps goes via CapacityScheduler#addApplicationOnRecovery. So I
think recovered apps will be excluded from that check.
The other point is right. We probably need a counter in LeafQueue which is incremented when
we call LeafQueue#submitApplication. We have a similar counter in ParentQueue too.
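
A minimal sketch of the counter idea, not the actual LeafQueue code. The class SketchLeafQueue, its fields, and the isRecovery parameter are hypothetical names used only for illustration, and letting recovered apps bypass the limit is an assumption based on the issue description. The point it shows is that the limit check and the increment happen in one atomic step at submission time, so there is no window between them.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a leaf queue; NOT the real Hadoop LeafQueue. */
class SketchLeafQueue {
  private final int maxApplications;

  // Counts every application accepted at submission time,
  // whether or not an application attempt has been created yet.
  private final AtomicInteger numApplications = new AtomicInteger();

  SketchLeafQueue(int maxApplications) {
    this.maxApplications = maxApplications;
  }

  /**
   * Returns true if the application is accepted.
   * Assumption: recovered apps are counted but never rejected.
   */
  boolean submitApplication(String appId, boolean isRecovery) {
    if (isRecovery) {
      numApplications.incrementAndGet();
      return true;
    }
    // Check and increment atomically so concurrent submissions
    // cannot both slip under the limit.
    while (true) {
      int current = numApplications.get();
      if (current >= maxApplications) {
        return false;
      }
      if (numApplications.compareAndSet(current, current + 1)) {
        return true;
      }
    }
  }

  /** Called when an application leaves the queue. */
  void finishApplication(String appId) {
    numApplications.decrementAndGet();
  }

  int getNumApplications() {
    return numApplications.get();
  }
}
```

With a limit of 2, a third new submission is rejected, while a recovered app is still accepted and counted.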

> Inconsistencies in verifying Max Applications
> ---------------------------------------------
>                 Key: YARN-6157
>                 URL: https://issues.apache.org/jira/browse/YARN-6157
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Naganarasimha G R
>            Assignee: Naganarasimha G R
> Inconsistencies in verifying Max Applications when the max apps limit is reduced and either
> HA failover or a work-preserving restart is done.
> # Currently the max applications check across the cluster should not be done for recovered apps.
> It seems we are currently doing it.
> # The max applications check for a queue is done in CapacityScheduler.addApplication, which considers
> the sum of pending and running applications, but we add to pending applications only in {{CapacityScheduler.addApplicationAttempt
> -> LeafQueue.addApplicationAttempt}}, so between these 2 steps we can activate more apps
> than the queue's limit allows.
> # During recovery of an RMApp, if no applicationAttempts are found, we recover it
> with recovery set to false in {{RMAppImpl.RMAppRecoveredTransition}}. This can lead to failure
> of apps which were accepted earlier but whose attempt was not yet created, if HA failover happens after the MAX
> app configuration (for cluster/queue) is modified.
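
The gap described in the second point can be illustrated with a simplified model. This is only a sketch under stated assumptions: CheckThenActSketch and its fields are hypothetical and do not mirror the real CapacityScheduler or LeafQueue code. It models a limit check at application add time that reads pending + running, while the pending count only changes later, when the attempt is added.

```java
/** Hypothetical model of the check-then-act gap; NOT real scheduler code. */
class CheckThenActSketch {
  final int maxApplications;
  int pending = 0;
  int running = 0;

  CheckThenActSketch(int maxApplications) {
    this.maxApplications = maxApplications;
  }

  /** Limit check at application add time: reads counters, changes nothing. */
  boolean addApplication() {
    return pending + running < maxApplications;
  }

  /** The pending counter is only updated later, when the attempt is added. */
  void addApplicationAttempt() {
    pending++;
  }
}
```

If two applications are added before either attempt is created, both pass the check with a limit of 1, and the queue then ends up holding 2 pending applications.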

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org
