hadoop-yarn-issues mailing list archives

From "Haibo Chen (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-8807) FairScheduler crashes RM with oversubscription turned on if an application is killed.
Date Thu, 20 Sep 2018 21:36:00 GMT

     [ https://issues.apache.org/jira/browse/YARN-8807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haibo Chen updated YARN-8807:
-----------------------------
    Description: 
When an application that has opportunistic containers allocated is killed, its containers
are not released immediately.

The fair scheduler therefore keeps trying to promote these orphaned containers, which
results in a NullPointerException (NPE).
{code:java}
java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptToAssignReservedResourcesOrPromoteOpportunisticContainers(FairScheduler.java:1158)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.attemptScheduling(FairScheduler.java:1129)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:1001)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:1275)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testKillingApplicationWithOpportunisticContainersAssigned(TestFairScheduler.java:4019){code}
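
The report does not include the scheduler code or the eventual fix, so the following
is only a minimal sketch of one plausible way this kind of failure can arise, and of a
defensive null check that avoids it. It assumes the promotion loop looks up the owning
application of each container and that a killed application is no longer found. Every
name here (PromotionSketch, Container, App, liveApps, promoteOpportunistic) is
hypothetical and is not part of the real FairScheduler API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model; none of these types are the real YARN classes.
public class PromotionSketch {
    /** An opportunistic container that remembers which application owns it. */
    static final class Container {
        final String id;
        final String appId;
        Container(String id, String appId) { this.id = id; this.appId = appId; }
    }

    /** Stand-in for the scheduler's view of a running application. */
    static final class App {
        final List<String> promoted = new ArrayList<>();
        void promote(Container c) { promoted.add(c.id); }
    }

    // Live applications keyed by application id. Killing an app removes its entry,
    // but (as assumed here) its containers may still be listed on a node.
    final Map<String, App> liveApps = new HashMap<>();

    /**
     * Tries to promote each opportunistic container still present on a node.
     * Returns the ids of containers skipped because their application is gone.
     */
    List<String> promoteOpportunistic(List<Container> onNode) {
        List<String> orphaned = new ArrayList<>();
        for (Container c : onNode) {
            App app = liveApps.get(c.appId);
            if (app == null) {
                // Without this guard, app.promote(c) would throw NullPointerException.
                orphaned.add(c.id);
                continue;
            }
            app.promote(c);
        }
        return orphaned;
    }

    public static void main(String[] args) {
        PromotionSketch s = new PromotionSketch();
        s.liveApps.put("app_1", new App());
        List<Container> onNode = new ArrayList<>();
        onNode.add(new Container("c1", "app_1"));
        onNode.add(new Container("c2", "app_2")); // app_2 was killed earlier
        System.out.println("skipped: " + s.promoteOpportunistic(onNode));
    }
}
```

In this sketch the orphaned container c2 is skipped rather than dereferenced, while c1
is promoted normally. Whether the actual fix skips such containers, releases them
eagerly when the application is killed, or does something else is not stated in this
message.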

> FairScheduler crashes RM with oversubscription turned on if an application is killed.
> -------------------------------------------------------------------------------------
>
>                 Key: YARN-8807
>                 URL: https://issues.apache.org/jira/browse/YARN-8807
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: fairscheduler, resourcemanager
>    Affects Versions: YARN-1011
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>            Priority: Major



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

