hadoop-yarn-issues mailing list archives

From "Anubhav Dhoot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2244) FairScheduler missing handling of containers for unknown application attempts
Date Wed, 16 Jul 2014 22:45:06 GMT

    [ https://issues.apache.org/jira/browse/YARN-2244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064275#comment-14064275 ]

Anubhav Dhoot commented on YARN-2244:

Fixed the other issues, except for the following two:

>Can we use AbstractYarnScheduler#killOrphanContainerOnNode() instead
Right now the {{containerLaunchedOnNode}} code in FairScheduler does the same thing as the
Capacity and Fifo scheduler versions. If we want to unify the code across the schedulers, I would
prefer to leave it like this for now and then remove all three implementations in favor of a
single {{containerLaunchedOnNode}} in AbstractYarnScheduler.
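
To illustrate the unification described above, here is a minimal, self-contained sketch. It does not use the real YARN classes and is not the code from the patch: {{BaseScheduler}}, {{Attempt}}, {{cleanupRequests}} and the string IDs are all hypothetical stand-ins. The idea is only that one shared {{containerLaunchedOnNode}} looks up the attempt and, when the attempt is unknown, requests cleanup of the container instead of just logging it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: none of these types are real YARN classes.
final class UnknownAttemptSketch {

    /** Stand-in for the scheduler's per-attempt bookkeeping. */
    static final class Attempt {
        final List<String> launched = new ArrayList<>();

        void containerLaunchedOnNode(String containerId, String nodeId) {
            launched.add(containerId + "@" + nodeId);
        }
    }

    /** Stand-in for a shared base class holding the single implementation. */
    abstract static class BaseScheduler {
        private final Map<String, Attempt> attempts = new HashMap<>();
        // Records the cleanup requests a real scheduler would send to the node.
        final List<String> cleanupRequests = new ArrayList<>();

        void addAttempt(String attemptId) {
            attempts.put(attemptId, new Attempt());
        }

        Attempt getAttempt(String attemptId) {
            return attempts.get(attemptId);
        }

        void containerLaunchedOnNode(String attemptId, String containerId, String nodeId) {
            Attempt attempt = attempts.get(attemptId);
            if (attempt == null) {
                // Unknown attempt: ask for the container to be cleaned up
                // rather than only logging and letting it keep running.
                cleanupRequests.add(containerId + "@" + nodeId);
                return;
            }
            attempt.containerLaunchedOnNode(containerId, nodeId);
        }
    }

    // The concrete schedulers inherit the shared method unchanged.
    static final class FairLikeScheduler extends BaseScheduler {}
    static final class FifoLikeScheduler extends BaseScheduler {}

    public static void main(String[] args) {
        BaseScheduler scheduler = new FairLikeScheduler();
        scheduler.addAttempt("attempt_1");
        scheduler.containerLaunchedOnNode("attempt_1", "container_1", "node_1");
        scheduler.containerLaunchedOnNode("attempt_2", "container_2", "node_1");
        System.out.println("cleanup requested for: " + scheduler.cleanupRequests);
    }
}
```

With a single method in the base class, a fix for the unknown-attempt case would only need to be made once rather than in each scheduler.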

>Parametrize the method to also take number of container cleanups to wait for and use it
The only other user in this class has some additional logic for application cleanup. Unifying
the two might make the code more complicated than it's worth.


> FairScheduler missing handling of containers for unknown application attempts 
> ------------------------------------------------------------------------------
>                 Key: YARN-2244
>                 URL: https://issues.apache.org/jira/browse/YARN-2244
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler
>            Reporter: Anubhav Dhoot
>            Assignee: Anubhav Dhoot
>            Priority: Critical
>         Attachments: YARN-2224.patch, YARN-2244.001.patch, YARN-2244.002.patch
> The changes from patch MAPREDUCE-3596 are missing in FairScheduler. Among other fixes that
> were common across schedulers, that patch added some scheduler-specific fixes to handle
> containers for unknown application attempts. Without these, FairScheduler simply logs that an
> unknown container was found and lets it continue to run.

This message was sent by Atlassian JIRA
