hadoop-yarn-issues mailing list archives

From "Bikas Saha (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1490) RM should optionally not kill all containers when an ApplicationMaster exits
Date Sat, 04 Jan 2014 17:08:52 GMT

    [ https://issues.apache.org/jira/browse/YARN-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862350#comment-13862350 ]

Bikas Saha commented on YARN-1490:
----------------------------------

bq. The failed attempt is changed to still receive container events and record the finished
containers and new attempt is created with the reference of the objects of the previous attempt.

This sounds messy. IMO having two app attempt objects active at the same time is going to be
a source of bugs and race conditions. We are better off changing the dispatcher-related logic
to look up the appId of the container, get the current attempt of that appId, and then route
the event to the current attempt.
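
The routing suggested in this comment (look up the appId of the container, fetch the current attempt of that app, deliver the event there) could look roughly like the sketch below. It is only an illustration under stated assumptions: ContainerEvent, AppAttempt, App and Router are hypothetical stand-ins, not the actual ResourceManager classes, and the real dispatcher is asynchronous, which this sketch ignores.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types are the real YARN RM classes.
class AttemptRoutingSketch {

    /** Stand-in for a container event; it only knows its app and container ids. */
    static final class ContainerEvent {
        final String appId;
        final String containerId;

        ContainerEvent(String appId, String containerId) {
            this.appId = appId;
            this.containerId = containerId;
        }
    }

    /** Stand-in for one application attempt; records the containers it was told about. */
    static final class AppAttempt {
        final int attemptNumber;
        final List<String> finishedContainers = new ArrayList<>();

        AppAttempt(int attemptNumber) {
            this.attemptNumber = attemptNumber;
        }

        void handle(ContainerEvent event) {
            finishedContainers.add(event.containerId);
        }
    }

    /** Stand-in for an application; exactly one attempt is current at any time. */
    static final class App {
        private AppAttempt current = new AppAttempt(1);

        AppAttempt getCurrentAttempt() {
            return current;
        }

        /** Replaces the current attempt, e.g. after the AM exits. */
        AppAttempt startNewAttempt() {
            current = new AppAttempt(current.attemptNumber + 1);
            return current;
        }
    }

    /** Routes each event by appId to whichever attempt is current when it arrives. */
    static final class Router {
        private final Map<String, App> apps = new HashMap<>();

        void register(String appId, App app) {
            apps.put(appId, app);
        }

        /** Returns the attempt that handled the event, or null if the app is unknown. */
        AppAttempt route(ContainerEvent event) {
            App app = apps.get(event.appId);
            if (app == null) {
                return null;
            }
            AppAttempt attempt = app.getCurrentAttempt();
            attempt.handle(event);
            return attempt;
        }
    }

    public static void main(String[] args) {
        Router router = new Router();
        App app = new App();
        router.register("app_1", app);

        AppAttempt before = router.route(new ContainerEvent("app_1", "container_1"));
        app.startNewAttempt(); // the AM exited; a new attempt takes over
        AppAttempt after = router.route(new ContainerEvent("app_1", "container_2"));

        System.out.println("first event handled by attempt " + before.attemptNumber);
        System.out.println("second event handled by attempt " + after.attemptNumber);
    }
}
```

Because the router never holds a reference to an attempt, an old attempt stops receiving events as soon as the app switches attempts, so only one attempt object is ever active for event delivery.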

> RM should optionally not kill all containers when an ApplicationMaster exits
> ----------------------------------------------------------------------------
>
>                 Key: YARN-1490
>                 URL: https://issues.apache.org/jira/browse/YARN-1490
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Jian He
>         Attachments: YARN-1490.1.patch, YARN-1490.2.patch, YARN-1490.3.patch
>
>
> This is needed to enable work-preserving AM restart. Some apps can choose to reconnect
> with old running containers, some may not want to. This should be an option.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
