hadoop-yarn-issues mailing list archives

From "Jian He (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1490) RM should optionally not kill all containers when an ApplicationMaster exits
Date Thu, 09 Jan 2014 03:32:57 GMT

    [ https://issues.apache.org/jira/browse/YARN-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13866264#comment-13866264 ]

Jian He commented on YARN-1490:
-------------------------------

Uploaded a new patch.

bq. Change it to also transfer running containers?
This is already there.
bq. Extend the test to explicitly validate that allocated/acquired/reserved are killed?
The test for acquired containers being killed is already there. Added a test for allocated containers being killed.
bq. That TODO will be fixed in this ticket itself? Or a separate one?
Fixed.
bq. We need to see if the remaining state needs to be transferred too. There is some commented code.
I commented those out intentionally as a reminder in case I forget; I think they are not needed for now.
bq. application.setShouldRecover(false);
Earlier I thought of resetting the flag, but given that we always pass the correct flag via AppAttemptRemovedSchedulerEvent, I removed it.
bq. The following code-block can be in the constructor?  if (!appAttempt.submissionContext.getCleanContainersWhenFail()) {
bq. Similarly the check for unmanaged AM can also be in the constructor?
This flag should apply only when the attempt has failed, not when it is killed or finished, right? Added test cases for this.
bq. Can you file a ticket for the broken stuff in unmanaged-AM?
Done: YARN-1577.
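
For readers following the thread, the rule discussed above (keep containers only when the attempt failed and the application did not ask for cleanup) could be sketched roughly as below. This is only an illustration: the class, enum, and method names are hypothetical and are not taken from the patch, and treating an unmanaged AM as never keeping containers is an assumption, not something this comment states.

```java
// Hypothetical sketch; none of these names come from the YARN-1490 patch.
final class KeepContainersSketch {

    /** How an application attempt ended (illustrative, not the RM's own state enum). */
    enum AttemptOutcome { FAILED, KILLED, FINISHED }

    /**
     * Decide whether running containers should survive the end of an attempt
     * so that a later attempt could reconnect to them.
     *
     * @param outcome                 how the attempt ended
     * @param cleanContainersWhenFail value the application set in its submission context
     * @param unmanagedAM             whether the AM is unmanaged (assumed to never keep containers)
     */
    static boolean shouldKeepContainers(AttemptOutcome outcome,
                                        boolean cleanContainersWhenFail,
                                        boolean unmanagedAM) {
        // The flag only matters for failed attempts, not killed or finished ones.
        if (outcome != AttemptOutcome.FAILED) {
            return false;
        }
        // Assumption: an unmanaged AM does not get work-preserving restart.
        if (unmanagedAM) {
            return false;
        }
        return !cleanContainersWhenFail;
    }

    public static void main(String[] args) {
        // Show the decision for each outcome with cleanup disabled and a managed AM.
        for (AttemptOutcome outcome : AttemptOutcome.values()) {
            System.out.println(outcome + " -> keep containers: "
                    + shouldKeepContainers(outcome, false, false));
        }
    }
}
```

Computing the decision in one place and handing the result to the scheduler matches the remark above that the correct flag is always passed via AppAttemptRemovedSchedulerEvent, so no separate reset is needed.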

> RM should optionally not kill all containers when an ApplicationMaster exits
> ----------------------------------------------------------------------------
>
>                 Key: YARN-1490
>                 URL: https://issues.apache.org/jira/browse/YARN-1490
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Jian He
>         Attachments: YARN-1490.1.patch, YARN-1490.2.patch, YARN-1490.3.patch, YARN-1490.4.patch
>
>
> This is needed to enable work-preserving AM restart. Some apps can choose to reconnect with old running containers, some may not want to. This should be an option.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
