hadoop-yarn-issues mailing list archives

From "Zhijie Shen (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-1070) ContainerImpl State Machine: Invalid event: CONTAINER_KILLED_ON_REQUEST at CONTAINER_CLEANEDUP_AFTER_KILL
Date Thu, 05 Sep 2013 01:16:54 GMT

     [ https://issues.apache.org/jira/browse/YARN-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhijie Shen updated YARN-1070:

    Attachment: YARN-1070.3.patch

Thanks Vinod for your review. I've updated the patch accordingly. The important change in
this patch is that I removed the logic for canceling ContainerLaunch.call(). Instead, call()
now checks the container state first, returns immediately if the container is not in LOCALIZED,
and sends CONTAINER_KILLED_ON_REQUEST if necessary.
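
As a rough illustration of the check described above (a simplified sketch, not the code
in YARN-1070.3.patch), the early return could look like the following. SketchState,
LaunchSketch, NOT_LAUNCHED and the sentEvents list are hypothetical stand-ins for the
NodeManager classes, and reading "if necessary" as "when the observed state is KILLING"
is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical, simplified stand-in for the container states named in this thread.
enum SketchState { LOCALIZED, RUNNING, EXITED_WITH_FAILURE, KILLING }

// Hypothetical stand-in for ContainerLaunch; it only models the state check.
class LaunchSketch implements Callable<Integer> {
    static final int NOT_LAUNCHED = -1; // made-up return code for "did not launch"

    private final AtomicReference<SketchState> state;
    // Events are recorded as strings here instead of going through a dispatcher.
    final List<String> sentEvents = new ArrayList<>();

    LaunchSketch(AtomicReference<SketchState> state) {
        this.state = state;
    }

    @Override
    public Integer call() {
        // Read the state once, at the start of call().
        SketchState observed = state.get();
        if (observed != SketchState.LOCALIZED) {
            // Assumption: the kill notification is only needed when the
            // container was moved to KILLING before the launch began.
            if (observed == SketchState.KILLING) {
                sentEvents.add("CONTAINER_KILLED_ON_REQUEST");
            }
            return NOT_LAUNCHED;
        }
        // Still LOCALIZED: proceed. A later change to KILLING is not re-checked.
        sentEvents.add("CONTAINER_LAUNCHED");
        return 0;
    }

    public static void main(String[] args) {
        LaunchSketch killed =
            new LaunchSketch(new AtomicReference<>(SketchState.KILLING));
        System.out.println(killed.call() + " " + killed.sentEvents);
    }
}
```

The state is read only once on purpose: the sketch mirrors the idea that call() decides
at its start whether to proceed and does not react to later transitions.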

The rationale for checking the container state is that the thread of ContainerLaunch.call()
is scheduled, and is expected to execute, after the container enters LOCALIZED. As this thread can
run in parallel with the thread of ContainerImpl, the container is free to move on to some other
state, which can be RUNNING, EXITED_WITH_FAILURE or KILLING. The first two should be
triggered by the event sent from ContainerLaunch.call(), while KILLING is caused by a kill
event.

Therefore, when ContainerLaunch.call() starts, we check the container state. If it is
KILLING, ContainerLaunch.call() can stop immediately, which is equivalent to the cancel operation
that was removed from ContainersLauncher. In fact, it should be even better, because Future.cancel
will not terminate call() immediately.
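
The point about Future.cancel can be seen in a small standalone sketch using only
java.util.concurrent. CancelSketch and cancelWhileRunning are hypothetical names, and the
task body is an artificial loop that ignores interruption; it is not ContainerLaunch.call().
cancel(true) marks the Future as cancelled and interrupts the worker thread, but the body
keeps running until it decides to return.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class; not part of the NodeManager.
class CancelSketch {

    /**
     * Cancels a task that is already running and reports
     * { cancel() return value, isCancelled(), body still running after cancel() }.
     */
    static boolean[] cancelWhileRunning() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch finished = new CountDownLatch(1);
        AtomicBoolean release = new AtomicBoolean(false);
        try {
            Future<Integer> future = pool.submit(() -> {
                started.countDown();
                // The body never checks the interrupt flag, so an interrupt
                // delivered by cancel(true) does not stop this loop.
                while (!release.get()) {
                    Thread.yield();
                }
                finished.countDown();
                return 0;
            });

            started.await();                        // the body is now running
            boolean accepted = future.cancel(true); // interrupts the worker thread
            boolean cancelled = future.isCancelled();
            // The body cannot have finished: it is still waiting for 'release'.
            boolean stillRunning = finished.getCount() == 1;

            release.set(true);                      // let the body return
            finished.await();
            return new boolean[] { accepted, cancelled, stillRunning };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        boolean[] r = cancelWhileRunning();
        System.out.println("cancel() accepted: " + r[0]
            + ", isCancelled: " + r[1]
            + ", body still running: " + r[2]);
    }
}
```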

On the other hand, if at this point the container state is still LOCALIZED, call() will move
on. Then, if the container state changes to KILLING midway, we just ignore it and let call()
finish as usual. It does no harm because when the container reaches KILLING, CLEANUP_CONTAINER
is either scheduled or already started.
> ---------------------------------------------------------------------------------------------------------
>                 Key: YARN-1070
>                 URL: https://issues.apache.org/jira/browse/YARN-1070
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>            Reporter: Hitesh Shah
>            Assignee: Zhijie Shen
>         Attachments: YARN-1070.1.patch, YARN-1070.2.patch, YARN-1070.3.patch

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
