hadoop-yarn-issues mailing list archives

From "Bikas Saha (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2091) Add ContainerExitStatus.KILL_EXCEEDED_MEMORY and pass it to app masters
Date Mon, 02 Jun 2014 17:32:03 GMT

    [ https://issues.apache.org/jira/browse/YARN-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14015611#comment-14015611 ]

Bikas Saha commented on YARN-2091:
----------------------------------

Can this miss the case where the exitCode has not been set (e.g. when the container crashes on
its own)? Should we check whether the exitCode has already been set (e.g. via a kill event) and,
if it is not set, set it from exitEvent? How can we check that the exitCode has not been set?
Maybe use an uninitialized/invalid default value.
{code}@@ -829,7 +829,6 @@ public void transition(ContainerImpl container, ContainerEvent event) {
     @Override
     public void transition(ContainerImpl container, ContainerEvent event) {
       ContainerExitEvent exitEvent = (ContainerExitEvent) event;
-      container.exitCode = exitEvent.getExitCode();{code}
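
The check suggested above could look roughly like the following. This is only a minimal sketch to illustrate the idea, not the code in the patch: the class ContainerSketch, the constant UNSET_EXIT_CODE, and the methods onKill/onExit are hypothetical names, and the sentinel value is an assumption.

```java
// Sketch only: ContainerSketch and its members are hypothetical names,
// not the actual ContainerImpl fields or transitions from the patch.
class ContainerSketch {
    // Assumed sentinel meaning "no exit code has been recorded yet".
    static final int UNSET_EXIT_CODE = -1000;

    private int exitCode = UNSET_EXIT_CODE;

    // Called when the container is killed deliberately,
    // e.g. for exceeding its memory limit.
    void onKill(int killExitCode) {
        exitCode = killExitCode;
    }

    // Called when the process exit is observed. The exit event's code is
    // used only if no earlier event (such as a kill) already set one.
    void onExit(int processExitCode) {
        if (exitCode == UNSET_EXIT_CODE) {
            exitCode = processExitCode;
        }
    }

    int getExitCode() {
        return exitCode;
    }
}
```

With this shape, a container that crashes on its own still reports the code from the exit event, while a container killed earlier keeps the more specific kill reason.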

The new exit status codes need better comments/docs. E.g. what is the difference between the
2 new appmaster-related exit statuses? Is kill_by_resourcemanager a generic value that can be
replaced later on by a more specific reason like preempted?

> Add ContainerExitStatus.KILL_EXCEEDED_MEMORY and pass it to app masters
> -----------------------------------------------------------------------
>
>                 Key: YARN-2091
>                 URL: https://issues.apache.org/jira/browse/YARN-2091
>             Project: Hadoop YARN
>          Issue Type: Task
>            Reporter: Bikas Saha
>            Assignee: Tsuyoshi OZAWA
>         Attachments: YARN-2091.1.patch, YARN-2091.2.patch, YARN-2091.3.patch, YARN-2091.4.patch,
> YARN-2091.5.patch, YARN-2091.6.patch
>
>
> Currently, the AM cannot programmatically determine if the task was killed due to using
> excessive memory. The NM kills it without passing this information in the container status
> back to the RM. So the AM cannot take any action here. The jira tracks adding this exit status
> and passing it from the NM to the RM and then the AM. In general, there may be other such
> actions taken by YARN that are currently opaque to the AM.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
