hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2825) Container leak on NM
Date Fri, 07 Nov 2014 17:21:34 GMT

    [ https://issues.apache.org/jira/browse/YARN-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202287#comment-14202287 ]

Jason Lowe commented on YARN-2825:
----------------------------------

Thanks for the patch, Jian!

Is there a reason we need to cast to ContainerImpl?  I think calling context.getContainers().get(containerId).getContainerState()
== org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerState.DONE
would be equivalent and cleaner since we wouldn't assume the container implementation.  Or
we could get the container status and check for COMPLETE which is what other parts of the
code are doing.
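
To illustrate the first suggestion, here is a minimal, self-contained sketch of checking
the state through the interface rather than casting. It is not the actual patch: the
ContainerState enum and Container interface below are simplified stand-ins for the NM
classes, the plain String ids stand in for ContainerId, and the class name
ContainerStateCheckSketch and the method removeCompleted are hypothetical names used only
for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ContainerStateCheckSketch {

    // Stand-in for the NM-side ContainerState enum; only the two values
    // needed for the illustration are listed (assumption, not the full enum).
    enum ContainerState { RUNNING, DONE }

    // Stand-in for the NM Container interface; only the accessor used here.
    interface Container {
        ContainerState getContainerState();
    }

    // Hypothetical helper: removes only those acknowledged containers that
    // are actually DONE, and returns the ids that were removed.
    static List<String> removeCompleted(Map<String, Container> containers,
                                        List<String> ackedIds) {
        List<String> removed = new ArrayList<>();
        for (String id : ackedIds) {
            Container c = containers.get(id);
            // Check the state via the interface; no cast to an implementation
            // class is needed. Unknown ids and running containers are skipped.
            if (c != null && c.getContainerState() == ContainerState.DONE) {
                containers.remove(id);
                removed.add(id);
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        Map<String, Container> containers = new ConcurrentHashMap<>();
        containers.put("container_1", () -> ContainerState.DONE);
        containers.put("container_2", () -> ContainerState.RUNNING);

        List<String> removed = removeCompleted(
            containers, Arrays.asList("container_1", "container_2", "container_3"));

        System.out.println("removed=" + removed
            + " remaining=" + containers.keySet());
    }
}
```

In this sketch the running container stays in the map even though its id was in the
acknowledged list, which is the behavior the issue description asks for.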

> Container leak on NM
> --------------------
>
>                 Key: YARN-2825
>                 URL: https://issues.apache.org/jira/browse/YARN-2825
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Jian He
>            Assignee: Jian He
>            Priority: Critical
>         Attachments: YARN-2825.1.patch, YARN-2825.1.patch
>
>
> Caused by YARN-1372. Thanks [~vinodkv] for pointing this out.
> The problem is that in YARN-1372 we changed the behavior to remove containers from NMContext
> only after the containers are acknowledged by the AM. But in the {{NodeStatusUpdaterImpl#removeCompletedContainersFromContext}}
> call, we didn't check whether the container has really completed. If the container
> is still running, we shouldn't remove it from the context.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
