hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2825) Container leak on NM
Date Fri, 07 Nov 2014 19:30:37 GMT

    [ https://issues.apache.org/jira/browse/YARN-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14202521#comment-14202521
] 

Jason Lowe commented on YARN-2825:
----------------------------------

Curious, what's the reasoning to avoid checking for the DONE state?  That's utilizing an existing
interface rather than adding a new one.  It's confusing to have two get state methods (current
vs. container, but it _is_ a container so...), although I realize it derives from the undesirable
situation of having two ContainerState types.  I suppose we could also just expose an isComplete()
predicate method since callers are only checking for COMPLETE.
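
A minimal sketch of the isComplete() idea follows. It uses hypothetical stand-in types
(IsCompleteSketch, InternalState, containerIn), not the real Hadoop classes, and reduces
the node manager's internal container states to three values. The point is only that a
predicate lets callers ask "is it finished?" without a second state getter.

```java
// Hypothetical stand-ins for illustration only; these are not the Hadoop classes.
public class IsCompleteSketch {
    // Simplified stand-in for the node manager's internal container states.
    enum InternalState { NEW, RUNNING, DONE }

    interface Container {
        // The existing getter, returning the internal state type.
        InternalState getContainerState();

        // Callers only care whether the container has finished, so a predicate
        // avoids exposing a second getter for the other state type.
        default boolean isComplete() {
            return getContainerState() == InternalState.DONE;
        }
    }

    // Helper (assumed for this sketch) that builds a container fixed in one state.
    static Container containerIn(InternalState state) {
        return () -> state;
    }

    public static void main(String[] args) {
        System.out.println(containerIn(InternalState.RUNNING).isComplete()); // false
        System.out.println(containerIn(InternalState.DONE).isComplete());    // true
    }
}
```

With this shape the caller never needs to know which of the two ContainerState types
backs the check.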

That being said I'm not totally against adding the yarn.api.records state in a new method,
just think it's not really necessary.  If we continue along that route then ContainerImpl.getCurrentState
needs the @Override annotation.  Also ContainerImpl is now an unused import, and should continue
to be unnecessary regardless of which route we pursue.


> Container leak on NM
> --------------------
>
>                 Key: YARN-2825
>                 URL: https://issues.apache.org/jira/browse/YARN-2825
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Jian He
>            Assignee: Jian He
>            Priority: Critical
>         Attachments: YARN-2825.1.patch, YARN-2825.1.patch, YARN-2825.2.patch
>
>
> Caused by YARN-1372. Thanks [~vinodkv] for pointing this out.
> The problem is that in YARN-1372 we changed the behavior to remove containers from NMContext
> only after the containers are acknowledged by the AM. But in the {{NodeStatusUpdaterImpl#removeCompletedContainersFromContext}}
> call, we didn't check whether the container is really completed or not. If the container
> is still running, we shouldn't remove the container from the context.
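
The guard described above can be sketched roughly as follows. This is an illustration
under stated assumptions, not the actual patch: RemoveCompletedSketch, State, and the
String container ids are hypothetical stand-ins for the NM context and its container map.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins for illustration only; these are not the Hadoop classes.
public class RemoveCompletedSketch {
    enum State { RUNNING, DONE }

    // Stand-in for the context's container map: container id -> current state.
    private final Map<String, State> containers = new ConcurrentHashMap<>();

    void put(String id, State state) { containers.put(id, state); }

    boolean contains(String id) { return containers.containsKey(id); }

    // Remove acknowledged containers, but only those that have actually finished.
    void removeCompletedContainers(List<String> acknowledgedIds) {
        for (String id : acknowledgedIds) {
            // Conditional remove: a no-op unless the id is currently mapped to DONE,
            // so a container that is still running stays in the context.
            containers.remove(id, State.DONE);
        }
    }

    public static void main(String[] args) {
        RemoveCompletedSketch ctx = new RemoveCompletedSketch();
        ctx.put("c1", State.RUNNING);
        ctx.put("c2", State.DONE);
        ctx.removeCompletedContainers(Arrays.asList("c1", "c2"));
        System.out.println(ctx.contains("c1")); // true: still running, kept
        System.out.println(ctx.contains("c2")); // false: completed, removed
    }
}
```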



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
