hadoop-yarn-issues mailing list archives

From "Chengbing Liu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2997) NM keeps sending finished containers to RM until app is finished
Date Wed, 31 Dec 2014 02:44:13 GMT

    [ https://issues.apache.org/jira/browse/YARN-2997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261825#comment-14261825 ]

Chengbing Liu commented on YARN-2997:
-------------------------------------

[~jianhe] This patch does not remove the containers; they are just not reported to the RM when
all of the following conditions are met:
* The application is not finished
* The container has completed and is already in {{recentlyStoppedContainers}}
* The heartbeat is a normal heartbeat with the RM, not one after an RM restart

Note that the container is not removed from the NM context. In a resync with the RM, these completed
containers will still be reported for work-preserving recovery.
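
The rule described above can be sketched roughly as follows. This is only an illustration of the three conditions, not the code in YARN-2997.patch: the class {{CompletedContainerReportSketch}}, the {{ContainerInfo}} type, the method {{selectContainersToReport}}, and the {{finishedApps}} and {{isResync}} parameters are all hypothetical names, and only {{recentlyStoppedContainers}} is borrowed from the NM.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CompletedContainerReportSketch {

    /** Hypothetical, simplified view of a container held in the NM context. */
    static final class ContainerInfo {
        final String containerId;
        final String appId;
        final boolean completed;

        ContainerInfo(String containerId, String appId, boolean completed) {
            this.containerId = containerId;
            this.appId = appId;
            this.completed = completed;
        }
    }

    // Stand-in for the NM's recentlyStoppedContainers (assumed here to be a set of IDs).
    private final Set<String> recentlyStoppedContainers = new HashSet<>();

    /**
     * Returns the IDs of the containers to report in one heartbeat.
     * The NM context itself is never modified, so a later resync can
     * still report every completed container.
     */
    List<String> selectContainersToReport(List<ContainerInfo> nmContext,
                                          Set<String> finishedApps,
                                          boolean isResync) {
        List<String> report = new ArrayList<>();
        for (ContainerInfo c : nmContext) {
            boolean appFinished = finishedApps.contains(c.appId);
            boolean alreadyReported =
                c.completed && recentlyStoppedContainers.contains(c.containerId);
            // Skip only when all three conditions from the comment hold.
            boolean skip = !appFinished && alreadyReported && !isResync;
            if (!skip) {
                report.add(c.containerId);
            }
            if (c.completed) {
                // Remember that this completed container has been seen once.
                recentlyStoppedContainers.add(c.containerId);
            }
        }
        return report;
    }

    public static void main(String[] args) {
        CompletedContainerReportSketch nm = new CompletedContainerReportSketch();
        List<ContainerInfo> context = Arrays.asList(
            new ContainerInfo("c1", "app1", true),   // completed
            new ContainerInfo("c2", "app1", false)); // still running
        Set<String> noFinishedApps = new HashSet<>();

        System.out.println(nm.selectContainersToReport(context, noFinishedApps, false)); // [c1, c2]
        System.out.println(nm.selectContainersToReport(context, noFinishedApps, false)); // [c2]
        System.out.println(nm.selectContainersToReport(context, noFinishedApps, true));  // [c1, c2]
    }
}
```

In this sketch the completed container c1 is sent on the first normal heartbeat, omitted on the next one, and sent again on a resync, because nothing is ever removed from the context.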

> NM keeps sending finished containers to RM until app is finished
> ----------------------------------------------------------------
>
>                 Key: YARN-2997
>                 URL: https://issues.apache.org/jira/browse/YARN-2997
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.6.0
>            Reporter: Chengbing Liu
>         Attachments: YARN-2997.patch
>
>
> We have seen a lot of the following in the RM log:
> {quote}
> INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler: Null container completed...
> {quote}
> It is caused by the NM sending completed containers repeatedly until the app is finished.
> On the RM side, the container is already released, hence {{getRMContainer}} returns null.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
