hadoop-yarn-issues mailing list archives

From "lujie (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-8703) Localized resource may leak on disk if container is killed while localizing
Date Thu, 30 Aug 2018 09:23:00 GMT

    [ https://issues.apache.org/jira/browse/YARN-8703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16597230#comment-16597230 ]

lujie commented on YARN-8703:
-----------------------------

 

Is the code below the suspicious part?
{code:java}
if (event.getType() == ResourceEventType.LOCALIZED) {
  if (rsrc.getLocalPath() != null) {
    try {
      stateStore.finishResourceLocalization(user, appId,
          buildLocalizedResourceProto(rsrc));
    } catch (IOException ioe) {
      LOG.error("Error storing resource state for " + rsrc, ioe);
    }
  } else {
    LOG.warn("Resource " + rsrc + " localized without a location");
  }
}
{code}
I have a question: the warning is logged when rsrc.getLocalPath() == null, but how does
"the resource bookkeeping is removed" cause rsrc.getLocalPath() to be null?

 

> Localized resource may leak on disk if container is killed while localizing
> ---------------------------------------------------------------------------
>
>                 Key: YARN-8703
>                 URL: https://issues.apache.org/jira/browse/YARN-8703
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: Jason Lowe
>            Priority: Major
>
> If a container is killed while localizing then it releases all of its resources.  If
> the resource count goes to zero and it is in the DOWNLOADING state then the resource bookkeeping
> is removed in the resource tracker.  Shortly afterwards the localizer could heartbeat in and
> report the successful localization of the resource that was just removed.  When the LocalResourcesTrackerImpl
> receives the LOCALIZED event but does not find the corresponding LocalResource for the event
> then it simply logs a "localized without a location" warning.  At that point I think the localized
> resource has been leaked on the disk since the NM has removed bookkeeping for the resource
> without removing it on disk.
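
The sequence in the issue description can be illustrated with a small toy model. This is only a sketch and not the NodeManager code: the class LocalizationLeakSketch, all of its methods, the resource key "job.jar" and the path "/cache/10/job.jar" are hypothetical names made up for illustration. The deleteOrphans flag shows one conceivable mitigation (deleting a file that is reported for a resource with no bookkeeping); it is an assumption and not necessarily the fix chosen for YARN-8703.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model of the race described in YARN-8703. This is NOT Hadoop code:
// every class, field, and method name here is hypothetical.
class LocalizationLeakSketch {

    // Stand-in for bookkeeping of in-flight downloads: key -> reference count.
    private final Map<String, Integer> downloading = new HashMap<>();
    // Stand-in for bookkeeping of localized resources: key -> local path.
    private final Map<String, String> localized = new HashMap<>();
    // Stand-in for the files present in the local cache directory.
    private final Set<String> disk = new TreeSet<>();

    // A container requests a resource; the download starts.
    void request(String key) {
        downloading.merge(key, 1, Integer::sum);
    }

    // A container releases its reference. When no references remain on a
    // resource that is still downloading, its bookkeeping is dropped.
    void release(String key) {
        Integer refs = downloading.get(key);
        if (refs == null) {
            return;
        }
        if (refs <= 1) {
            downloading.remove(key);
        } else {
            downloading.put(key, refs - 1);
        }
    }

    // The localizer finishes writing the file, independently of the tracker.
    void localizerWrites(String path) {
        disk.add(path);
    }

    // The localizer's heartbeat reports success. Returns false when no
    // bookkeeping exists for the key, which is the leak window.
    boolean reportLocalized(String key, String path, boolean deleteOrphans) {
        if (!downloading.containsKey(key)) {
            if (deleteOrphans) {
                disk.remove(path); // assumed mitigation, for illustration only
            }
            return false;
        }
        downloading.remove(key);
        localized.put(key, path);
        return true;
    }

    // Files on disk that no tracked resource refers to.
    Set<String> leakedPaths() {
        Set<String> leaked = new TreeSet<>(disk);
        leaked.removeAll(localized.values());
        return leaked;
    }

    // Replays the sequence from the issue description.
    static Set<String> runRace(boolean deleteOrphans) {
        LocalizationLeakSketch nm = new LocalizationLeakSketch();
        nm.request("job.jar");
        nm.release("job.jar");                   // container killed mid-download
        nm.localizerWrites("/cache/10/job.jar"); // download completes anyway
        nm.reportLocalized("job.jar", "/cache/10/job.jar", deleteOrphans);
        return nm.leakedPaths();
    }

    public static void main(String[] args) {
        System.out.println("leaked without cleanup: " + runRace(false));
        System.out.println("leaked with cleanup:    " + runRace(true));
    }
}
```

In this model the late report finds no entry at all for the resource, so the file stays on disk with nothing pointing at it unless the report handler deletes it.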




