hadoop-yarn-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4924) NM recovery race can lead to container not cleaned up
Date Thu, 14 Apr 2016 19:05:25 GMT

    [ https://issues.apache.org/jira/browse/YARN-4924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241734#comment-15241734 ]

Jason Lowe commented on YARN-4924:
----------------------------------

bq. leveldbIterator may also throws DBException, yes?
Yes, if the constructor throws.  That's a bug in LeveldbIterator, since the whole point of
that class is to wrap the underlying iterators and translate the runtime DBExceptions into
IOExceptions.  Arguably we should do the same for DB so clients don't have to keep catching
and translating DBException, but that's for another JIRA.
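
To illustrate the point about the constructor, here is a minimal sketch of a wrapper that translates an
unchecked store exception into IOException, including when creating the underlying iterator fails.
This is not the actual LeveldbIterator code: StoreException, TranslatingIterator and the Supplier-based
constructor are hypothetical names invented for this example, with StoreException standing in for the
runtime DBException.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

// Hypothetical stand-in for the unchecked exception thrown by the store.
class StoreException extends RuntimeException {
    StoreException(String message) {
        super(message);
    }
}

// Hypothetical wrapper: every call into the underlying iterator, including
// the call that creates it, is guarded so callers only ever see IOException.
class TranslatingIterator<T> {
    private final Iterator<T> iter;

    // The factory is invoked inside the try block, so a failure while
    // creating the underlying iterator is translated as well.
    TranslatingIterator(Supplier<Iterator<T>> factory) throws IOException {
        try {
            this.iter = factory.get();
        } catch (StoreException e) {
            throw new IOException(e.getMessage(), e);
        }
    }

    boolean hasNext() throws IOException {
        try {
            return iter.hasNext();
        } catch (StoreException e) {
            throw new IOException(e.getMessage(), e);
        }
    }

    T next() throws IOException {
        try {
            return iter.next();
        } catch (StoreException e) {
            throw new IOException(e.getMessage(), e);
        }
    }
}

// Small usage helper for the sketch above.
class TranslationDemo {
    // Returns true if a failing factory surfaces as an IOException
    // whose cause is the original StoreException.
    static boolean constructorFailureIsTranslated() {
        try {
            new TranslatingIterator<String>(() -> {
                throw new StoreException("simulated failure creating iterator");
            });
            return false;
        } catch (IOException e) {
            return e.getCause() instanceof StoreException;
        }
    }

    // Reads the first element through the wrapper.
    static String firstElement() throws IOException {
        TranslatingIterator<String> it =
            new TranslatingIterator<>(() -> Arrays.asList("a", "b").iterator());
        return it.hasNext() ? it.next() : null;
    }
}
```

With a wrapper shaped like this, a caller needs only a single catch for IOException, whether the failure
happens while opening the iterator or while stepping through it.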

+1 for the latest patch.  Committing this.

> NM recovery race can lead to container not cleaned up
> -----------------------------------------------------
>
>                 Key: YARN-4924
>                 URL: https://issues.apache.org/jira/browse/YARN-4924
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 3.0.0, 2.7.2
>            Reporter: Nathan Roberts
>            Assignee: sandflee
>         Attachments: YARN-4924.01.patch, YARN-4924.02.patch, YARN-4924.03.patch, YARN-4924.04.patch, YARN-4924.05.patch
>
>
> It's probably a small window but we observed a case where the NM crashed and then a container was not properly cleaned up during recovery.
> I will add details in first comment.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
