hadoop-yarn-issues mailing list archives

From "Jun Gong (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-2612) Some completed containers are not reported to NM
Date Fri, 26 Sep 2014 09:19:33 GMT

     [ https://issues.apache.org/jira/browse/YARN-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Gong updated YARN-2612:
    Attachment: YARN-2612.2.patch

Also changed the Capacity and FIFO schedulers.

> Some completed containers are not reported to NM
> ------------------------------------------------
>                 Key: YARN-2612
>                 URL: https://issues.apache.org/jira/browse/YARN-2612
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.6.0
>            Reporter: Jun Gong
>             Fix For: 2.6.0
>         Attachments: YARN-2612.2.patch, YARN-2612.patch
> Since YARN-1372, the NM keeps reporting completed containers to the RM until it gets an
> ACK from the RM. If the AM does not call allocate, the AM does not ack the RM, and so the
> RM does not ack the NM. We ([~chenchun]) have observed the following two cases when
> running the MapReduce 'pi' job:
> 1) The RM sends completed containers to the AM. After receiving them, the AM considers
> its work done and needs no more resources, so it does not call allocate again.
> 2) When the AM finishes, it cannot ack its own completion to the RM, because at the time
> it could still ack, the AM itself has not finished yet.
> To solve this problem, we see two solutions:
> 1) When RMAppAttempt calls FinalTransition, the AppAttempt has finished, so the RM can
> send this AppAttempt's completed containers to the NM.
> 2) In FairScheduler#nodeUpdate, if a completed container sent by the NM does not have a
> corresponding RMContainer, the RM just acks it to the NM.
> We prefer solution 2 because it is clearer and more concise. However, the RM might ack
> the same completed containers to the NM many times.
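
A toy model may make solution 2 and its caveat easier to follow. This is only a sketch under stated assumptions, not the patch: AckSketch, launch, and amAllocate are hypothetical names, container IDs are plain strings, and the model assumes the RM stops tracking a container the first time the NM reports it as completed. Only nodeUpdate borrows its name from FairScheduler#nodeUpdate, and it does not reproduce the real method.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the RM side of the ack flow. Not the real YARN classes. */
public class AckSketch {
    // Containers the RM still tracks (stand-in for "has an RMContainer").
    private final Set<String> live = new HashSet<>();
    // Completions passed on to the AM, waiting for the AM to call allocate.
    private final Set<String> awaitingAmAck = new HashSet<>();

    /** Hypothetical helper: start tracking a container. */
    public void launch(String containerId) {
        live.add(containerId);
    }

    /** NM heartbeat: returns the container IDs the RM acks back to the NM. */
    public List<String> nodeUpdate(List<String> completedFromNm) {
        List<String> acksToNm = new ArrayList<>();
        for (String id : completedFromNm) {
            if (live.remove(id)) {
                // First report: record the completion and wait for the AM.
                awaitingAmAck.add(id);
            } else {
                // Solution 2: no tracked container matches, so ack right away.
                acksToNm.add(id);
            }
        }
        return acksToNm;
    }

    /** Hypothetical AM allocate call: the AM's ack releases pending acks. */
    public List<String> amAllocate() {
        List<String> acksToNm = new ArrayList<>(awaitingAmAck);
        awaitingAmAck.clear();
        return acksToNm;
    }

    public static void main(String[] args) {
        AckSketch rm = new AckSketch();
        rm.launch("container_1");
        // First NM report: the RM waits for the AM, so nothing is acked.
        System.out.println(rm.nodeUpdate(List.of("container_1"))); // []
        // The NM retries; the container is no longer tracked, so it is acked.
        System.out.println(rm.nodeUpdate(List.of("container_1"))); // [container_1]
        // If the AM calls allocate later, the same container is acked again.
        System.out.println(rm.amAllocate()); // [container_1]
    }
}
```

In this model the NM's retry is what triggers the direct ack, so an AM that never calls allocate no longer blocks the NM. The last call shows the caveat from the description: the same container can be acked to the NM more than once.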

This message was sent by Atlassian JIRA