hadoop-yarn-issues mailing list archives

From "Jian He (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-1368) Common work to re-populate containers’ state into scheduler
Date Mon, 05 May 2014 19:35:17 GMT

     [ https://issues.apache.org/jira/browse/YARN-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jian He updated YARN-1368:
--------------------------

    Attachment: YARN-1368.1.patch

Uploaded a new patch.
- AbstractYarnScheduler#recoverContainersOnNode() does the majority of the recovery work:
it recovers RMContainer, SchedulerNode, Queue, SchedulerApplicationAttempt, and appSchedulingInfo
accordingly.
- ResourceTrackerService#handleContainerStatus is no longer needed; that is handled in
the common recovery flow.
- Changed RMAppRecoveredTransition to add the current attempt to scheduler.
- Changed a few RMAppAttempt transitions to capture the completed containers that are recovered.
- Made some modifications in CapacityScheduler so that it does not send unnecessary
app_accepted/attempt_added events to the recovered apps/attempts.
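
To illustrate the common recovery flow described above, here is a minimal sketch of the idea
behind a recoverContainersOnNode()-style method: when a node registers, the scheduler walks the
containers it reports and charges the running ones back to the node, the queue, and the
application attempt, while setting aside the completed ones. Every type and field here
(ReportedContainer, SchedulerState, the memory-only accounting) is a hypothetical stand-in and
not the actual code in the patch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these types are the real YARN classes.
final class RecoverySketch {

    /** What a node reports for one container at registration (assumed shape). */
    static final class ReportedContainer {
        final String containerId;
        final String attemptId;
        final String queue;
        final int memoryMb;
        final boolean running;

        ReportedContainer(String containerId, String attemptId, String queue,
                          int memoryMb, boolean running) {
            this.containerId = containerId;
            this.attemptId = attemptId;
            this.queue = queue;
            this.memoryMb = memoryMb;
            this.running = running;
        }
    }

    /** Minimal stand-in for the scheduler's bookkeeping. */
    static final class SchedulerState {
        final Map<String, Integer> usedByNode = new HashMap<>();
        final Map<String, Integer> usedByQueue = new HashMap<>();
        final Map<String, Integer> usedByAttempt = new HashMap<>();
        final List<String> completedContainers = new ArrayList<>();

        /** Re-populate usage for every running container reported by the node. */
        void recoverContainersOnNode(String nodeId, List<ReportedContainer> reports) {
            for (ReportedContainer c : reports) {
                if (!c.running) {
                    // Completed containers are only recorded, not charged.
                    completedContainers.add(c.containerId);
                    continue;
                }
                usedByNode.merge(nodeId, c.memoryMb, Integer::sum);
                usedByQueue.merge(c.queue, c.memoryMb, Integer::sum);
                usedByAttempt.merge(c.attemptId, c.memoryMb, Integer::sum);
            }
        }
    }

    public static void main(String[] args) {
        SchedulerState state = new SchedulerState();
        List<ReportedContainer> reports = new ArrayList<>();
        reports.add(new ReportedContainer("c1", "attempt1", "default", 1024, true));
        reports.add(new ReportedContainer("c2", "attempt1", "default", 2048, true));
        reports.add(new ReportedContainer("c3", "attempt2", "default", 512, false));
        state.recoverContainersOnNode("node1", reports);
        System.out.println(state.usedByNode.get("node1")); // prints 3072
    }
}
```

Keeping this logic in one shared place means each scheduler only has to supply its own
queue-level handling.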

Todo:
- Replace the containerStatus sent across via NM registration with a new object which captures
the resource capability of the container.
- FSQueue needs to implement its own recoverContainer method
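
As a rough sketch of how the two Todo items fit together: a report object that carries the
container's resource capability gives a queue enough information to charge the recovered
container to itself and its ancestors. The names below (ContainerReport, RecoverableQueue,
SimpleQueue) and the memory/vcores fields are assumptions made for illustration only; they are
not the real YARN or FSQueue types.

```java
// Hypothetical sketch; names and fields are illustrative, not the YARN API.
final class TodoSketch {

    /** Assumed replacement for a bare container status: adds the resource capability. */
    static final class ContainerReport {
        final String containerId;
        final int memoryMb;
        final int vcores;

        ContainerReport(String containerId, int memoryMb, int vcores) {
            this.containerId = containerId;
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }
    }

    /** Each queue type supplies its own recovery logic. */
    interface RecoverableQueue {
        void recoverContainer(ContainerReport report);
    }

    /** A queue that charges recovered usage to itself and then to its parent, if any. */
    static final class SimpleQueue implements RecoverableQueue {
        final String name;
        final SimpleQueue parent; // null for the root queue
        int usedMemoryMb;
        int usedVcores;

        SimpleQueue(String name, SimpleQueue parent) {
            this.name = name;
            this.parent = parent;
        }

        @Override
        public void recoverContainer(ContainerReport report) {
            usedMemoryMb += report.memoryMb;
            usedVcores += report.vcores;
            if (parent != null) {
                parent.recoverContainer(report);
            }
        }
    }

    public static void main(String[] args) {
        SimpleQueue root = new SimpleQueue("root", null);
        SimpleQueue leaf = new SimpleQueue("root.default", root);
        leaf.recoverContainer(new ContainerReport("c1", 1024, 1));
        System.out.println(root.usedMemoryMb + " MB, " + root.usedVcores + " vcores");
    }
}
```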

> Common work to re-populate containers’ state into scheduler
> -----------------------------------------------------------
>
>                 Key: YARN-1368
>                 URL: https://issues.apache.org/jira/browse/YARN-1368
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Bikas Saha
>            Assignee: Anubhav Dhoot
>         Attachments: YARN-1368.1.patch, YARN-1368.preliminary.patch
>
>
> YARN-1367 adds support for the NM to tell the RM about all currently running containers
> upon registration. The RM needs to send this information to the schedulers along with the
> NODE_ADDED_EVENT so that the schedulers can recover the current allocation state of the cluster.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
