hadoop-yarn-issues mailing list archives

From "Karthik Kambatla (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-556) RM Restart phase 2 - Work preserving restart
Date Sat, 22 Mar 2014 00:34:46 GMT

    [ https://issues.apache.org/jira/browse/YARN-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943789#comment-13943789
] 

Karthik Kambatla commented on YARN-556:
---------------------------------------

Thanks for posting the design doc, [~bikassaha]. [~adhoot] and I have been working on this
for the past few days towards an initial prototype, so we get a handle on all the items required.

In terms of actual work-items (JIRAs), I wonder if it makes sense to work in a branch. Making
the AM and NM resync changes without the scheduler changes would break things. We could do
the scheduler changes first, while nothing calls them yet, and add resync later, but I suppose
that would make it hard to test outside of unit tests.

Thoughts? 

> RM Restart phase 2 - Work preserving restart
> --------------------------------------------
>
>                 Key: YARN-556
>                 URL: https://issues.apache.org/jira/browse/YARN-556
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: resourcemanager
>            Reporter: Bikas Saha
>            Assignee: Bikas Saha
>         Attachments: Work Preserving RM Restart.pdf
>
>
> YARN-128 covered storing the state needed for the RM to recover critical information.
> This umbrella jira will track changes needed to recover the running state of the cluster so
> that work can be preserved across RM restarts.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
