hadoop-yarn-issues mailing list archives

From "Bikas Saha (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-230) Make changes for RM restart phase 1
Date Wed, 12 Dec 2012 22:31:22 GMT

    [ https://issues.apache.org/jira/browse/YARN-230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13530439#comment-13530439 ]

Bikas Saha commented on YARN-230:

Thanks for the reviews, guys!

Sorry for the delayed response. I was travelling in between and lost track of your comment.
I have removed the removeApplicationAttempt() method from the API.
I agree the retry attempt counting can be handled in YARN-218.

I like the idea of using a NullStore and simplifying the RMAppAttempt state machine. Done.
I was thinking of some of the other refactorings myself for later but now I have incorporated
them into the current patch.
RMStateStore is not a service, and hence I have not renamed closeInternal() and initInternal().
The RM now dies when there is a store-related error. I am using ExitUtil from Common (the same
approach as the NameNode code).
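
For readers following along, here is a rough sketch of the two ideas above: a do-nothing
store so callers never need to check whether recovery is enabled, and fail-fast handling
of store errors. It is only an illustration, not code from the patch. StateStore,
MemoryStore, FailingStore, FatalHandler and AppRecorder are hypothetical names, and the
method signatures are simplified assumptions rather than the real RMStateStore API. The
FatalHandler hook stands in for the process exit that the patch performs via ExitUtil.

```java
import java.util.HashMap;
import java.util.Map;

public class StoreSketch {

    /** Hypothetical, simplified store API (not the real RMStateStore signatures). */
    interface StateStore {
        void storeApplication(String appId, String state);
        Map<String, String> loadState();
    }

    /** Null-object store: accepts every write and remembers nothing. */
    static final class NullStore implements StateStore {
        @Override public void storeApplication(String appId, String state) { }
        @Override public Map<String, String> loadState() { return new HashMap<>(); }
    }

    /** In-memory store standing in for a persistent implementation. */
    static final class MemoryStore implements StateStore {
        private final Map<String, String> apps = new HashMap<>();
        @Override public void storeApplication(String appId, String state) {
            apps.put(appId, state);
        }
        @Override public Map<String, String> loadState() { return new HashMap<>(apps); }
    }

    /** Store that always fails; used only to exercise the error path. */
    static final class FailingStore implements StateStore {
        @Override public void storeApplication(String appId, String state) {
            throw new IllegalStateException("simulated store failure");
        }
        @Override public Map<String, String> loadState() { return new HashMap<>(); }
    }

    /** Hypothetical fatal-error hook; a real RM would terminate the process here. */
    interface FatalHandler {
        void terminate(int status, String message);
    }

    /** Caller that always has a store, so it needs no "is recovery enabled" branch. */
    static final class AppRecorder {
        private final StateStore store;
        private final FatalHandler onFatal;

        AppRecorder(StateStore store, FatalHandler onFatal) {
            this.store = store;
            this.onFatal = onFatal;
        }

        void record(String appId, String state) {
            try {
                store.storeApplication(appId, state);
            } catch (RuntimeException e) {
                // Fail fast: any store error is treated as fatal.
                onFatal.terminate(1, "State store error for " + appId + ": " + e.getMessage());
            }
        }
    }

    public static void main(String[] args) {
        FatalHandler exit = (status, msg) -> {
            System.err.println(msg);
            System.exit(status);
        };

        StateStore memory = new MemoryStore();
        new AppRecorder(memory, exit).record("app_1", "RUNNING");
        System.out.println("memory store: " + memory.loadState());

        StateStore none = new NullStore();
        new AppRecorder(none, exit).record("app_1", "RUNNING");
        System.out.println("null store: " + none.loadState());
    }
}
```

With a NullStore plugged in, the calling code is identical whether or not state is being
saved, which is the kind of simplification the null-object approach is meant to give.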

Please comment if there are any more issues.

> Make changes for RM restart phase 1
> -----------------------------------
>                 Key: YARN-230
>                 URL: https://issues.apache.org/jira/browse/YARN-230
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Bikas Saha
>            Assignee: Bikas Saha
>         Attachments: PB-impl.patch, Recovery.patch, Store.patch, Test.patch, YARN-230.1.patch,
>                      YARN-230.4.patch, YARN-230.5.patch
>
> As described in YARN-128, phase 1 of RM restart puts in place mechanisms to save application
> state and read it back after restart. Upon restart, the NMs are asked to reboot and the
> previously running AMs are restarted.
> After this is done, RM HA and work-preserving restart can continue in parallel. For more
> details, please refer to the design document in YARN-128.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
