[ https://issues.apache.org/jira/browse/YARN-230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532597#comment-13532597 ]
Bikas Saha commented on YARN-230:
---------------------------------
A note on testing. Among other test changes, there is a functional test that takes the RM
through different scenarios of applications being stored, run and re-run, as well as nodes
heartbeating and reconnecting on restart. I have manually tested the scenarios on a
single-node setup with both the ZK and FileSystem implementations of the RMStateStore.
Arinto has run the code on a cluster using ZK for storage and verified that it works as expected.
https://issues.apache.org/jira/browse/YARN-128?focusedCommentId=13505615&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13505615
> Make changes for RM restart phase 1
> -----------------------------------
>
> Key: YARN-230
> URL: https://issues.apache.org/jira/browse/YARN-230
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: resourcemanager
> Reporter: Bikas Saha
> Assignee: Bikas Saha
> Attachments: PB-impl.patch, Recovery.patch, Store.patch, Test.patch, YARN-230.1.patch, YARN-230.4.patch, YARN-230.5.patch
>
>
> As described in YARN-128, phase 1 of RM restart puts in place mechanisms to save application
> state and read it back after restart. Upon restart, the NMs are asked to reboot and the
> previously running AMs are restarted.
> After this is done, RM HA and work-preserving restart can continue in parallel. For more
> details, please refer to the design document in YARN-128.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira