hadoop-yarn-issues mailing list archives

From "Rohith (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3410) YARN admin should be able to remove individual application records from RMStateStore
Date Thu, 02 Apr 2015 05:24:53 GMT

    [ https://issues.apache.org/jira/browse/YARN-3410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14392173#comment-14392173 ]

Rohith commented on YARN-3410:

For the state store format in YARN-2131, there was a discussion about whether to format the state store through the admin
service or through a resourcemanager startup option [comment link|https://issues.apache.org/jira/browse/YARN-2131?focusedCommentId=14032694&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14032694].
Similarly, I am thinking of two options for deleting application state:
# ./yarn resourcemanager -delete-from-state-store app-id OR
# ./yarn rmadmin -delete-from-state-store app-id
The 1st choice is a pretty straightforward deletion, regardless of whether the app state is finished or running.
I would like to choose the 2nd option.
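
To make the difference between the two choices concrete, here is a minimal sketch in Python. It is not RM code: InMemoryStateStore, AppState and both remove methods are hypothetical names, and the state check in the second method is only an assumption about what an admin-service path could do, since the comment above does not spell that out.

```python
from enum import Enum


class AppState(Enum):
    RUNNING = "RUNNING"
    FINISHED = "FINISHED"


class InMemoryStateStore:
    """Hypothetical stand-in for RMStateStore: maps app id -> stored app state."""

    def __init__(self, records):
        self._records = dict(records)

    def app_ids(self):
        return sorted(self._records)

    def remove_unconditionally(self, app_id):
        # Models the 1st choice: the record is deleted whatever its state is.
        if app_id not in self._records:
            raise KeyError(f"no record for {app_id}")
        del self._records[app_id]

    def remove_if_finished(self, app_id):
        # Assumed behaviour for an admin-service path: refuse to delete
        # the record of an app that is still running.
        if app_id not in self._records:
            raise KeyError(f"no record for {app_id}")
        if self._records[app_id] is not AppState.FINISHED:
            return False
        del self._records[app_id]
        return True
```

With a store holding one running and one finished app, remove_if_finished only drops the finished one, while remove_unconditionally drops either.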

> YARN admin should be able to remove individual application records from RMStateStore
> ------------------------------------------------------------------------------------
>                 Key: YARN-3410
>                 URL: https://issues.apache.org/jira/browse/YARN-3410
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager, yarn
>            Reporter: Wangda Tan
>            Assignee: Rohith
>            Priority: Critical
> When the RM state store enters an unexpected state (one example is YARN-2340, where an attempt is not in a final state but the app has already completed), the RM can never start up unless the RMStateStore is formatted.
> I think we should support removing individual application records from RMStateStore to unblock the RM, so the admin can choose between waiting for a fix and formatting the state store.
> In addition, the RM should be able to report all fatal errors (which will shut down the RM) when doing app recovery; this can save the admin some time when removing apps in a bad state.
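
The last point in the description (report all fatal errors during app recovery instead of stopping at the first one) could look roughly like the sketch below. It is an illustration only, not RM code: recover_all, strict_recover and the record fields app_final and attempt_final are hypothetical, with the check loosely modelled on the YARN-2340 example.

```python
def recover_all(stored_apps, recover_one):
    """Try to recover every stored app; return (recovered_ids, errors_by_app_id)."""
    recovered, errors = [], {}
    for app_id, record in stored_apps.items():
        try:
            recover_one(app_id, record)
            recovered.append(app_id)
        except Exception as exc:
            # Keep going so every bad record is reported in one pass.
            errors[app_id] = str(exc)
    return recovered, errors


def strict_recover(app_id, record):
    # Hypothetical consistency check: a completed app whose attempt
    # is not in a final state cannot be recovered.
    if record["app_final"] and not record["attempt_final"]:
        raise ValueError("attempt not in final state for completed app")
```

The returned error map gives the admin the full list of app ids to remove in one go.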

This message was sent by Atlassian JIRA
