hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3410) YARN admin should be able to remove individual application records from RMStateStore
Date Fri, 10 Apr 2015 06:48:12 GMT

    [ https://issues.apache.org/jira/browse/YARN-3410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14489052#comment-14489052 ]

Wangda Tan commented on YARN-3410:
----------------------------------

Hi [~rohithsharma],
Patch generally looks good, two minor comments:
1) In {{echo "...Use -remove-application-from-state-store for removing"}}, could you add <appId> after -remove-application-from-state-store to tell the user that another argument is required?
2) 
{code}
      } else if (argv.length == 2
          && argv[0].equals("-remove-application-from-stare-store")) {
{code}
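You need to check argv[0] first and then check argv.length == 2 separately. Combining both in one "&&" condition prevents the user from getting a precise error message when the option is correct but the argument count is wrong.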
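
A minimal sketch of the ordering I have in mind, not the actual RMAdminCLI code. The class name, the {{check}} helper, and the message strings are hypothetical and only for illustration; the option name is the one the echo message above uses:

```java
class RemoveAppArgCheck {
    // Option name as spelled in the usage message; everything else here is hypothetical.
    static final String OPTION = "-remove-application-from-state-store";

    /** Returns a message describing how the arguments were interpreted. */
    static String check(String[] argv) {
        // Step 1: recognize the option by name only.
        if (argv.length >= 1 && argv[0].equals(OPTION)) {
            // Step 2: validate the argument count separately, so a missing
            // <appId> yields a usage message specific to this option.
            if (argv.length != 2) {
                return "Usage: " + OPTION + " <appId>";
            }
            return "Removing application " + argv[1];
        }
        // Anything else falls through to the generic handling.
        return "Unknown option";
    }

    public static void main(String[] args) {
        // Option given without an appId: the specific usage message is returned.
        System.out.println(check(new String[] {OPTION}));
        // Option given with a (made-up) appId.
        System.out.println(check(new String[] {OPTION, "application_1_0001"}));
    }
}
```

With the combined "&&" condition, the first call would fall through to the generic branch instead of printing the option-specific usage.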

Also, could you try running the patch on a local cluster to see if the CLI works?

> YARN admin should be able to remove individual application records from RMStateStore
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-3410
>                 URL: https://issues.apache.org/jira/browse/YARN-3410
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager, yarn
>            Reporter: Wangda Tan
>            Assignee: Rohith
>            Priority: Critical
>         Attachments: 0001-YARN-3410-v1.patch
>
>
> When the RM state store enters an unexpected state (one example is YARN-2340, where an attempt is not in a final state but the app has already completed), the RM can never come up unless the RMStateStore is formatted.
> I think we should support removing individual application records from the RMStateStore, so that the RM admin can choose between waiting for a fix and formatting the state store.
> In addition, the RM should be able to report all fatal errors (those that will shut down the RM) when doing app recovery; this can save the admin some time when removing apps in a bad state.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
