hadoop-yarn-issues mailing list archives

From "Junping Du (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1354) Recover applications upon nodemanager restart
Date Thu, 31 Jul 2014 02:53:39 GMT

    [ https://issues.apache.org/jira/browse/YARN-1354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080415#comment-14080415 ]

Junping Du commented on YARN-1354:
----------------------------------

Thanks [~jlowe] for the additional explanation, which sounds good to me. As for the changes to the
Writable credentials, I think we already have a similar JIRA, YARN-668, which discusses TokenIdentifier.
One way is to wrap it in a PB object (keeping the Writable fields as bytes), which could be something
like the following:
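{code}
// Sketch only: the message name, field names, types, and tags are placeholders.
message CredentialsProto {
  required bytes credential = 1;    // existing Writable fields, kept as raw bytes
  optional bytes new_field_1 = 2;
  optional bytes new_field_2 = 3;
}
{code}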
The other approach is to keep it as a Writable but do some extra work in readFields() to handle
the case where a new field is missing. Any preference? However, this is somewhat outside the scope
of this JIRA; maybe YARN-668 is a better place for that discussion.
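To make the second approach concrete, here is a minimal sketch of a readFields() that tolerates a missing trailing field. It is not taken from the patch or from YARN-668: VersionedCredential, its secret and expiry fields, and the EXPIRY_UNSET default are all hypothetical, and the class only mimics the Writable write()/readFields() signatures using java.io so that it runs without Hadoop on the classpath. It also assumes the record is the last thing in its stream (or is length-delimited), since it detects the old format by hitting end-of-stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a Writable credential; not a Hadoop class.
class VersionedCredential {
    static final long EXPIRY_UNSET = -1L;

    byte[] secret = new byte[0];   // field that exists in the old format
    long expiry = EXPIRY_UNSET;    // hypothetical field added in a later version

    // Old layout: length-prefixed secret only.
    void writeOldFormat(DataOutput out) throws IOException {
        out.writeInt(secret.length);
        out.write(secret);
    }

    // New layout: the old layout followed by the new field.
    void write(DataOutput out) throws IOException {
        writeOldFormat(out);
        out.writeLong(expiry);
    }

    // Reads either layout; a missing trailing field falls back to the default.
    void readFields(DataInput in) throws IOException {
        secret = new byte[in.readInt()];
        in.readFully(secret);
        try {
            expiry = in.readLong();
        } catch (EOFException e) {
            // Data written by an older version: the new field is absent.
            expiry = EXPIRY_UNSET;
        }
    }

    // Helper: serialize to a byte array in the chosen layout.
    static byte[] encode(VersionedCredential c, boolean oldFormat) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            if (oldFormat) {
                c.writeOldFormat(out);
            } else {
                c.write(out);
            }
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper: deserialize from a byte array holding exactly one record.
    static VersionedCredential decode(byte[] bytes) {
        try {
            VersionedCredential c = new VersionedCredential();
            c.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
            return c;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        VersionedCredential c = new VersionedCredential();
        c.secret = new byte[] {1, 2, 3};
        c.expiry = 42L;
        System.out.println(decode(encode(c, true)).expiry);   // old-format bytes
        System.out.println(decode(encode(c, false)).expiry);  // new-format bytes
    }
}
```

The trade-off against the PB wrapper is that every future field needs its own handling in readFields(), and the end-of-stream check breaks down as soon as another record follows in the same stream.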
Overall the patch looks good. Some minor comments:
{code}
+  public abstract RecoveredApplicationsState loadApplicationsState()
+      throws IOException;
+
+  public abstract void storeApplication(ApplicationId appId,
+      ContainerManagerApplicationProto p) throws IOException;
+
+  public abstract void finishApplication(ApplicationId appId)
+      throws IOException;
+
+  public abstract void removeApplication(ApplicationId appId)
+      throws IOException;
+
{code}
Shall we rename finishApplication() to storeFinishedApplication(), which more precisely
describes the actual work done in the store layer (just as we use storeApplication() instead
of startApplication())?
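To illustrate the naming point, the sketch below shows how the store-layer API might read with the proposed name. It is an in-memory stand-in, not code from the patch: InMemoryAppStateStore is hypothetical, plain String IDs and byte[] payloads replace ApplicationId and ContainerManagerApplicationProto, and the two load methods are a simplified guess at what RecoveredApplicationsState would carry.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for the nodemanager state store.
class InMemoryAppStateStore {
    private final Map<String, byte[]> apps = new LinkedHashMap<>();
    private final Set<String> finished = new LinkedHashSet<>();

    // Persist the application record (the store does not "start" anything).
    void storeApplication(String appId, byte[] serializedApp) {
        apps.put(appId, serializedApp.clone());
    }

    // Persist the fact that the application finished (proposed rename).
    void storeFinishedApplication(String appId) {
        finished.add(appId);
    }

    // Drop everything recorded for the application.
    void removeApplication(String appId) {
        apps.remove(appId);
        finished.remove(appId);
    }

    // IDs of all stored applications, in insertion order.
    List<String> loadApplicationIds() {
        return new ArrayList<>(apps.keySet());
    }

    // IDs of applications recorded as finished, in insertion order.
    List<String> loadFinishedApplicationIds() {
        return new ArrayList<>(finished);
    }

    public static void main(String[] args) {
        InMemoryAppStateStore store = new InMemoryAppStateStore();
        store.storeApplication("app_1", new byte[] {1});
        store.storeApplication("app_2", new byte[] {2});
        store.storeFinishedApplication("app_1");
        System.out.println(store.loadApplicationIds());
        System.out.println(store.loadFinishedApplicationIds());
        store.removeApplication("app_1");
        System.out.println(store.loadApplicationIds());
    }
}
```

With this naming, every mutating method states what is written to or deleted from the store, rather than what happens to the application itself.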

> Recover applications upon nodemanager restart
> ---------------------------------------------
>
>                 Key: YARN-1354
>                 URL: https://issues.apache.org/jira/browse/YARN-1354
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>    Affects Versions: 2.3.0
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>         Attachments: YARN-1354-v1.patch, YARN-1354-v2-and-YARN-1987-and-YARN-1362.patch, YARN-1354-v3.patch, YARN-1354-v4.patch, YARN-1354-v5.patch
>
>
> The set of active applications in the nodemanager context needs to be recovered for work-preserving nodemanager restart.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
