hadoop-yarn-issues mailing list archives

From "Junping Du (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3449) Recover appTokenKeepAliveMap upon nodemanager restart
Date Mon, 06 Apr 2015 18:45:12 GMT

    [ https://issues.apache.org/jira/browse/YARN-3449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14481606#comment-14481606 ]

Junping Du commented on YARN-3449:

bq. Again when the NM re-registers it will report all active applications, and the RM will
attempt to correct this on the next heartbeat. 
You are right, [~jlowe]. I missed that CLEANUP_APP would be resent on node reconnection
(I totally forgot about it for some reason), so that shouldn't be a problem. BTW, I haven't
seen any actual failure from this, so I will resolve the issue as invalid.

> Recover appTokenKeepAliveMap upon nodemanager restart
> -----------------------------------------------------
>                 Key: YARN-3449
>                 URL: https://issues.apache.org/jira/browse/YARN-3449
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>    Affects Versions: 2.6.0, 2.7.0
>            Reporter: Junping Du
>            Assignee: Junping Du
> appTokenKeepAliveMap in NodeStatusUpdaterImpl is used to keep an application alive after
the application has finished while the NM still needs the app token for log aggregation (when
security and log aggregation are both enabled).
> Applications are inserted into this map only when the NM receives getApplicationsToCleanup()
from the RM heartbeat response, and the RM sends this information only once, in RMNodeImpl.updateNodeHeartbeatResponseForCleanup().
Work-preserving NM restart should store appTokenKeepAliveMap in the NMStateStore and recover it
after restart. Without this, the RM could terminate the application early, so log aggregation
could fail if security is enabled.
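
For illustration only, the persist-and-recover idea in the description above could be sketched roughly as follows. This is not the Hadoop code: the Store interface, InMemoryStore, KeepAliveTracker, and all method names are hypothetical stand-ins, and the meaning of the stored timestamp is left open. The only point shown is that each map update is written through to a store, and a freshly constructed tracker (simulating an NM restart) reloads the map from it.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch under stated assumptions; none of these types exist in Hadoop.
class KeepAliveSketch {

    // Hypothetical stand-in for a persistent NM state store.
    interface Store {
        void put(String appId, long timestampMillis);
        void remove(String appId);
        Map<String, Long> load();
    }

    // In-memory substitute so the sketch runs without any real storage.
    static class InMemoryStore implements Store {
        private final Map<String, Long> persisted = new HashMap<>();

        public void put(String appId, long timestampMillis) {
            persisted.put(appId, timestampMillis);
        }

        public void remove(String appId) {
            persisted.remove(appId);
        }

        public Map<String, Long> load() {
            // Return a copy so callers cannot mutate the "persisted" state.
            return new HashMap<>(persisted);
        }
    }

    // Hypothetical tracker that mirrors every map change into the store.
    static class KeepAliveTracker {
        private final Map<String, Long> appTokenKeepAliveMap = new HashMap<>();
        private final Store store;

        KeepAliveTracker(Store store) {
            this.store = store;
            // Recovery step: repopulate the in-memory map from the store.
            appTokenKeepAliveMap.putAll(store.load());
        }

        void track(String appId, long timestampMillis) {
            store.put(appId, timestampMillis); // persist first
            appTokenKeepAliveMap.put(appId, timestampMillis);
        }

        void untrack(String appId) {
            store.remove(appId);
            appTokenKeepAliveMap.remove(appId);
        }

        Map<String, Long> snapshot() {
            return new HashMap<>(appTokenKeepAliveMap);
        }
    }

    public static void main(String[] args) {
        InMemoryStore store = new InMemoryStore();

        KeepAliveTracker beforeRestart = new KeepAliveTracker(store);
        beforeRestart.track("application_1_0001", 1000L);
        beforeRestart.track("application_1_0002", 2000L);
        beforeRestart.untrack("application_1_0002");

        // Simulated restart: a new tracker built on the same store.
        KeepAliveTracker afterRestart = new KeepAliveTracker(store);
        System.out.println(afterRestart.snapshot());
    }
}
```

Writing to the store before updating the in-memory map is a design choice in this sketch, made so that a crash between the two steps cannot leave an entry that exists only in memory.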

This message was sent by Atlassian JIRA
