hadoop-yarn-issues mailing list archives

From "Rohith (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1366) ApplicationMasterService should Resync with the AM upon allocate call after restart
Date Wed, 21 May 2014 03:13:40 GMT

    [ https://issues.apache.org/jira/browse/YARN-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004279#comment-14004279 ]

Rohith commented on YARN-1366:
------------------------------

bq. Catching incorrect unregistration before registration should have always been there. Is
this a regression in the patch or an existing bug?
This is not a bug in the existing code. Unregister in ApplicationMasterService checks whether the app
is registered; otherwise it throws InvalidApplicationMasterRequestException.

bq. Should we consider the possibility of allowing unregister without register?
Yes, because two cases need to be differentiated: (1) the AM sends its last heartbeat to the RM,
the RM restarts, and the AM then unregisters the application, versus (2) the application
master sends unregister without ever having registered.
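
To make the two cases concrete, here is a minimal sketch in Java. It is not the real ApplicationMasterService: the class names, the `restart()` method, and the exception type are all hypothetical, and it assumes, purely for illustration, that a restart discards the in-memory registration state. Under that assumption, a registered-only check cannot tell the two cases apart.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the exception named above; not the Hadoop class.
// Unchecked here only to keep the sketch short.
class InvalidRequestSketchException extends RuntimeException {
    InvalidRequestSketchException(String message) {
        super(message);
    }
}

// Hypothetical, simplified model of the registration check.
class AmServiceSketch {
    private final Set<String> registeredAttempts = new HashSet<>();

    void register(String attemptId) {
        registeredAttempts.add(attemptId);
    }

    // Assumption for illustration: a restart loses in-memory registrations.
    void restart() {
        registeredAttempts.clear();
    }

    // Rejects unregister when the attempt is not known to be registered.
    void unregister(String attemptId) {
        if (!registeredAttempts.remove(attemptId)) {
            throw new InvalidRequestSketchException(
                "Attempt " + attemptId + " is not registered");
        }
    }
}

class UnregisterSketchDemo {
    public static void main(String[] args) {
        AmServiceSketch rm = new AmServiceSketch();

        // Case 1: AM registered, RM restarted, AM now unregisters.
        rm.register("attempt_1");
        rm.restart();
        try {
            rm.unregister("attempt_1");
        } catch (InvalidRequestSketchException e) {
            System.out.println("case 1 rejected: " + e.getMessage());
        }

        // Case 2: AM never registered and sends unregister.
        try {
            rm.unregister("attempt_2");
        } catch (InvalidRequestSketchException e) {
            System.out.println("case 2 rejected: " + e.getMessage());
        }
    }
}
```

In this sketch both cases hit the same rejection path, which is the ambiguity the comment points at.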

> ApplicationMasterService should Resync with the AM upon allocate call after restart
> -----------------------------------------------------------------------------------
>
>                 Key: YARN-1366
>                 URL: https://issues.apache.org/jira/browse/YARN-1366
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Bikas Saha
>            Assignee: Rohith
>         Attachments: YARN-1366.1.patch, YARN-1366.2.patch, YARN-1366.patch, YARN-1366.prototype.patch, YARN-1366.prototype.patch
>
>
> The ApplicationMasterService currently sends a resync response, to which the AM responds
> by shutting down. The AM behavior is expected to change to resyncing with the RM.
> Resync means resetting the allocate RPC sequence number to 0, after which the AM should send its entire
> outstanding request to the RM. Note that if the AM is making its first allocate call to the
> RM, then things should proceed as normal without needing a resync. The RM will return all
> containers that have completed since the RM last synced with the AM. Some container completions
> may be reported more than once.
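
The resync behavior described in the issue can be sketched from the AM side as follows. This is an illustrative model only, not the patch and not the Hadoop client API: `AmResyncSketch`, its fields, and its methods are hypothetical names, and requests and container IDs are plain strings. It shows the three points in the description: the sequence number is reset to 0, the entire outstanding request set is resent, and duplicate completion reports are tolerated.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical AM-side state; names are illustrative, not the Hadoop API.
class AmResyncSketch {
    private int responseId = 0; // allocate RPC sequence number
    private final List<String> outstandingRequests = new ArrayList<>();
    private final List<String> pendingAsk = new ArrayList<>();
    private final Set<String> completedSeen = new HashSet<>();

    // Records a new request; it is sent on the next allocate call.
    void addRequest(String request) {
        outstandingRequests.add(request);
        pendingAsk.add(request);
    }

    // Normal allocate call: sends only requests not yet sent.
    List<String> nextAllocate() {
        List<String> ask = new ArrayList<>(pendingAsk);
        pendingAsk.clear();
        responseId++;
        return ask;
    }

    // Resync: reset the sequence number to 0 and queue every
    // outstanding request so the next allocate call resends all of them.
    void onResync() {
        responseId = 0;
        pendingAsk.clear();
        pendingAsk.addAll(outstandingRequests);
    }

    // Completions may be reported more than once; returns true only
    // the first time a given container ID is seen.
    boolean onContainerCompleted(String containerId) {
        return completedSeen.add(containerId);
    }

    int getResponseId() {
        return responseId;
    }
}

class ResyncSketchDemo {
    public static void main(String[] args) {
        AmResyncSketch am = new AmResyncSketch();
        am.addRequest("r1");
        am.addRequest("r2");
        am.nextAllocate();      // sends r1 and r2
        am.addRequest("r3");
        am.onResync();          // RM asked for a resync
        System.out.println(am.getResponseId());
        System.out.println(am.nextAllocate());
    }
}
```

Tracking seen container IDs in a set is one simple way to make completion handling idempotent; the issue text only says duplicates may occur, not how the AM should handle them.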



--
This message was sent by Atlassian JIRA
(v6.2#6252)
