hadoop-yarn-issues mailing list archives

From "Rohith (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-1366) ApplicationMasterService should Resync with the AM upon allocate call after restart
Date Fri, 23 May 2014 10:54:01 GMT

     [ https://issues.apache.org/jira/browse/YARN-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohith updated YARN-1366:
-------------------------

    Attachment: YARN-1366.3.patch

I updated the patch with the changes below.

   bq. Pending releases - AM forgets about a request to release once it's made. We will have to reissue a release request after RM restart
      FIXED
   bq. Blacklisting has logic in ignoreBlacklisting to ignore it if we cross a threshold.
      FIXED 
   bq. There are a few places where the line exceeds 80 chars
      Even after running the formatter, these lines do not come under 80 chars.
       Ex : Line 209 at RMContainerRequestor
            Line 267 at AMRMClientImpl

Apart from the fixes above, the other changes are:
* AMRMClient
**  AMRMClient maintains the blacklisted nodes. These are sent back to the RM on resync.
**  Added a test to check this functionality.

* MapReduce
** Added a test that depends on the YARN-1365 patch. The YARN-1365 patch must be applied to run this test.
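
As a rough illustration of the two review points above (pending releases and the blacklist must survive an RM restart), here is a minimal sketch. It is not code from the patch: `PendingReleaseTracker` and all of its methods are hypothetical names, and container IDs and node names are plain strings rather than YARN types. The idea is only that the AM keeps remembering a release until the container is reported complete, and keeps its blacklist, so that both can be sent again on resync.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not part of the Hadoop API or the patch.
final class PendingReleaseTracker {
    // Releases the AM has asked for but has not yet seen confirmed as completed.
    private final Set<String> pendingReleases = new LinkedHashSet<>();
    // Nodes the AM currently wants blacklisted.
    private final Set<String> blacklistedNodes = new LinkedHashSet<>();

    // Remember the release instead of forgetting it once it is sent.
    void requestRelease(String containerId) {
        pendingReleases.add(containerId);
    }

    // Once the container is reported complete, the release no longer needs reissuing.
    void containerCompleted(String containerId) {
        pendingReleases.remove(containerId);
    }

    void blacklist(String node) {
        blacklistedNodes.add(node);
    }

    void removeFromBlacklist(String node) {
        blacklistedNodes.remove(node);
    }

    // Copies of the state to send again after the RM asks for a resync.
    Set<String> releasesToReissue() {
        return new LinkedHashSet<>(pendingReleases);
    }

    Set<String> blacklistToResend() {
        return new LinkedHashSet<>(blacklistedNodes);
    }
}
```
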

Please review the patch.

> ApplicationMasterService should Resync with the AM upon allocate call after restart
> -----------------------------------------------------------------------------------
>
>                 Key: YARN-1366
>                 URL: https://issues.apache.org/jira/browse/YARN-1366
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Bikas Saha
>            Assignee: Rohith
>         Attachments: YARN-1366.1.patch, YARN-1366.2.patch, YARN-1366.3.patch, YARN-1366.patch,
YARN-1366.prototype.patch, YARN-1366.prototype.patch
>
>
> The ApplicationMasterService currently sends a resync response to which the AM responds
by shutting down. The AM behavior is expected to change to resyncing with the RM.
Resync means resetting the allocate RPC sequence number to 0 and the AM sending its entire
outstanding request to the RM. Note that if the AM is making its first allocate call to the
RM then things should proceed as normal without needing a resync. The RM will return all
containers that have completed since the RM last synced with the AM. Some container completions
may be reported more than once.
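
The resync behavior described above could be sketched roughly as follows. This is a simplified model, not the patch itself: `ResyncingAllocator` and its methods are hypothetical, resource requests are reduced to a name-to-count map, and the assumption that a normal heartbeat carries only changed asks is made for illustration. It shows the three points from the description: the sequence number goes back to 0, the full outstanding request is sent again, and completions that arrive more than once are filtered.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model for illustration only; not the Hadoop AMRMClient API.
final class ResyncingAllocator {
    private int responseId = 0;
    // Everything the AM still wants, kept so it can be sent again in full.
    private final Map<String, Integer> outstandingAsks = new LinkedHashMap<>();
    // Assumption: a normal heartbeat sends only asks changed since the last one.
    private final Map<String, Integer> changedAsks = new LinkedHashMap<>();
    private final Set<String> seenCompletions = new HashSet<>();

    void ask(String resourceName, int containers) {
        outstandingAsks.put(resourceName, containers);
        changedAsks.put(resourceName, containers);
    }

    // Normal allocate call: send the changes, then clear them.
    Map<String, Integer> nextRequest() {
        Map<String, Integer> request = new LinkedHashMap<>(changedAsks);
        changedAsks.clear();
        return request;
    }

    void onResponse(int newResponseId) {
        responseId = newResponseId;
    }

    // RM asked for a resync: reset the sequence number and resend everything.
    Map<String, Integer> onResync() {
        responseId = 0;
        changedAsks.clear();
        return new LinkedHashMap<>(outstandingAsks);
    }

    // Completions may be reported more than once; return only unseen ones.
    List<String> newCompletions(List<String> reported) {
        List<String> fresh = new ArrayList<>();
        for (String containerId : reported) {
            if (seenCompletions.add(containerId)) {
                fresh.add(containerId);
            }
        }
        return fresh;
    }

    int responseId() {
        return responseId;
    }
}
```

Filtering duplicates on the AM side keeps completion handling idempotent, so a repeated report after resync does not get processed twice.
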



--
This message was sent by Atlassian JIRA
(v6.2#6252)
