Date: Tue, 13 May 2014 11:26:17 +0000 (UTC)
From: "Rohith (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Updated] (YARN-1366) ApplicationMasterService should Resync with the AM upon allocate call after restart

[ https://issues.apache.org/jira/browse/YARN-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rohith updated YARN-1366:
-------------------------
    Attachment: YARN-1366.1.patch

I updated the patch with the following changes in AMRMClient (MapReduce is not considered here):
1. On a resync from the RM, reset lastResponseId and re-register with the RM.
2. Add back the ResourceRequests from the last allocate request.
3. After steps 1 and 2, AMRMClient continues heartbeating.

The patch does not contain tests; I will write tests in the next patches. Please review the initial patch and check whether it satisfies the task expectations.

Work items to be decided:
1. On resync, the ResourceRequests from the last allocate call are added back to ask so that they are sent again on the next heartbeat. My doubt here is: what about older asks that were sent in earlier heartbeats but have not been allocated yet? Earlier requests can be repopulated from remoteRequestTable.
2. Should the MapReduce changes be handled in this JIRA?

The current behaviour of AMs treats RESYNC and SHUTDOWN as the same. It would be very useful if the resync and shutdown commands were issued separately by the ApplicationMasterService.

> ApplicationMasterService should Resync with the AM upon allocate call after restart
> -----------------------------------------------------------------------------------
>
>                 Key: YARN-1366
>                 URL: https://issues.apache.org/jira/browse/YARN-1366
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Bikas Saha
>            Assignee: Rohith
>         Attachments: YARN-1366.1.patch, YARN-1366.patch, YARN-1366.prototype.patch, YARN-1366.prototype.patch
>
>
> The ApplicationMasterService currently sends a resync response to which the AM responds by shutting down. The AM behavior is expected to change to resyncing with the RM. Resync means resetting the allocate RPC sequence number to 0 and the AM should send its entire outstanding request to the RM. Note that if the AM is making its first allocate call to the RM then things should proceed like normal without needing a resync. The RM will return all containers that have completed since the RM last synced with the AM. Some container completions may be reported more than once.

--
This message was sent by Atlassian JIRA (v6.2#6252)
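
For reviewers who want to reason about the resync flow outside Hadoop, here is a minimal, self-contained sketch of the three steps listed above (reset the response id, re-register, re-send requests, then keep heartbeating). It is not the code in YARN-1366.1.patch and uses no Hadoop classes: ToyRm, ToyClient, Command and the outstanding set are hypothetical names made up for this illustration. The sketch assumes the variant raised in work item 1 and in the issue description, where the whole outstanding request set is re-sent rather than only the last allocate request.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class ResyncSketch {
    /** Commands the toy RM can return from allocate(). */
    enum Command { NONE, RESYNC }

    /** Hypothetical stand-in for the RM; not a Hadoop class. */
    static class ToyRm {
        private boolean registered = false;
        final List<String> receivedAsks = new ArrayList<>();

        void register() {
            registered = true;
        }

        /** Simulates an RM restart: registration and received asks are lost. */
        void restart() {
            registered = false;
            receivedAsks.clear();
        }

        /** responseId is accepted for shape only; this toy RM ignores it. */
        Command allocate(int responseId, List<String> asks) {
            if (!registered) {
                return Command.RESYNC; // the restarted RM does not know this AM
            }
            receivedAsks.addAll(asks);
            return Command.NONE;
        }
    }

    /** Hypothetical stand-in for the client side; not AMRMClient itself. */
    static class ToyClient {
        private final ToyRm rm;
        int lastResponseId = 0;
        // Assumption: every request asked for and not yet satisfied.
        final Set<String> outstanding = new LinkedHashSet<>();
        // Requests to send on the next heartbeat.
        final List<String> ask = new ArrayList<>();

        ToyClient(ToyRm rm) {
            this.rm = rm;
            rm.register();
        }

        void addRequest(String request) {
            outstanding.add(request);
            ask.add(request);
        }

        /** One heartbeat; returns the command the RM answered with. */
        Command heartbeat() {
            Command command = rm.allocate(lastResponseId, new ArrayList<>(ask));
            if (command == Command.RESYNC) {
                lastResponseId = 0;      // step 1: reset the sequence number
                rm.register();           // step 1: re-register with the RM
                ask.clear();
                ask.addAll(outstanding); // step 2: queue outstanding requests again
                return command;          // step 3: the next heartbeat proceeds normally
            }
            ask.clear();
            lastResponseId++;
            return command;
        }
    }

    public static void main(String[] args) {
        ToyRm rm = new ToyRm();
        ToyClient client = new ToyClient(rm);
        client.addRequest("container-A");
        client.heartbeat();               // A reaches the RM
        client.addRequest("container-B");
        rm.restart();                     // RM loses all state
        client.heartbeat();               // RM answers RESYNC; client re-registers
        client.heartbeat();               // A and B are sent again
        System.out.println(rm.receivedAsks); // prints [container-A, container-B]
    }
}
```

Re-sending from a single outstanding set is one possible answer to work item 1: an ask that was sent in an earlier heartbeat but never allocated (container-A here) is not lost across the RM restart. Whether the real client should rebuild this from remoteRequestTable is still the open question.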