Date: Fri, 10 Jan 2014 18:34:58 +0000 (UTC)
From: "Xuan Gong (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-1410) Handle client failover during 2 step client API's like app submission

    [ https://issues.apache.org/jira/browse/YARN-1410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13868096#comment-13868096 ]

Xuan Gong commented on YARN-1410:
---------------------------------

One way to handle the last corner case:
* This applies only when HA is enabled.
* When the RM assigns a new ApplicationId, it also creates a unique ID (using the UUID class).
* When YarnClient calls createApplication(), it creates the ApplicationSubmissionContext and sets both the ApplicationId and the UniqueID on it.
* In ClientRMService#submitApplication, if the submitted ApplicationId matches one we already know, we compare the UniqueIDs. If the UniqueIDs also match, we treat it as a resubmission of the same application, most likely caused by failover (the last corner case). Otherwise, we reject it.

This approach requires changing protocol buffer objects such as ASC (ApplicationSubmissionContext) and GetNewApplicationResponse.

> Handle client failover during 2 step client API's like app submission
> ---------------------------------------------------------------------
>
>                 Key: YARN-1410
>                 URL: https://issues.apache.org/jira/browse/YARN-1410
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Bikas Saha
>            Assignee: Xuan Gong
>         Attachments: YARN-1410-outline.patch, YARN-1410.1.patch
>
>   Original Estimate: 48h
>  Remaining Estimate: 48h
>
> App submission involves
> 1) creating appId
> 2) using that appId to submit an ApplicationSubmissionContext to the user.
> The client may have obtained an appId from an RM, the RM may have failed over, and the client may submit the app to the new RM.
> Since the new RM has a different notion of cluster timestamp (used to create app id) the new RM may reject the app submission resulting in unexpected failure on the client side.
> The same may happen for other 2 step client API operations.
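
To make the proposed check in ClientRMService#submitApplication concrete, here is a minimal sketch in Java. It is only an illustration of the idea in the comment, not the patch. The class SubmissionDedupSketch, the Result enum, the in-memory map, and the string application IDs are all hypothetical stand-ins and not Hadoop APIs. It also assumes the ApplicationId-to-UniqueID mapping is visible to the RM after failover, which the comment does not specify.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Hypothetical, simplified model of the proposed submit-time check.
// None of these types are real Hadoop classes.
class SubmissionDedupSketch {

    enum Result { ACCEPTED, DUPLICATE_AFTER_FAILOVER, REJECTED }

    // ApplicationId -> UniqueID recorded with the first accepted submission.
    // Assumption: a real RM would need this mapping to survive failover.
    private final Map<String, UUID> submitted = new HashMap<>();

    Result submit(String applicationId, UUID uniqueId) {
        UUID known = submitted.get(applicationId);
        if (known == null) {
            // First time this ApplicationId is seen: record it and accept.
            submitted.put(applicationId, uniqueId);
            return Result.ACCEPTED;
        }
        if (known.equals(uniqueId)) {
            // Same ApplicationId and same UniqueID: treat it as the client
            // retrying the same submission, e.g. after a failover.
            return Result.DUPLICATE_AFTER_FAILOVER;
        }
        // Same ApplicationId but a different UniqueID: reject.
        return Result.REJECTED;
    }

    public static void main(String[] args) {
        SubmissionDedupSketch rm = new SubmissionDedupSketch();
        UUID uniqueId = UUID.randomUUID();
        String appId = "application_0000000000000_0001"; // made-up ID

        System.out.println(rm.submit(appId, uniqueId));          // first submission
        System.out.println(rm.submit(appId, uniqueId));          // retry of the same submission
        System.out.println(rm.submit(appId, UUID.randomUUID())); // conflicting submission
    }
}
```

Whether a detected duplicate should return success to the client or a distinct status is left open here; the sketch only separates the three cases described in the comment.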