Date: Tue, 29 Aug 2017 19:23:00 +0000 (UTC)
From: "Gour Saha (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-5855) DELETE call sometimes returns success when app is not deleted

    [ https://issues.apache.org/jira/browse/YARN-5855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16145949#comment-16145949 ]

Gour Saha commented on YARN-5855:
---------------------------------

DELETE being synchronous makes sense to me too. The only point to consider is this: since the intent of the app owner is to destroy the app, he/she probably doesn't care whether the app is stopped gracefully or not (as long as log aggregation completes successfully after the app dies). Now, in the worst case, the API can take up to 10 seconds to respond. I think 10 seconds is too high. Do you think we should reduce it? My suggestion is 2 seconds.
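
To make the timeout trade-off concrete, here is a minimal Python sketch of a synchronous stop: send the stop request, wait for a bounded time, then fall back to a forced kill. This is not the actual YARN services code. `stop_app`, `request_stop`, `is_running`, and `force_kill` are hypothetical names (with `force_kill` standing in for something like yarnClient.killRunningApplication), and `timeout_secs` is the value under discussion (10 vs. 2 seconds).

```python
import time
from typing import Callable


def stop_app(
    request_stop: Callable[[], None],
    is_running: Callable[[], bool],
    force_kill: Callable[[], None],
    timeout_secs: float = 2.0,
    poll_secs: float = 0.1,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Ask the app to stop, wait up to timeout_secs, then force-kill it.

    Returns "graceful" if the app stopped on its own within the timeout,
    otherwise "forced". All callables are hypothetical stand-ins.
    """
    request_stop()  # asynchronous: this only sends the stop message
    deadline = clock() + timeout_secs
    while clock() < deadline:
        if not is_running():
            return "graceful"
        sleep(poll_secs)
    # Final check so an app that stopped during the last sleep is not killed.
    if not is_running():
        return "graceful"
    force_kill()
    return "forced"
```

With this shape, the worst-case response time of the call is roughly `timeout_secs` plus the time the forced kill takes, which is why a smaller timeout directly shortens the worst-case DELETE latency.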
> DELETE call sometimes returns success when app is not deleted
> -------------------------------------------------------------
>
>                 Key: YARN-5855
>                 URL: https://issues.apache.org/jira/browse/YARN-5855
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Billie Rinaldi
>            Assignee: Gour Saha
>
> Looking into this issue with [~gsaha], we noticed that multiple things can contribute to an app continuing to run after a DELETE call, which consists of a stop and a destroy operation. One problem is that the stop call is asynchronous unless a force flag is set. Without the force flag, a message is sent to the AM and success is returned; with the flag, yarnClient.killRunningApplication is called. (There is also an option to wait for a fixed amount of time for the app to stop before returning, but DELETE is not setting this option, and force is preferable in this case.) The other issue is that the destroy operation is attempted in a loop, but if the number of retries is exceeded, the call returns a 204 response.
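
The second problem in the issue description (a 204 response even when the destroy retries are exhausted) can be illustrated with a short sketch. This is an assumption-laden outline, not the real REST handler: `delete_status` and `destroy_once` are hypothetical names, and the choice of 500 as the failure status is only one plausible option.

```python
from typing import Callable


def delete_status(destroy_once: Callable[[], bool], max_retries: int = 3) -> int:
    """Try the destroy operation up to max_retries times.

    Returns an HTTP-style status code: 204 only if an attempt actually
    succeeded, and 500 (an assumed choice) if every attempt failed.
    `destroy_once` is a hypothetical callable returning True on success.
    """
    for _ in range(max_retries):
        if destroy_once():
            return 204  # the app was really destroyed
    # Retries exhausted: report failure instead of pretending success.
    return 500
```

The point of the sketch is only that the success code is returned from inside the loop, so falling out of the loop can never be reported as a successful delete.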