Date: Tue, 10 Apr 2018 14:36:00 +0000 (UTC)
From: "Billie Rinaldi (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Updated] (YARN-7984) Delete registry entries from ZK on ServiceClient stop and clean up stop/destroy behavior

     [ https://issues.apache.org/jira/browse/YARN-7984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Billie Rinaldi updated YARN-7984:
---------------------------------
    Description:
The service records written to the registry are removed by ServiceClient on a destroy call, but not on a stop call. The service AM does have some code to clean up the registry entries when component instances are stopped, but if the AM is killed before it has a chance to perform the cleanup, these entries will be left in ZooKeeper. It would be better to clean these up in the stop call, so that RegistryDNS does not provide lookups for containers that don't exist.

Additional stop/destroy behavior improvements include fixing errors / unexpected behavior related to:
* destroying a saved (not launched or started) service
* destroying a stopped service
* destroying a destroyed service
* returning proper exit codes for destroy failures
* performing other client operations on saved services (fixing NPEs)

  was:
The service records written to the registry are removed by ServiceClient on a destroy call, but not on a stop call.
The service AM does have some code to clean up the registry entries when component instances are stopped, but if the AM is killed before it has a chance to perform the cleanup, these entries will be left in ZooKeeper. It would be better to clean these up in the stop call, so that RegistryDNS does not provide lookups for containers that don't exist.

Additional stop/destroy behavior improvements include:
* destroying a saved (not launched or started) service
* destroying a stopped service
* destroying a destroyed service
*


> Delete registry entries from ZK on ServiceClient stop and clean up stop/destroy behavior
> ----------------------------------------------------------------------------------------
>
>                 Key: YARN-7984
>                 URL: https://issues.apache.org/jira/browse/YARN-7984
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn-native-services
>            Reporter: Billie Rinaldi
>            Assignee: Billie Rinaldi
>            Priority: Critical
>         Attachments: YARN-7984.1.patch, YARN-7984.2.patch
>
>
> The service records written to the registry are removed by ServiceClient on a destroy call, but not on a stop call. The service AM does have some code to clean up the registry entries when component instances are stopped, but if the AM is killed before it has a chance to perform the cleanup, these entries will be left in ZooKeeper. It would be better to clean these up in the stop call, so that RegistryDNS does not provide lookups for containers that don't exist.
> Additional stop/destroy behavior improvements include fixing errors / unexpected behavior related to:
> * destroying a saved (not launched or started) service
> * destroying a stopped service
> * destroying a destroyed service
> * returning proper exit codes for destroy failures
> * performing other client operations on saved services (fixing NPEs)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
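
The stop-time cleanup proposed in the description can be illustrated with a small sketch. This is not the YARN-7984 patch and does not use the Hadoop registry API: `InMemoryRegistry`, `ServiceClientSketch`, the path layout, and the `alice`/`web` names are all hypothetical stand-ins, written in Python only to show the intended effect. The point is that when the client deletes the service's registry subtree on stop, no stale records remain for a DNS layer to resolve, even if the AM was killed before it could clean up.

```python
class InMemoryRegistry:
    """Hypothetical stand-in for the ZooKeeper-backed service registry."""

    def __init__(self):
        self._records = {}  # registry path -> record dict

    def bind(self, path, record):
        self._records[path] = record

    def resolve(self, path):
        # Returns None when no record exists at the path.
        return self._records.get(path)

    def delete(self, path):
        """Remove the record at `path` and every record beneath it."""
        prefix = path.rstrip("/") + "/"
        doomed = [k for k in self._records if k == path or k.startswith(prefix)]
        for key in doomed:
            del self._records[key]


class ServiceClientSketch:
    """Hypothetical client; only the stop-time cleanup is modelled."""

    def __init__(self, registry, user):
        self.registry = registry
        self.user = user

    def _service_path(self, name):
        # Illustrative path layout, not the real registry layout.
        return f"/users/{self.user}/services/{name}"

    def stop(self, name):
        # Assumed behavior: the client removes the service's records on stop
        # instead of relying on the AM, which may already have been killed.
        self.registry.delete(self._service_path(name))
        return 0  # exit code: success


registry = InMemoryRegistry()
client = ServiceClientSketch(registry, user="alice")

# Records the AM wrote while the service was running.
registry.bind("/users/alice/services/web", {"type": "service"})
registry.bind("/users/alice/services/web/components/web-0", {"ip": "10.0.0.5"})
# An unrelated service with a similar name must survive the cleanup.
registry.bind("/users/alice/services/webapp", {"type": "service"})

# Simulate: the AM is killed without cleaning up, then the user stops the service.
exit_code = client.stop("web")
```

Matching on the path plus a trailing `/` rather than on a bare string prefix keeps the cleanup from removing sibling services whose names merely start with the same characters.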