From: "Karthik Kambatla (JIRA)"
To: yarn-issues@hadoop.apache.org
Date: Mon, 5 May 2014 23:50:20 +0000 (UTC)
Subject: [jira] [Commented] (YARN-2019) Retrospect on decision of making RM crashed if any exception throw in ZKRMStateStore

    [ https://issues.apache.org/jira/browse/YARN-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13990108#comment-13990108 ]

Karthik Kambatla commented on YARN-2019:
----------------------------------------

[~djp] - any particular ideas on how this should behave?

> Retrospect on decision of making RM crashed if any exception throw in ZKRMStateStore
> ------------------------------------------------------------------------------------
>
>                 Key: YARN-2019
>                 URL: https://issues.apache.org/jira/browse/YARN-2019
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Junping Du
>            Priority: Critical
>              Labels: ha
>         Attachments: YARN-2019.1-wip.patch
>
>
> Currently, if anything abnormal happens in ZKRMStateStore, it throws a fatal exception that crashes the RM. As shown in YARN-1924, the cause can be an internal bug in RM HA itself rather than a genuinely fatal condition. We should revisit this decision, since the HA feature is designed to protect a key component, not to disrupt it.

--
This message was sent by Atlassian JIRA
(v6.2#6252)