Date: Mon, 9 Feb 2015 20:07:36 +0000 (UTC)
From: "Hitesh Shah (JIRA)"
To: issues@tez.apache.org
Reply-To: dev@tez.apache.org
Subject: [jira] [Commented] (TEZ-2064) SessionNotRunning Exception not thrown is all cases

    [ https://issues.apache.org/jira/browse/TEZ-2064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312773#comment-14312773 ]

Hitesh Shah commented on TEZ-2064:
----------------------------------

+1 for both (1) and (2). I am not sure about (3): it adds overhead to every submission to guard against a case that should rarely occur, and the failure will eventually be caught anyway when the actual submission takes place.

> SessionNotRunning Exception not thrown is all cases
> ---------------------------------------------------
>
>                 Key: TEZ-2064
>                 URL: https://issues.apache.org/jira/browse/TEZ-2064
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>            Priority: Critical
>
> Hive handles SessionNotRunning during submitDAG() and restarts the tez-session
> if it receives one. In YHIVE-15, we did not receive that and the query failed.
> In some scenarios the application will fall out of the RM's knowledge and an
> ApplicationNotFound exception is received instead.
> Here are my asks.
> 1. TezClient.submitDAG()/stop() should throw a SessionNotRunning exception if
> the application has expired. Basically, any API which currently throws
> SessionNotRunning should handle the app-not-found scenario.
> 2. It would help if TezClient.getAppMasterStatus() could return
> TezAppMasterStatus.SHUTDOWN if the tez-session application does not exist in
> the RM. That way, as a precaution, applications could check before submitting
> DAGs.
> 3. I think it might be better if verifySessionStateForSubmission() checks the
> app status every time instead of checking sessionStarted. I am not sure about
> side effects, but will leave that to your decision.
> If 3 takes time, we can pursue that later. It would really help to get 1 & 2 in
> the next tez release, especially for busy grids.
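
For illustration only, asks (1) and (2) from the issue description could be sketched roughly as below. Everything in this block is a hypothetical stand-in written for this example: SessionSketch, Client, ResourceManager, AppMasterStatus, SessionNotRunningException and AppNotFoundException are not the real Tez or YARN classes, and the submit path is simplified to a single lookup against a fake RM. The point is only the translation: an "application not found" answer from the RM surfaces as SHUTDOWN from the status call and as a session-not-running error from submission, so a caller that already restarts on that error keeps working.

```java
/** Hypothetical stand-ins only; none of these are the real Tez or YARN classes. */
final class SessionSketch {

    /** Stand-in for the status enum referred to in ask (2). */
    enum AppMasterStatus { INITIALIZING, READY, RUNNING, SHUTDOWN }

    /** Stand-in for the exception the caller already knows how to handle. */
    static final class SessionNotRunningException extends Exception {
        SessionNotRunningException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Stand-in for the RM's "application not found" error. */
    static final class AppNotFoundException extends Exception {
        AppNotFoundException(String appId) {
            super("Unknown application: " + appId);
        }
    }

    /** Minimal view of the RM: report a status, or fail if the app is unknown. */
    interface ResourceManager {
        AppMasterStatus statusOf(String appId) throws AppNotFoundException;
    }

    static final class Client {
        private final ResourceManager rm;
        private final String appId;

        Client(ResourceManager rm, String appId) {
            this.rm = rm;
            this.appId = appId;
        }

        /** Ask (2): an application the RM no longer knows is reported as SHUTDOWN. */
        AppMasterStatus getAppMasterStatus() {
            try {
                return rm.statusOf(appId);
            } catch (AppNotFoundException e) {
                return AppMasterStatus.SHUTDOWN;
            }
        }

        /** Ask (1): app-not-found is translated into the session-not-running error. */
        String submitDag(String dagName) throws SessionNotRunningException {
            try {
                if (rm.statusOf(appId) == AppMasterStatus.SHUTDOWN) {
                    throw new SessionNotRunningException("Session " + appId + " is shut down", null);
                }
                return dagName + " submitted to " + appId;
            } catch (AppNotFoundException e) {
                throw new SessionNotRunningException(
                        "Application " + appId + " is no longer known to the RM", e);
            }
        }
    }

    public static void main(String[] args) {
        // Fake RM: one application has expired, every other one is READY.
        ResourceManager rm = appId -> {
            if (appId.equals("app_expired")) {
                throw new AppNotFoundException(appId);
            }
            return AppMasterStatus.READY;
        };

        Client client = new Client(rm, "app_expired");
        System.out.println("status: " + client.getAppMasterStatus());
        try {
            System.out.println(client.submitDag("dag1"));
        } catch (SessionNotRunningException e) {
            // Caller-side recovery: start a fresh session and resubmit.
            client = new Client(rm, "app_new");
            try {
                System.out.println(client.submitDag("dag1"));
            } catch (SessionNotRunningException again) {
                System.out.println("resubmission failed: " + again.getMessage());
            }
        }
    }
}
```

Ask (3) is deliberately left out of the sketch. Checking the application status on every submission is the part whose overhead was questioned in the comment above.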