Date: Fri, 11 Dec 2015 23:22:46 +0000 (UTC)
From: "Prasanth Jayachandran (JIRA)"
To: dev@hive.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Created] (HIVE-12659) LLAP should detect all nodes down state and stop issuing queries

Prasanth Jayachandran created HIVE-12659:
--------------------------------------------

             Summary: LLAP should detect all nodes down state and stop issuing queries
                 Key: HIVE-12659
                 URL: https://issues.apache.org/jira/browse/HIVE-12659
             Project: Hive
          Issue Type: Bug
          Components: llap
    Affects Versions: 2.0.0, 2.1.0
            Reporter: Prasanth Jayachandran

I ran a simple query with a single task in LLAP while the LLAP daemon was down for some reason (an all-nodes-down scenario). Instead of failing, the task was submitted to the daemon repeatedly and killed by the Tez AM each time, with no apparent limit. The single task was killed more than 20 times before I had to abort the query with Ctrl+C.

We need to detect the all-nodes-down scenario (using ZooKeeper?), notify the client of it, and fail early.
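
To make the requested behavior concrete, here is a minimal sketch of a fail-fast check, not Hive's or Tez's actual scheduler code. Every name in it (`FailFastSubmitter`, `AllNodesDownException`, the `liveNodes` supplier, `tryRunOn`) is hypothetical. The supplier stands in for whatever registry tracks running daemons (the report suggests ZooKeeper, but no ZooKeeper API is used here), and the retry bound is an illustrative assumption.

```java
import java.util.Set;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical error raised when no daemon is registered, so the client can fail early. */
class AllNodesDownException extends RuntimeException {
    AllNodesDownException(String message) {
        super(message);
    }
}

/** Hypothetical submitter: checks daemon liveness before every attempt and bounds retries. */
class FailFastSubmitter {
    // Assumption: returns the currently registered daemons (e.g. backed by a registry).
    private final Supplier<Set<String>> liveNodes;
    private final int maxAttempts;

    FailFastSubmitter(Supplier<Set<String>> liveNodes, int maxAttempts) {
        this.liveNodes = liveNodes;
        this.maxAttempts = maxAttempts;
    }

    /**
     * Tries to run the task on a live node. Returns the node that accepted it.
     * Throws AllNodesDownException as soon as the registry reports no live nodes,
     * instead of resubmitting the task again and again.
     */
    String submit(String taskId, Predicate<String> tryRunOn) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Set<String> nodes = liveNodes.get();
            if (nodes.isEmpty()) {
                throw new AllNodesDownException(
                    "No live LLAP daemons; failing task " + taskId + " on attempt " + attempt);
            }
            for (String node : nodes) {
                if (tryRunOn.test(node)) {
                    return node; // task accepted by this node
                }
            }
        }
        throw new IllegalStateException(
            "Task " + taskId + " was not accepted after " + maxAttempts + " attempts");
    }
}
```

With an empty registry, `submit` throws on the first attempt, so the client would see one clear error rather than 20 or more killed task attempts. Where such a check would actually belong (client, Tez AM, or LLAP task scheduler) is left open, as it is in the report.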