Date: Tue, 23 Aug 2016 06:17:21 +0000 (UTC)
From: "Siddharth Seth (JIRA)"
To: issues@hive.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Commented] (HIVE-14608) LLAP: ZK registry doesn't remove nodes on kill

    [ https://issues.apache.org/jira/browse/HIVE-14608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15432213#comment-15432213 ]

Siddharth Seth commented on HIVE-14608:
---------------------------------------

New tasks should not be scheduled on them - because scheduling is based off of the
activeInstanceSet. Existing tasks will eventually time out after communication failures. Actively disabling the node still needs to be done; it is a simple code change, but it needs testing. I need to get to writing an in-process, controllable LLAP test setup.

{code}
getContext().nodesUpdate(List)
{code}

> LLAP: ZK registry doesn't remove nodes on kill
> -----------------------------------------------
>
>                 Key: HIVE-14608
>                 URL: https://issues.apache.org/jira/browse/HIVE-14608
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Prasanth Jayachandran
>
> ...and presumably doesn't disable them for scheduling. I haven't looked in detail though, I just see some harmless killed tasks in queries after I kill some LLAP nodes manually between queries.
> {noformat}
>   public void workerNodeRemoved(ServiceInstance serviceInstance) {
>     // FIXME: disabling this for now
>     // instanceToNodeMap.remove(serviceInstance.getWorkerIdentity());
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
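
For illustration only, the behaviour discussed in the comment can be sketched in plain Java: new work goes only to nodes that are still tracked, and a removal callback drops the node and reports the tasks that were running on it, so they do not have to wait for a communication timeout. This is a rough sketch under stated assumptions, not Hive's scheduler code. The class `NodeTrackerSketch`, the nested `NodeInfo` type, the string worker identities and the `schedule` method are all hypothetical; only the names `instanceToNodeMap` and `workerNodeRemoved` are borrowed from the snippet quoted in the issue.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for a scheduler's view of worker nodes; not Hive code. */
class NodeTrackerSketch {

    /** Hypothetical per-node state: the tasks currently assigned to the node. */
    static final class NodeInfo {
        final List<String> runningTasks = new ArrayList<>();
    }

    // Plays the role of instanceToNodeMap: worker identity -> node state.
    // Insertion order is kept so the sketch behaves deterministically.
    private final Map<String, NodeInfo> instanceToNodeMap = new LinkedHashMap<>();

    /** Starts tracking a worker; a repeated add for the same identity is ignored. */
    public synchronized void workerNodeAdded(String workerIdentity) {
        instanceToNodeMap.putIfAbsent(workerIdentity, new NodeInfo());
    }

    /** Assigns the task to the first tracked node, or returns empty if none is tracked. */
    public synchronized Optional<String> schedule(String taskId) {
        if (instanceToNodeMap.isEmpty()) {
            return Optional.empty();
        }
        Map.Entry<String, NodeInfo> first = instanceToNodeMap.entrySet().iterator().next();
        first.getValue().runningTasks.add(taskId);
        return Optional.of(first.getKey());
    }

    /**
     * Stops tracking the worker and returns the tasks that were assigned to it,
     * so a caller could fail or reschedule them right away.
     */
    public synchronized List<String> workerNodeRemoved(String workerIdentity) {
        NodeInfo removed = instanceToNodeMap.remove(workerIdentity);
        if (removed == null) {
            return Collections.emptyList();
        }
        return new ArrayList<>(removed.runningTasks);
    }

    public static void main(String[] args) {
        NodeTrackerSketch tracker = new NodeTrackerSketch();
        tracker.workerNodeAdded("llap-1");
        tracker.workerNodeAdded("llap-2");
        System.out.println("scheduled on: " + tracker.schedule("task-a"));
        System.out.println("orphaned tasks: " + tracker.workerNodeRemoved("llap-1"));
        System.out.println("scheduled on: " + tracker.schedule("task-b"));
    }
}
```

How the orphaned tasks are then handled (failed, retried, or reported through a call such as the `nodesUpdate` one mentioned above) is outside the scope of this sketch.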