Date: Thu, 15 Aug 2019 18:49:00 +0000 (UTC)
From: "Oliver Draese (JIRA)"
To: issues@hive.apache.org
Reply-To: dev@hive.apache.org
Subject: [jira] [Commented] (HIVE-22113) Prevent LLAP shutdown on AMReporter related RuntimeException

    [ https://issues.apache.org/jira/browse/HIVE-22113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908405#comment-16908405 ]

Oliver Draese commented on HIVE-22113:
--------------------------------------

Created follow-up cleanup action as HIVE-22117

> Prevent LLAP shutdown on AMReporter related RuntimeException
> ------------------------------------------------------------
>
>                 Key: HIVE-22113
>                 URL: https://issues.apache.org/jira/browse/HIVE-22113
>             Project: Hive
>          Issue Type: Bug
>          Components: llap
>    Affects Versions: 3.1.1
>            Reporter: Oliver Draese
>            Assignee: Oliver Draese
>            Priority: Major
>              Labels: llap
>         Attachments: HIVE-22113.1.patch, HIVE-22113.2.patch, HIVE-22113.patch
>
>
> If a task attempt cannot be removed from AMReporter (i.e. task attempt was not found), the AMReporter throws a RuntimeException. This exception is not caught and trickles up, causing an LLAP shutdown:
> {{2019-08-08T23:34:39,748 ERROR [Wait-Queue-Scheduler-0 ()] org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon: Thread Thread[Wait-Queue-Scheduler-0,5,main] threw an Exception. Shutting down now...}}
> {{java.lang.RuntimeException: attempt_1563528877295_18872_3728_01_000003_0 was not registered and couldn't be removed}}
> {{    at org.apache.hadoop.hive.llap.daemon.impl.AMReporter$AMNodeInfo.removeTaskAttempt(AMReporter.java:524) ~[hive-llap-server-3.1.0.3.1.0.103-1.jar:3.1.0.3.1.0.103-1]}}
> {{    at org.apache.hadoop.hive.llap.daemon.impl.AMReporter.unregisterTask(AMReporter.java:243) ~[hive-llap-server-3.1.0.3.1.0.103-1.jar:3.1.0.3.1.0.103-1]}}
> {{    at org.apache.hadoop.hive.llap.daemon.impl.TaskRunnerCallable.killTask(TaskRunnerCallable.java:384) ~[hive-llap-server-3.1.0.3.1.0.103-1.jar:3.1.0.3.1.0.103-1]}}
> {{    at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService.handleScheduleAttemptedRejection(TaskExecutorService.java:739) ~[hive-llap-server-3.1.0.3.1.0.103-1.jar:3.1.0.3.1.0.103-1]}}
> {{    at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService.access$1100(TaskExecutorService.java:91) ~[hive-llap-server-3.1.0.3.1.0.103-1.jar:3.1.0.3.1.0.103-1]}}
> {{    at org.apache.hadoop.hive.llap.daemon.impl.TaskExecutorService$WaitQueueWorker.run(TaskExecutorService.java:396) ~[hive-llap-server-3.1.0.3.1.0.103-1.jar:3.1.0.3.1.0.103-1]}}
> {{    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_161]}}
> {{    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108) [hive-exec-3.1.0.3.1.0.103-1.jar:3.1.0-SNAPSHOT]}}
> {{    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41) [hive-exec-3.1.0.3.1.0.103-1.jar:3.1.0-SNAPSHOT]}}
> {{    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77) [hive-exec-3.1.0.3.1.0.103-1.jar:3.1.0-SNAPSHOT]}}
> {{    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_161]}}
> {{    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_161]}}
> {{    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]}}

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)