From: Pierre Villard
Date: Tue, 03 Apr 2018 04:10:25 +0000
Subject: Re: Killing 'Stuck' Processors without restarting NiFi
To: users@nifi.apache.org

On my phone right now, so I can't provide much detail, but there will be a way to terminate processors in the next version: https://issues.apache.org/jira/browse/NIFI-4895

Pierre

On Tue, Apr 3, 2018 at 03:03, Joseph Niemiec <josephxsxn@gmail.com> wrote:
Yes, that's correct. Is there a way to identify the thread pool name of the stuck ZK process with the dump command? Then it would be possible to use the Java Attach API to find and stop the thread with an agent.

https://docs.oracle.com/javase/8/docs/jdk/api/attach/spec/com/sun/tools/attach/VirtualMachine.html

Or a REST API call to interrupt internal threads safely?
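
As a rough illustration of the agent idea above, here is a minimal sketch of the logic that would have to run inside the NiFi JVM, for example from the agentmain method of an agent loaded through the Attach API. The class and method names (StuckThreadFinder, findByStackFrame, interruptAll) are hypothetical and not part of NiFi, and matching on the string "ConsumeKafka" is only an assumption about what the stuck threads' stack frames contain. It uses only standard java.lang.Thread calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of NiFi: locates threads by what they are executing.
final class StuckThreadFinder {

    /** Returns live threads whose current stack has a frame in a class whose name contains the fragment. */
    static List<Thread> findByStackFrame(String classNameFragment) {
        List<Thread> matches = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            for (StackTraceElement frame : entry.getValue()) {
                if (frame.getClassName().contains(classNameFragment)) {
                    matches.add(entry.getKey());
                    break; // one matching frame is enough for this thread
                }
            }
        }
        return matches;
    }

    /** Sends an interrupt to each thread and returns how many were signalled. */
    static int interruptAll(List<Thread> threads) {
        for (Thread t : threads) {
            t.interrupt();
        }
        return threads.size();
    }

    public static void main(String[] args) {
        // "ConsumeKafka" is an assumed default; pass another fragment as the first argument.
        String fragment = args.length > 0 ? args[0] : "ConsumeKafka";
        for (Thread t : findByStackFrame(fragment)) {
            System.out.println(t.getName() + " [" + t.getState() + "]");
        }
    }
}
```

Note that Thread.interrupt() only wakes a thread that is blocked in an interruptible call (sleep, wait, most java.util.concurrent waits). A thread stuck in non-interruptible I/O would ignore it, so this is not a guaranteed hard reset.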

Sent from cellphone.

On Mon, Apr 2, 2018, 5:52 PM Jeremy Dyer <jdye64@gmail.com> wrote:
Hey Joseph,

I don't have a sure-fire fix, but I'm willing to bet this is the same issue we all experience with any ZooKeeper-based system, Phoenix for example: the real problem is that the JVM hangs while trying to communicate with ZooKeeper, rather than anything in the actual underlying system.

Is your Kafka cluster using ZooKeeper or not?

Sent from my iPhone

> On Apr 2, 2018, at 5:31 PM, Joseph Niemiec <josephxsxn@gmail.com> wrote:
>
> Hi all,
>
> We have some Kafka processors that are getting stuck with 2 threads always active; even after they are shut down, we can wait hours and the threads never stop. I have seen this behavior before with HDFS processors, but only on secure Kerberos clusters. This cluster is not secured at all.
>
> Kafka 0.10
> NiFi 1.5.0 (Apache)
>
>
> I know you can run nifi.sh dump to get thread info, but that's not really helping us manage the problem. If there were a hard-reset button that didn't involve restarting the entire JVM, that would be great... It takes a while for our NiFi instances to restart at times, and we would rather not stop everything for a single bad processor...
>
> Any tips/recommendations on how we can identify what's really making this ConsumeKafka processor get stuck?
>
> Thanks!
>
> --
> Joseph
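
On the question of identifying what a stuck thread is actually waiting for, one possible approach is to ask the JVM's ThreadMXBean rather than reading a raw dump. This is only a sketch: the class BlockedThreadReport and its describe method are hypothetical, the code has to run inside the JVM being inspected (or be adapted to a remote JMX connection), and which thread-name fragment identifies the processor's threads is an assumption to be checked against an actual nifi.sh dump.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of NiFi: summarizes what matching threads are blocked on.
final class BlockedThreadReport {

    /** One line per thread whose name contains the fragment: state, lock waited on, lock owner, top frame. */
    static String describe(String nameFragment) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: skip locked-monitor and locked-synchronizer details to keep the dump cheap.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info == null || !info.getThreadName().contains(nameFragment)) {
                continue;
            }
            out.append(info.getThreadName())
               .append(" state=").append(info.getThreadState())
               .append(" waitingOn=").append(info.getLockName())       // null when not waiting on a lock
               .append(" heldBy=").append(info.getLockOwnerName());    // null when no thread owns it
            StackTraceElement[] stack = info.getStackTrace();
            if (stack.length > 0) {
                out.append(" at ").append(stack[0]);
            }
            out.append(System.lineSeparator());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The thread-name fragment is supplied by the caller; an empty fragment matches every thread.
        System.out.print(describe(args.length > 0 ? args[0] : ""));
    }
}
```

A thread that stays in the same state with the same top frame across several reports taken minutes apart is a reasonable candidate for the one that is stuck.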