From: Pierre Villard
Date: Thu, 26 Jan 2017 15:42:08 +0100
Subject: Re: Upgrade from 1.0.0 to 1.1.1, cluster config. under heavy load, nodes do not connect
To: dev@nifi.apache.org

In addition to Mark's comment, you could start NiFi with
nifi.flowcontroller.autoResumeState=false so that the flow is stopped when
the cluster starts. Once the cluster is healthy, you can start the flow
manually (this can also be done via the REST API if needed; see the sketch
below the quoted thread).

2017-01-26 15:16 GMT+01:00 Mark Payne :

> Ben,
>
> NiFi provides an embedded ZooKeeper server for convenience, mostly for
> 'testing and evaluation' purposes. For any sort of production or very
> high-volume flow, I would strongly encourage you to move ZooKeeper to its
> own servers. You will certainly see a lot of problems when trying to
> interact with ZooKeeper if the box that ZooKeeper is running on is under
> heavy load, either CPU-wise or I/O-wise.
>
> Thanks
> -Mark
>
> > On Jan 26, 2017, at 7:26 AM, bmichaud wrote:
> >
> > On Monday, I stood up a three-server NiFi cluster with the same
> > configuration that ran successfully on 1.0.0. Before I started the
> > cluster, I cleaned out all ZooKeeper state and data from the old
> > cluster, but kept the same flow intact, connected to Kafka to pull data
> > from a topic. This was a performance environment, and there was heavy
> > load on that Kafka topic, so the cluster was immediately busy.
> >
> > My strong belief is that, because of the volume of data the flow needed
> > to process during the election, a cluster coordinator was never
> > elected; to this day, each node remains disconnected from the others,
> > although they are all running independently.
> >
> > Could this be a defect in NiFi or ZooKeeper? What would you suggest I
> > do to resolve this issue?
> >
> > All servers in the cluster are configured as follows:
> >
> > nifi.properties:
> > nifi.state.management.embedded.zookeeper.start=true
> > nifi.cluster.is.node=true
> > nifi.cluster.node.address=server1
> > nifi.zookeeper.connect.string=server1:2181,server2:2181,server3:2181
> >
> > zookeeper.properties:
> > server.1=server1:2888:3888
> > server.2=server2:2888:3888
> > server.3=server3:2888:3888
> >
> > state-management.xml:
> > <cluster-provider>
> >     <id>zk-provider</id>
> >     <class>org.apache.nifi.controller.state.providers.zookeeper.ZooKeeperStateProvider</class>
> >     <property name="Connect String">server1:2181,server2:2181,server3:2181</property>
> >     <property name="Root Node">/nifi</property>
> >     <property name="Session Timeout">10 seconds</property>
> >     <property name="Access Control">Open</property>
> > </cluster-provider>
> >
> > Please let me know if you need additional information.
> >
> > --
> > View this message in context:
> > http://apache-nifi-developer-list.39713.n7.nabble.com/Upgrade-from-1-0-0-to-1-1-1-cluster-config-under-heavy-load-nodes-do-not-connect-tp14523.html
> > Sent from the Apache NiFi Developer List mailing list archive at
> > Nabble.com.
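
For anyone following along, here is a minimal sketch of starting the flow
over the REST API once the cluster is healthy. It assumes the NiFi 1.x
endpoints; the host, port, and the <root-group-id> placeholder are
examples and would need to be replaced with values from your environment:

    # Look up the root process group to obtain its id (example host/port)
    curl -s http://server1:8080/nifi-api/flow/process-groups/root

    # Schedule every component in that group to RUNNING, using the id
    # returned by the call above in both the URL and the request body
    curl -s -X PUT \
         -H 'Content-Type: application/json' \
         -d '{"id": "<root-group-id>", "state": "RUNNING"}' \
         http://server1:8080/nifi-api/flow/process-groups/<root-group-id>

Since the UI itself drives the same REST API, this is a reasonable hook for
automation once all nodes report as connected.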
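
And to make Mark's suggestion concrete, below is a minimal zoo.cfg sketch
for running the same three servers as a standalone ZooKeeper ensemble. The
timing values are the common defaults and dataDir is just an example path,
so adjust for your environment; you would also set
nifi.state.management.embedded.zookeeper.start=false in nifi.properties so
NiFi no longer launches its embedded server:

    # zoo.cfg, identical on server1, server2, and server3
    # (tickTime is the base time unit in ms; initLimit/syncLimit are
    # measured in ticks)
    tickTime=2000
    initLimit=10
    syncLimit=5
    dataDir=/var/lib/zookeeper
    clientPort=2181
    server.1=server1:2888:3888
    server.2=server2:2888:3888
    server.3=server3:2888:3888

Each server additionally needs a myid file in dataDir containing only its
own id:

    echo 1 > /var/lib/zookeeper/myid    # use 2 on server2, 3 on server3

Keeping the ensemble on dedicated machines means a busy flow can no longer
starve ZooKeeper of CPU or disk, which is exactly the failure mode Mark
describes.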