Reply-To: user@zookeeper.apache.org
From: Edward Ribeiro
Date: Thu, 13 Oct 2016 17:12:17 -0300
Subject: Re: outstandingChanges queue grows without bound
To:
UserZooKeeper

Very interesting patch, Mike. I've left a couple of review comments (hope
you don't mind) in the
https://github.com/msolo/zookeeper/commit/75da352d506c2e3b0001d28acc058c422b3c8f0c
commit. :)

Cheers,
Eddie

On Thu, Oct 13, 2016 at 4:06 PM, Arshad Mohammad <arshad.mohammad.k@gmail.com> wrote:

> Hi Mike,
> I also faced the same issue. There is a test patch in ZOOKEEPER-2570 which
> can be used to quickly check the performance gain from each modification.
> Hope it is useful.
>
> -Arshad
>
> On Thu, Oct 13, 2016 at 1:27 AM, Mike Solomon wrote:
>
> > I've been performance testing 3.5.2 and hit an interesting unavailability
> > issue.
> >
> > When the server is very busy (64k connections, 16k writes per
> > second) the leader can get busy enough that connections get throttled.
> > Enough throttling causes sessions to expire. As sessions expire, the
> > CPU consumption rises and the quorum is effectively unavailable.
> > Interestingly, if you shut down all the clients, the quorum won't heal
> > for nearly 10 minutes.
> >
> > The issue is that the outstandingChanges queue has 250k items in it
> > and the closeSession code scans this linearly under a lock. Replacing
> > the linear scan with a hash table lookup improves this, but likely the
> > real solution is some backpressure on clients as a result of an
> > oversized outstandingChanges queue.
> >
> > Here is a sample fix:
> > https://github.com/msolo/zookeeper/commit/75da352d506c2e3b0001d28acc058c422b3c8f0c
> >
> > This results in the quorum healing about 30 seconds after the clients
> > disconnect.
> >
> > Is there a way to prevent runaway growth in this queue? I'm wondering
> > if changing the definition of "throttling" to take into account the
> > size of this queue might help mitigate this. The end goal is that some
> > stable amount of traffic is reached asymptotically without suffering a
> > collapse.
> >
> > Thanks,
> > -Mike
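
For readers following the thread, the two ideas in Mike's message (a hash lookup instead of a linear scan at session close, and a queue-size check that could feed throttling) can be sketched roughly as below. This is only an illustration under stated assumptions, not the linked patch and not ZooKeeper's code: PendingChanges, Change, pathsFor, overHighWaterMark, and the high-water mark value are all hypothetical names introduced here.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for a pending-change queue; not ZooKeeper's actual class. */
final class PendingChanges {
    /** One queued change: a znode path and the session that owns it. */
    static final class Change {
        final String path;
        final long ownerSessionId;

        Change(String path, long ownerSessionId) {
            this.path = path;
            this.ownerSessionId = ownerSessionId;
        }
    }

    // FIFO of pending changes, oldest first.
    private final Deque<Change> queue = new ArrayDeque<>();
    // Secondary index: session id -> (path -> number of queued changes for it).
    private final Map<Long, Map<String, Integer>> bySession = new HashMap<>();
    // Assumed tuning knob: queue size above which callers should be throttled.
    private final int highWaterMark;

    PendingChanges(int highWaterMark) {
        this.highWaterMark = highWaterMark;
    }

    /** Appends a change and records it in the per-session index. */
    synchronized void add(Change c) {
        queue.addLast(c);
        bySession.computeIfAbsent(c.ownerSessionId, id -> new HashMap<>())
                 .merge(c.path, 1, Integer::sum);
    }

    /** Removes the oldest change and keeps the index in step with the queue. */
    synchronized Change poll() {
        Change c = queue.pollFirst();
        if (c != null) {
            Map<String, Integer> counts = bySession.get(c.ownerSessionId);
            if (counts != null) {
                counts.computeIfPresent(c.path,
                        (p, n) -> n > 1 ? Integer.valueOf(n - 1) : null);
                if (counts.isEmpty()) {
                    bySession.remove(c.ownerSessionId);
                }
            }
        }
        return c;
    }

    /** Paths queued for one session: a hash lookup, not a scan of the whole queue. */
    synchronized Set<String> pathsFor(long sessionId) {
        Map<String, Integer> counts = bySession.get(sessionId);
        if (counts == null) {
            return Collections.emptySet();
        }
        return new HashSet<>(counts.keySet());
    }

    /** True when the queue is larger than the assumed limit; a possible throttling signal. */
    synchronized boolean overHighWaterMark() {
        return queue.size() > highWaterMark;
    }
}
```

With this shape, closing a session costs time proportional to that session's own queued changes rather than to the full queue, and a request path could consult overHighWaterMark() before accepting more writes. Whether that is the right throttling signal for a real quorum is exactly the open question in the thread.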