From: Dan Benediktson
Date: Wed, 22 Feb 2017 09:40:24 -0800
To: user@zookeeper.apache.org
Subject: Re: etcd performance comparison

Performance benchmarking is a very hard problem, so let's keep that in mind
before criticizing this one overmuch, and before going too far in trying to
build our own.
I do agree that the benchmark chosen here is probably not the most useful in
guiding customers to select among their options for coordination databases, so
I like Jordan's suggestion: first define a small number of interesting
benchmarks, based on common use cases for these coordination databases. On the
topic of service discovery, I agree that's probably the #1 use case, so a
benchmark trying to replicate that scenario would likely be the first and most
important one to go after.

To be honest, I would expect all existing ZK releases to perform much worse
than etcd and Consul under any kind of mixed read-and-write workload, and I
think it would help demonstrate the benefits of the patch that recently landed
in trunk, and of any other subsequent performance-oriented patches we might go
after, if we had some ready benchmarks that could clearly demonstrate the
beneficial results of those patches.

Thanks,
Dan

On Wed, Feb 22, 2017 at 6:52 AM, Camille Fournier wrote:
> Even just writing about what objective tests might look like would be a
> good start! I'm happy to read draft posts by anyone who wishes to write on
> the topic.
>
> C
>
> On Wed, Feb 22, 2017 at 9:36 AM, Jordan Zimmerman <
> jordan@jordanzimmerman.com> wrote:
>
> > IMO there is tremendous FUD in the etcd world. It's the new cool toy and
> > ZK feels old. To suggest that ZK does not do Service Discovery is
> > ludicrous. That was one of the very first Curator recipes.
> >
> > It might be useful to counter this trend objectively. I'd be interested
> > in helping. Anyone else? We can create objective tests that compare
> > common use cases.
> >
> > ====================
> > Jordan Zimmerman
> >
> > > On Feb 22, 2017, at 11:21 AM, Camille Fournier wrote:
> > >
> > > I think that my biggest feeling about this blog post (besides its not
> > > disclosing the disk setup clearly) is that ZK is really not designed
> > > to have massive write throughput. I would not traditionally recommend
> > > someone use ZK in that manner. If we think that evolving it to be
> > > useful for such workloads would be good, it could be an interesting
> > > community discussion, but it's really not the purpose of the system
> > > design.
> > >
> > > I'd love to see a more read/write mixed load test for the systems, as
> > > well as a blog post about why you might choose different systems for
> > > different workloads. I think developers have a hard time really
> > > understanding the tradeoffs they are choosing in these systems,
> > > because of the nuance around them.
> > >
> > > For me, I'm more concerned about the fact that I saw a talk yesterday
> > > that mentioned both etcd and consul as options for service discovery
> > > but not ZK. That feels like a big hit for our community. Orthogonal to
> > > this topic, just feels worth mentioning.
> > >
> > > C
> > >
> > > On Wed, Feb 22, 2017 at 4:05 AM, Alexander Binzberger <
> > > alexander.binzberger@wingcon.com> wrote:
> > >
> > >> 1. Seems like it might make sense to increase snapCount for those
> > >> tests.
> > >>
> > >> 2. ZK write performance also depends on the number of watches, afaik.
> > >> This is not mentioned and not tested.
> > >>
> > >> 3. Does it really make sense to "blast" the store? Wouldn't it make
> > >> more sense to compare fixed write/read rates per client?
> > >>
> > >>
> > >>> On 22.02.2017 at 05:53, Michael Han wrote:
> > >>>
> > >>> Kudos to the etcd team for making this blog, and thanks for sharing.
> > >>>
> > >>> I feel like they're running a questionable configuration.
> > >>>
> > >>> Looks like the test configuration
> > >>> <235c20878a8637f24608c/agent/agent_zookeeper.go#L29>
> > >>> does not have a separate directory for transaction logs and
> > >>> snapshots, as it does not set dataLogDir. So the configuration is
> > >>> not optimal. It would be interesting to see the numbers with an
> > >>> updated configuration.
> > >>>
> > >>>> They mention that ZK snapshots "stop the world", and maybe I'm
> > >>>> mistaken, but I didn't think that was right
> > >>>
> > >>> Right, ZK snapshots do not block the processing pipeline: a snapshot
> > >>> is fuzzy and is taken on a separate thread. The warning message
> > >>> "*Too busy to snap, skipping*" mentioned in the blog is a sign that
> > >>> a snapshot is already being generated, which could be caused by the
> > >>> write contention created from serializing transaction logs, leading
> > >>> to longer-than-expected snapshot generation. So "stop the world" is
> > >>> a side effect of resource contention, not a design intention, IMO.
> > >>>
> > >>> The blog also mentions ZooKeeper as a key-value store, and I want to
> > >>> point out that ZooKeeper is more than a (metadata) key-value store:
> > >>> it has features such as sessions, ephemerals, and watchers. These
> > >>> design choices were made, I believe, to make ZK more useful as a
> > >>> coordination kernel, and they also contribute (negatively) to the
> > >>> performance and scalability of ZooKeeper.
> > >>>
> > >>>
> > >>> On Tue, Feb 21, 2017 at 4:32 PM, Dan Benediktson <
> > >>> dbenediktson@twitter.com.invalid> wrote:
> > >>>
> > >>>> I kind of wonder about them only using one disk.
> > >>>> I haven't experimented with this in ZooKeeper, nor have I ever been
> > >>>> a DBA, but with traditional database systems (which ZooKeeper
> > >>>> should be basically identical to, in this regard), it's a pretty
> > >>>> common recommendation to put snapshots and TxLogs on different
> > >>>> drives, for the obvious reason of avoiding one of the biggest
> > >>>> problems laid out in that blog post: when a snapshot happens, it
> > >>>> contends with your log flushes, causing write latencies to explode.
> > >>>> Suddenly you have tons more IO, and where it used to be nicely
> > >>>> sequential, now it's heavily randomized because of the two
> > >>>> competing writers. It's kind of the nature of benchmarks that
> > >>>> there's always something you can nitpick, but still, I feel like
> > >>>> they're running a questionable configuration.
> > >>>>
> > >>>> They mention that ZK snapshots "stop the world", and maybe I'm
> > >>>> mistaken, but I didn't think that was right - I thought they were
> > >>>> just slowing everything down because they write a lot and contend a
> > >>>> lot. I'm pretty sure ZK snapshots are fuzzy over a range of
> > >>>> transactions, and transactions keep applying during the snapshot,
> > >>>> right?
> > >>>>
> > >>>> Thanks,
> > >>>> Dan
> > >>>>
> > >>>> On Tue, Feb 21, 2017 at 2:24 PM, Benjamin Mahler <
> > >>>> bmahler@mesosphere.io> wrote:
> > >>>>
> > >>>>> I'm curious if folks here have seen the following write
> > >>>>> performance comparison that was done by CoreOS on etcd, Consul,
> > >>>>> and ZooKeeper:
> > >>>>> https://coreos.com/blog/performance-of-etcd.html
> > >>>>>
> > >>>>> Sounds like a performance comparison of reads and updates is
> > >>>>> coming next. Are there any thoughts from folks here on this
> > >>>>> comparison so far?
> > >>>>>
> > >>>>> Thanks,
> > >>>>> Ben
> > >>>>>
> > >>
> > >> --
> > >> Alexander Binzberger
> > >> System Designer - WINGcon AG
> > >> Tel. +49 7543 966-119
> > >>
> > >> Registered office: Langenargen
> > >> Commercial register: Ulm, HRB 734260
> > >> VAT ID: DE232931635, WEEE ID: DE74015979
> > >> Executive board: Thomas Ehrle (chair), Fritz R. Paul (deputy),
> > >> Tobias Treß
> > >> Supervisory board: Jürgen Maucher (chair), Andreas Paul (deputy),
> > >> Martin Sauter
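For readers following the dataLogDir discussion above, a minimal zoo.cfg along the lines Michael and Dan describe would look roughly like this; the mount points are illustrative assumptions, not taken from the thread or the benchmark:

```properties
# zoo.cfg - sketch of the separate-disk layout discussed above
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2181

# Snapshots go here (and, if dataLogDir were unset, the txn log too)
dataDir=/disk1/zookeeper/data

# Put the transaction log on its own device so sequential log fsyncs
# do not contend with snapshot writes
dataLogDir=/disk2/zookeeper/txnlog

# Take a snapshot roughly every snapCount transactions (default 100000);
# raising it reduces snapshot frequency, per Alexander's point 1
snapCount=100000
```

Without dataLogDir, both writers share one disk, which is exactly the randomized-IO contention Dan describes.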
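Alexander's third point, comparing fixed write/read rates per client instead of "blasting" the store, can be sketched as a small open-loop load generator. This is a generic sketch: the `op` callback (which would wrap a ZooKeeper read or write), the injectable clock, and all names here are illustrative, not anything from the thread:

```python
import time

def fixed_rate_load(op, rate_hz, duration_s,
                    now=time.monotonic, sleep=time.sleep):
    """Issue op() at a fixed target rate for duration_s seconds.

    Returns per-operation latencies in seconds. Scheduling is open-loop:
    each op has a fixed deadline derived from the start time, so a slow
    server shows up as latency rather than silently lowering the offered
    rate (which is what closed-loop "blast" benchmarks do).
    """
    interval = 1.0 / rate_hz
    n_ops = int(rate_hz * duration_s)
    latencies = []
    start = now()
    for i in range(n_ops):
        deadline = start + i * interval  # fixed schedule, immune to op jitter
        delay = deadline - now()
        if delay > 0:
            sleep(delay)
        t0 = now()
        op()  # e.g. a ZooKeeper create/get/set in a real harness
        latencies.append(now() - t0)
    return latencies
```

The `now`/`sleep` parameters exist so the scheduler can be tested with a fake clock; in a real run the defaults apply and the latency list would feed percentile reporting.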