From: Dan Benediktson
Date: Wed, 22 Feb 2017 09:40:24 -0800
To: user@zookeeper.apache.org
Subject: Re: etcd performance comparison

Performance benchmarking is a very hard problem, so let's keep that in mind
before criticizing this one overmuch, and before going too far in trying to
build our own.
I do agree that the benchmark chosen here is probably not the most useful in
guiding customers to select among their options for coordination databases, so
I like Jordan's suggestion: first define a small number of interesting
benchmarks, based on common use cases for these coordination databases. On the
topic of service discovery, I agree that's probably the #1 use case, so a
benchmark trying to replicate that scenario would likely be the first and most
important one to go after.

To be honest, I would expect all existing ZK releases to perform much worse
than etcd and Consul under any kind of mixed read-and-write workload, and I
think it would help demonstrate the benefits of the patch that recently landed
in trunk, and of any other subsequent performance-oriented patches we might go
after, if we had some ready benchmarks that could clearly demonstrate the
beneficial results of those patches.

Thanks,
Dan

On Wed, Feb 22, 2017 at 6:52 AM, Camille Fournier wrote:
> Even just writing about what objective tests might look like would be a
> good start! I'm happy to read draft posts by anyone who wishes to write on
> the topic.
>
> C
>
> On Wed, Feb 22, 2017 at 9:36 AM, Jordan Zimmerman <
> jordan@jordanzimmerman.com> wrote:
>
> > IMO there is tremendous FUD in the etcd world. It's the new cool toy and
> > ZK feels old. To suggest that ZK does not do Service Discovery is
> > ludicrous. That was one of the very first Curator recipes.
> >
> > It might be useful to counter this trend objectively. I'd be interested
> > in helping. Anyone else? We can create objective tests that compare
> > common use cases.
> >
> > ====================
> > Jordan Zimmerman
> >
> > > On Feb 22, 2017, at 11:21 AM, Camille Fournier wrote:
> > >
> > > I think that my biggest feeling about this blog post (besides its not
> > > disclosing the disk setup clearly) is that ZK is really not designed
> > > to have massive write throughput. I would not traditionally recommend
> > > someone use ZK in that manner. If we think that evolving it to be
> > > useful for such workloads would be good, it could be an interesting
> > > community discussion, but it's really not the purpose of the system
> > > design.
> > >
> > > I'd love to see a more read/write mixed load test for the systems, as
> > > well as a blog post about why you might choose different systems for
> > > different workloads. I think developers have a hard time really
> > > understanding the tradeoffs they are choosing in these systems,
> > > because of the nuance around them.
> > >
> > > For me, I'm more concerned about the fact that I saw a talk yesterday
> > > that mentioned both etcd and consul as options for service discovery
> > > but not ZK. That feels like a big hit for our community. Orthogonal to
> > > this topic, just feels worth mentioning.
> > >
> > > C
> > >
> > > On Wed, Feb 22, 2017 at 4:05 AM, Alexander Binzberger <
> > > alexander.binzberger@wingcon.com> wrote:
> > >
> > >> 1. Seems like it might make sense to increase snapCount for those
> > >> tests.
> > >>
> > >> 2. ZK write performance also depends on the number of watches, afaik.
> > >> This is not mentioned and not tested.
> > >>
> > >> 3. Does it really make sense to "blast" the store? Wouldn't it make
> > >> more sense to compare fixed write/read rates per client?
> > >>
> > >>
> > >>> On 22.02.2017 at 05:53, Michael Han wrote:
> > >>>
> > >>> Kudos to the etcd team for making this blog, and thanks for sharing.
> > >>>
> > >>> I feel like they're running a questionable configuration.
> > >>>
> > >>> Looks like the test configuration
> > >>> <235c20878a8637f24608c/agent/agent_zookeeper.go#L29>
> > >>> does not have a separate directory for transaction logs and
> > >>> snapshots, as it does not set dataLogDir. So the configuration is
> > >>> not optimal. It would be interesting to see the numbers with an
> > >>> updated configuration.
> > >>>
> > >>>> They mention that ZK snapshots "stop the world", and maybe I'm
> > >>>> mistaken, but I didn't think that was right
> > >>>
> > >>> Right, ZK snapshots do not block the processing pipeline: a snapshot
> > >>> is fuzzy and is taken on a separate thread. The warning message
> > >>> "*Too busy to snap, skipping*" mentioned in the blog is a sign that
> > >>> a snapshot is already being generated, which could be caused by the
> > >>> write contention created from serializing transaction logs, leading
> > >>> to longer-than-expected snapshot generation. So "stop the world" is
> > >>> a side effect of resource contention, not a design intention, IMO.
> > >>>
> > >>> The blog also mentions ZooKeeper as a key-value store, and I want to
> > >>> point out that ZooKeeper is more than a (metadata) key-value store:
> > >>> it has features such as sessions, ephemerals, and watchers. These
> > >>> design choices were made, I believe, to make ZK more useful as a
> > >>> coordination kernel, and they also contribute (negatively) to the
> > >>> performance and scalability of ZooKeeper.
> > >>>
> > >>>
> > >>> On Tue, Feb 21, 2017 at 4:32 PM, Dan Benediktson <
> > >>> dbenediktson@twitter.com.invalid> wrote:
> > >>>
> > >>>> I kind of wonder about them only using one disk.
> > >>>> I haven't experimented with this in ZooKeeper, nor have I ever been
> > >>>> a DBA, but with traditional database systems (which ZooKeeper
> > >>>> should be basically identical to, in this regard), it's a pretty
> > >>>> common recommendation to put snapshots and TxLogs on different
> > >>>> drives, for the obvious reason of avoiding one of the biggest
> > >>>> problems laid out in that blog post: when a snapshot happens, it
> > >>>> contends with your log flushes, causing write latencies to explode.
> > >>>> Suddenly you have tons more IO, and where it used to be nicely
> > >>>> sequential, now it's heavily randomized because of the two
> > >>>> competing writers. It's kind of the nature of benchmarks that
> > >>>> there's always something you can nitpick, but still, I feel like
> > >>>> they're running a questionable configuration.
> > >>>>
> > >>>> They mention that ZK snapshots "stop the world", and maybe I'm
> > >>>> mistaken, but I didn't think that was right - I thought they were
> > >>>> just slowing everything down because they write a lot and contend a
> > >>>> lot. I'm pretty sure ZK snapshots are fuzzy over a range of
> > >>>> transactions, and transactions keep applying during the snapshot,
> > >>>> right?
> > >>>>
> > >>>> Thanks,
> > >>>> Dan
> > >>>>
> > >>>> On Tue, Feb 21, 2017 at 2:24 PM, Benjamin Mahler <
> > >>>> bmahler@mesosphere.io> wrote:
> > >>>>
> > >>>>> I'm curious if folks here have seen the following write
> > >>>>> performance comparison that was done by CoreOS on etcd, Consul,
> > >>>>> and ZooKeeper:
> > >>>>> https://coreos.com/blog/performance-of-etcd.html
> > >>>>>
> > >>>>> Sounds like a performance comparison of reads and updates is
> > >>>>> coming next. Are there any thoughts from folks here on this
> > >>>>> comparison so far?
> > >>>>>
> > >>>>> Thanks,
> > >>>>> Ben
> > >>>>>
> > >>
> > >> --
> > >> Alexander Binzberger
> > >> System Designer - WINGcon AG
> > >> Tel. +49 7543 966-119
> > >>
> > >> Registered office: Langenargen
> > >> Commercial register: Ulm, HRB 734260
> > >> VAT ID: DE232931635, WEEE ID: DE74015979
> > >> Executive board: Thomas Ehrle (chair), Fritz R. Paul (deputy),
> > >> Tobias Treß
> > >> Supervisory board: Jürgen Maucher (chair), Andreas Paul (deputy),
> > >> Martin Sauter
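For readers following the dataLogDir discussion above, a minimal zoo.cfg along the lines Michael and Dan describe would look roughly like this; the mount points are illustrative assumptions, not taken from the thread or the benchmark:

```properties
# zoo.cfg - sketch of the separate-disk layout discussed above
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2181

# Snapshots go here (and, if dataLogDir were unset, the txn log too)
dataDir=/disk1/zookeeper/data

# Put the transaction log on its own device so sequential log fsyncs
# do not contend with snapshot writes
dataLogDir=/disk2/zookeeper/txnlog

# Take a snapshot roughly every snapCount transactions (default 100000);
# raising it reduces snapshot frequency, per Alexander's point 1
snapCount=100000
```

Without dataLogDir, both writers share one disk, which is exactly the randomized-IO contention Dan describes.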
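Alexander's third point, comparing fixed write/read rates per client instead of "blasting" the store, can be sketched as a small open-loop load generator. This is a generic sketch: the `op` callback (which would wrap a ZooKeeper read or write), the injectable clock, and all names here are illustrative, not anything from the thread:

```python
import time

def fixed_rate_load(op, rate_hz, duration_s,
                    now=time.monotonic, sleep=time.sleep):
    """Issue op() at a fixed target rate for duration_s seconds.

    Returns per-operation latencies in seconds. Scheduling is open-loop:
    each op has a fixed deadline derived from the start time, so a slow
    server shows up as latency rather than silently lowering the offered
    rate (which is what closed-loop "blast" benchmarks do).
    """
    interval = 1.0 / rate_hz
    n_ops = int(rate_hz * duration_s)
    latencies = []
    start = now()
    for i in range(n_ops):
        deadline = start + i * interval  # fixed schedule, immune to op jitter
        delay = deadline - now()
        if delay > 0:
            sleep(delay)
        t0 = now()
        op()  # e.g. a ZooKeeper create/get/set in a real harness
        latencies.append(now() - t0)
    return latencies
```

The `now`/`sleep` parameters exist so the scheduler can be tested with a fake clock; in a real run the defaults apply and the latency list would feed percentile reporting.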