zookeeper-user mailing list archives

From Pramod Biligiri <pramodbilig...@gmail.com>
Subject Re: Partitioned Zookeeper
Date Mon, 19 May 2014 03:25:00 GMT
[Let me know if you want this thread moved to the Dev list (or even to
JIRA). I was only seeing automated mails there, so I thought I'd go ahead
and post here.]

I have been looking at the codebase for the last couple of days (see my
notes on it here:

We are planning to do a proof-of-concept of the partitioning idea as part
of a class project, and to measure any performance gains. Since we're new
to Zookeeper and short on time, it may not be the *right* way to do it,
but I hope it can give some pointers for the future.

Design approaches to implement a partitioned Zookeeper

For starters, let's assume we only parallelize accesses to paths starting
with different top-level prefixes, e.g. /app1/child1, /app2/child1, /config

Possible approach:

Have a separate tree object for each top-level node (/app1, /app2 etc.).
This loosely corresponds to a container in the Wiki page [1], and to the
DataTree class in the codebase.

- As soon as a request comes in, associate it with one of the trees. This
is possible because each request necessarily has a path associated with it.

- Then, all the queues used to process requests should operate in parallel
on these different trees. This can be done by having multiple queues - one
for each container.
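To make the routing step concrete, here is a rough Java sketch of picking a partition from a request's top-level prefix and handing the request to that partition's queue. The class and method names are my own illustrative assumptions, not actual ZooKeeper classes:

```java
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Sketch only: every request carries a path, so we can derive a partition
// from its top-level prefix and enqueue it on that partition's own queue
// (and, by extension, process it against that partition's own tree).
public class PartitionRouter {
    // One queue per top-level node (/app1, /app2, /config, ...).
    static final Map<String, Queue<String>> queues = new ConcurrentHashMap<>();

    // /app1/child1 -> /app1, /config -> /config
    static String topLevelPrefix(String path) {
        int second = path.indexOf('/', 1);
        return second < 0 ? path : path.substring(0, second);
    }

    // Associate the request with its partition as soon as it arrives.
    static void route(String requestPath) {
        queues.computeIfAbsent(topLevelPrefix(requestPath),
                               p -> new ConcurrentLinkedQueue<>())
              .add(requestPath);
    }
}
```

In a real attempt the queue entries would be ZooKeeper's Request objects and each queue would feed its own RequestProcessor chain; strings are used above just to keep the sketch self-contained.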

Potential issues:

- Whether the ZK code is designed to work with multiple trees instead of
just one

- Whether the queuing process (which uses RequestProcessors) is designed to
handle multiple queues

- Make sure performance actually improves, and does not degrade!


- Where is the performance benefit actually going to come from?

Intuitively, we might expect parallel trees to give a benefit, but since
each node logs all change records to disk before applying them, isn't disk
throughput the bottleneck? If I remember right, the ZK paper says that with
proper configuration they were able to make ZK I/O bound.

So along with having separate trees and associated processing, should we
also log to disk separately for each tree? Will that actually improve write
throughput to disk?
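If we did try per-tree logging, it could look roughly like the sketch below: one append-only log file per partition, synced before the change is considered durable. All names and the on-disk layout here are hypothetical, not ZooKeeper's actual transaction-log format; and note it only pays off if the files sit on separate devices, since fsyncs to a single disk still serialize:

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: a separate append-only change log per top-level partition,
// e.g. /app1 -> <base>/txnlog-app1. Illustrative, not ZooKeeper's format.
public class PartitionedTxnLog {
    final Map<String, FileOutputStream> logs = new ConcurrentHashMap<>();
    final File baseDir;

    PartitionedTxnLog(File baseDir) { this.baseDir = baseDir; }

    // Append a change record to the given partition's log and sync it
    // to disk before the change would be applied to that partition's tree.
    void append(String prefix, byte[] record) {
        FileOutputStream out = logs.computeIfAbsent(prefix, p -> {
            try {
                File f = new File(baseDir, "txnlog-" + p.substring(1));
                return new FileOutputStream(f, /* append = */ true);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        try {
            synchronized (out) {
                out.write(record);
                out.getFD().sync(); // durable before applying to the tree
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Pointing each partition's log at a different physical device (the way ZooKeeper already lets you split dataDir and dataLogDir) is where any write-speed gain would have to come from.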


1. The wiki page:
https://cwiki.apache.org/confluence/display/ZOOKEEPER/PartitionedZooKeeper

2. The JIRA discussion: https://issues.apache.org/jira/browse/ZOOKEEPER-646

3. In this blog post, see the section called "Scalability and Hashing
Zookeeper clusters":


On Fri, May 16, 2014 at 10:56 PM, Pramod Biligiri wrote:
> Thanks Michi,
> That was a very useful link! :)
> Pramod
> On Fri, May 16, 2014 at 3:37 PM, Michi Mutsuzaki <michi@cs.stanford.edu> wrote:
>> Hi Pramod,
>> No it has not been implemented, and I'm not aware of any recipes.
>> There is an open JIRA for this feature.
>> https://issues.apache.org/jira/browse/ZOOKEEPER-646
>> On Thu, May 15, 2014 at 12:59 PM, Pramod Biligiri
>> <pramodbiligiri@gmail.com> wrote:
>> > Hi,
>> > The Zookeeper wiki talks about Partitioned Zookeeper:
>> >
>> https://cwiki.apache.org/confluence/display/ZOOKEEPER/PartitionedZooKeeper
>> >
>> > I wanted to know if that has already been implemented or not. If not,
>> are
>> > there some recipes which can make Zookeeper behave in that way?
>> >
>> > Thanks.
>> >
>> > Pramod
