Mailing-List: contact user-help@zookeeper.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@zookeeper.apache.org
Received-SPF: pass (nike.apache.org: domain of henry@cloudera.com designates
 209.85.214.174 as permitted sender)
MIME-Version: 1.0
In-Reply-To: <BAY169-W99C7414A8F0BAD601BE045EE570@phx.gbl>
References: <BAY403-EAS414CC5EA19C1FA5F7B0CB24EE570@phx.gbl>
 <CAKF1A=vT-TzPCqSCguSKSJ8DivQWuHckJWEvMfsEhcFjpH5c2A@mail.gmail.com>
 <BAY169-W99C7414A8F0BAD601BE045EE570@phx.gbl>
From: Henry Robinson <henry@cloudera.com>
Date: Wed, 31 Jul 2013 16:05:14 -0700
Message-ID: 
 <CADZba-TP=ouwMt9-vYgkZdK_KS3yiCAzeiHUMnHGB0=xr8ESeQ@mail.gmail.com>
Subject: Re: Zookeeper performance
To: "user@zookeeper.apache.org" <user@zookeeper.apache.org>
Content-Type: multipart/alternative; boundary=001a11c2f3d854c84304e2d6c381

--001a11c2f3d854c84304e2d6c381
Content-Type: text/plain; charset=ISO-8859-1

So how about the following optimistic approach:

1. Read the current version of the database (stored in a znode's version
metadata). If it is even, wait and try again; even numbers mean someone is
committing and the DB might be in an inconsistent state. Then read the
state from the database that your update will rely upon (user1.name, in
this instance). The database must store the version alongside the data, and
you must be able to read both atomically, so that you can check the data
really is from the version you read from ZooKeeper. If the DB version does
not match the ZK version, restart.
2. Once an update is ready to commit, test-and-increment the current
version in ZK to an even number. If the test fails, someone else has
committed since your read, so restart from step 1. Otherwise write your
update to the DB, along with the eventual version of the data (the next odd
number).
3. Increment the current version in ZK to that odd number.

The even / odd distinction lets you detect when someone else is updating
the database. Without it, there is no way to update the database atomically
with an update to ZK, so another transaction can't tell whether you've
finished your update and doesn't know how long to wait.

The problem is failure - what happens if a client fails while it's writing
a transaction? Eventually someone can increment the transaction number, and
if you provide an 'undo' log before you make any changes, that client can
possibly recover from a partial commit. But at this point you need to
understand your application's requirements in much more detail than we do
to make recommendations.
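
For what it's worth, the undo-log idea could look roughly like the sketch
below. Everything in it is an assumption for illustration: zk and db are
plain dicts standing in for the znode version and the database,
write_with_undo and recover are hypothetical names, and deciding that the
committing client is really dead is left to the caller.

```python
_MISSING = object()  # marks "key did not exist before the write"


def write_with_undo(db, undo_log, key, new_value):
    """Record the old value before overwriting it."""
    undo_log.append((key, db["data"].get(key, _MISSING)))
    db["data"][key] = new_value


def recover(zk, db, undo_log):
    """Roll back an abandoned commit and move on to the next odd version.

    Assumes the caller has already decided that the committer has failed.
    Returns False if no commit was in progress.
    """
    if zk["version"] % 2 == 1:
        return False  # odd: nothing is being committed
    # Undo in reverse order so the earliest recorded value wins.
    while undo_log:
        key, old_value = undo_log.pop()
        if old_value is _MISSING:
            db["data"].pop(key, None)
        else:
            db["data"][key] = old_value
    next_odd = zk["version"] + 1
    db["version"] = next_odd
    zk["version"] = next_odd
    return True


# A client moved ZK to an even version, wrote part of its update, then died.
zk = {"version": 2}
db = {"data": {"user1.name": "Kathy"}, "version": 1}
undo_log = []
write_with_undo(db, undo_log, "user1.name", "Kat")
write_with_undo(db, undo_log, "user1.nick", "K")
recover(zk, db, undo_log)
print(db, zk)
```

In a real system the undo log would have to be durable and written before the
data it protects, and two clients must not run recovery at the same time;
neither is handled here.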

In particular, your storage layer may offer sufficiently powerful
primitives such that you don't need ZK; although if it's a filesystem then
that probably isn't true.

Henry


On 31 July 2013 15:51, Baskar Duraikannu <baskar.duraikannu@outlook.com> wrote:

> We cannot always resolve conflicts ourselves. For example, let us say that
> a) user1 changes the name from 'Kathy' to 'Katherine'
> b) user2 changes the name from 'Kathy' to 'Kat'
> Both read 'Kathy' as input; user1's update succeeded. We need to let
> user2 know that something has changed, as this may result in the user not
> changing 'Kathy' to 'Kat' (as an example).
> Hope this explains.
>
> > Date: Wed, 31 Jul 2013 07:49:39 -0400
> > Subject: Re: Zookeeper performance
> > From: camille@apache.org
> > To: user@zookeeper.apache.org
> >
> > This sounds highly error prone to me regardless of whether or not zookeeper
> > can handle the load. Why not just use a standard transaction model with a
> > vector clock or other timing device to detect conflicts so you don't have
> > to worry about a second server to talk to (zookeeper) to do an update?
> > On Jul 31, 2013 7:17 AM, "Baskar Duraikannu" <baskar.duraikannu@outlook.com>
> > wrote:
> >
> > > Hello
> > >
> > > We are looking to use zookeeper for optimistic concurrency. Basically when
> > > the user saves data on a screen, we need to lock, read to ensure that no
> > > one else has changed the row while user is editing data, persist data and
> > > unlock znode.
> > >
> > > If the app/thread does not get a lock, we may set a watch so that polling
> > > is avoided.
> > >
> > > Our application is write intensive certain times of the day. We may get
> > > about 100k requests per second.  Can zookeeper handle this volume?
>
>


-- 
Henry Robinson
Software Engineer
Cloudera
415-994-6679

--001a11c2f3d854c84304e2d6c381--