Return-Path: Delivered-To: apmail-cassandra-user-archive@www.apache.org Received: (qmail 96710 invoked from network); 13 Sep 2010 22:46:32 -0000 Received: from unknown (HELO mail.apache.org) (140.211.11.3) by 140.211.11.9 with SMTP; 13 Sep 2010 22:46:32 -0000 Received: (qmail 50949 invoked by uid 500); 13 Sep 2010 22:46:30 -0000 Delivered-To: apmail-cassandra-user-archive@cassandra.apache.org Received: (qmail 50847 invoked by uid 500); 13 Sep 2010 22:46:29 -0000 Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@cassandra.apache.org Delivered-To: mailing list user@cassandra.apache.org Received: (qmail 50839 invoked by uid 99); 13 Sep 2010 22:46:29 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 13 Sep 2010 22:46:29 +0000 X-ASF-Spam-Status: No, hits=2.2 required=10.0 tests=HTML_MESSAGE,MIME_QP_LONG_LINE,RCVD_IN_DNSWL_NONE,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: local policy) Received: from [208.113.200.5] (HELO homiemail-a59.g.dreamhost.com) (208.113.200.5) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 13 Sep 2010 22:46:24 +0000 Received: from homiemail-a59.g.dreamhost.com (localhost [127.0.0.1]) by homiemail-a59.g.dreamhost.com (Postfix) with ESMTP id 686A156406A for ; Mon, 13 Sep 2010 15:46:03 -0700 (PDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=thelastpickle.com; h=to:from :subject:date:message-id:content-type:mime-version:in-reply-to; q=dns; s=thelastpickle.com; b=BYuS/b3hO5bKrh5MUwQfDEN3X+AqvDKYt XnnVii5p23Z3edwrkNMgb/DbghVuExb++zk3JvYo0ta8dVOpSrySUEa7gj2iYSnJ dghcRILDEkJ3Oib5regZAUEOtqXw0qW7sSEt0p1QuRJo55KD9RNMhmW5fRG1qKvD kprTmEb5UI= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=thelastpickle.com; h=to :from:subject:date:message-id:content-type:mime-version: in-reply-to; s=thelastpickle.com; bh=Cw/4oBYR7cKGzBcgzQYdBsRcUF8 =; b=MZgJ0vIJoqLEvLlUFE8+aCtBEm2ot7Z4o71EKE65aSIuU1BMPgjmUIwwSSX y1KxyBBgF5GWiikiIzLJ5dJQACAso+6HnZyfz8DwkRoBavSxatgySXQGHZi5RjYc hwi4Kr5hgevHI+jOlp/DDZ6J7UycSfTXFZWuePk+vtT249sQ= Received: from localhost (webms.mac.com [17.148.16.116]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) (Authenticated sender: aaron@thelastpickle.com) by homiemail-a59.g.dreamhost.com (Postfix) with ESMTPSA id 5A6CA564061 for ; Mon, 13 Sep 2010 15:46:03 -0700 (PDT) To: user@cassandra.apache.org From: Aaron Morton Subject: Re: Implementing locks using cassandra only Date: Mon, 13 Sep 2010 22:46:01 GMT X-Mailer: MobileMe Mail (1C3202) Message-id: Content-Type: multipart/alternative; boundary=Apple-Webmail-42--f7bb4702-124b-ee4a-d911-baa950c23c71 MIME-Version: 1.0 In-Reply-To: --Apple-Webmail-42--f7bb4702-124b-ee4a-d911-baa950c23c71 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=ISO-8859-1; format=flowed Thanks for the post it looks really handy. I found this page to help my si= mple mind understand whats going on=A0http://nob.cs.ucdavis.edu/classes/ec= s150-1999-02/sync-bakery.html=A0=0A=0A=A0I've got some thoughts / question= s mostly about implementation...=0A=0A1. Any thoughts on how it would work= with a dynamic number of clients? For example say I've got a python web a= pp, on 2 web servers each with 8 tornado web servers running which support= async IO. If I only every do sync IO to cassandra there will only every b= e 16 clients. If a request uses async IO, it may yield and allow the web s= erver to process another request while it's waiting using another client c= onnection. =A0=0A=0ACould I use a TimeUUID as the client ID and replace ma= x_clients with just the number of columns in the Choosing CF? This may cau= se a problem with the loop at line 8, as it would potentially change the c= ount. Or use a TimeUUID as the client_id and just have a max_clients set s= uitable high in config.=0A=0A2. What about if you re-read the number_hash = (as well as the choosing hash) when you detected a client was choosing at = line 9? This may be a little safer, as I guess the original algorithm assu= mes the numbers array is shared memory.=A0=0A=0A3. Your check at line 13 m= isses the comparison of the client id's as a tie breaker. Was this intenti= onal?=A0=0A=0A4. Cassandra 0.7 has a TTL on the column values. Perhaps thi= s could be used in the Numbers and even the Choosing CF? Say give clients = 1 second to get through their choosing section and 10 seconds to get throu= gh their lock.=A0=0A=0A=A0I've not read through the=A0distributed=A0work q= ueue section yet, hope to later today.=0A=0AThanks=0AAaron=0A=0AOn 14 Sep,= 2010,at 05:50 AM, Jonathan Ellis wrote:=0A=0ABakery a= lgorithm is really interesting -- AFAIK it's the only mutex=0Aalgorithm th= at doesn't assume an existing atomic operation like CAS.=0A=0AOn Mon, Sep = 13, 2010 at 11:46 AM, Thorsten von Eicken=0A wrote:=0A= > I've been musing about how to implement locks using just cassandra, i.e.= =0A> without sql db or zookeeper. I wrote up what I've come up with on the= wiki=0A> (it's a bit too long for an email) at=0A> . I'm wondering whether I've=0A> overlooked something,= especially I'm not yet comfortable enough with the=0A> semantics of faili= ng cassandra write operations to figure out whether these=0A> can cause pr= oblems, so I suspect some further iterations on the algorithm=0A> will be = necessary.=0A>=0A=0A=0A=0A-- =0AJonathan Ellis=0AProject Chair, Apache Cas= sandra=0Aco-founder of Riptano, the source for professional Cassandra supp= ort=0Ahttp://riptano.com=0A --Apple-Webmail-42--f7bb4702-124b-ee4a-d911-baa950c23c71 Content-Type: multipart/related; type="text/html"; boundary=Apple-Webmail-86--f7bb4702-124b-ee4a-d911-baa950c23c71 --Apple-Webmail-86--f7bb4702-124b-ee4a-d911-baa950c23c71 Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=ISO-8859-1;
Thanks for the post it looks really handy. I found this page to h= elp my simple mind understand whats going on http://nob.cs.ucdavis.edu/classes/ecs150-1999-02/sy= nc-bakery.html 

 I've got some thoughts / questions mostly= about implementation...

1. Any though= ts on how it would work with a dynamic number of clients? For example say = I've got a python web app, on 2 web servers each with 8 tornado web server= s running which support async IO. If I only every do sync IO to cassandra = there will only every be 16 clients. If a request uses async IO, it may yi= eld and allow the web server to process another request while it's waiting= using another client connection.  

Could I = use a TimeUUID as the client ID and replace max_clients with just the numb= er of columns in the Choosing CF? This may cause a problem with the loop a= t line 8, as it would potentially change the count. Or use a TimeUUID as t= he client_id and just have a max_clients set suitable high in config.

2. What about if you re-read the number_hash (as well= as the choosing hash) when you detected a client was choosing at line 9? = This may be a little safer, as I guess the original algorithm assumes the = numbers array is shared memory. 

3. Your che= ck at line 13 misses the comparison of the client id's as a tie breaker. W= as this intentional? 

4. Cassandr= a 0.7 has a TTL on the column values. Perhaps this could be used in the Nu= mbers and even the Choosing CF? Say give clients 1 second to get through t= heir choosing section and 10 seconds to get through their lock. 

 I've not read through the distributed = ;work queue section yet, hope to later today.

Tha= nks
Aaron

On 14 Sep, 2010,at 05:50 AM, Jonathan Ellis &= lt;jbellis@gmail.com> wrote:

Bakery algorithm is really interesting -- A= FAIK it's the only mutex
=0Aalgorithm that doesn't assume an existing a= tomic operation like CAS.
=0A
=0AOn Mon, Sep 13, 2010 at 11:46 AM, T= horsten von Eicken
=0A<tve@rightscale.com> wrote:
=0A> I've= been musing about how to implement locks using just cassandra, i.e.
=0A= > without sql db or zookeeper. I wrote up what I've come up with on the= wiki
=0A> (it's a bit too long for an email) at
=0A> <http://wiki.apache.org/cassandra/Locking= >. I'm wondering whether I've
=0A> overlooked something, especial= ly I'm not yet comfortable enough with the
=0A> semantics of failing= cassandra write operations to figure out whether these
=0A> can cau= se problems, so I suspect some further iterations on the algorithm
=0A&= gt; will be necessary.
=0A>
=0A
=0A
=0A
=0A--
=0AJon= athan Ellis
=0AProject Chair, Apache Cassandra
=0Aco-founder of Ript= ano, the source for professional Cassandra support
=0Ahttp://riptano.com
=0A=
--Apple-Webmail-86--f7bb4702-124b-ee4a-d911-baa950c23c71-- --Apple-Webmail-42--f7bb4702-124b-ee4a-d911-baa950c23c71--