Subject: Re: Concurrency: Does C* support a Happened-Before relation between processes' writes?
From: Yang <teddyyyy123@gmail.com>
To: user@cassandra.apache.org
Date: Fri, 24 Jun 2011 12:34:26 -0700

By "possible node N", I mean the possible clients that will ever try to do the locking.

On Fri, Jun 24, 2011 at 12:28 PM, Yang <teddyyyy123@gmail.com> wrote:

> Without a clear description of your pseudo-code, it's difficult to say
> whether it will work.
>
> But I think it can work fine as an election/agreement protocol, which you
> can use as a lock to some degree. This requires all the potential lock
> contenders to participate: you can't grab the lock before everyone has
> voiced their vote.
> Let's say the code is like:
>
>     lock(UUID) {
>         write_column( "token_" + my_hostname() + "_" + UUID, getTimeStamp() );
>         for each possible node N:
>             block until ( token = read_column("token_" + N + "_" + UUID) ) != null
>         return "LOCKED" if my timestamp is the lowest among all nodes
>     }
>
> You have to wait for all nodes to voice their timestamp because the
> timestamp in Cassandra is client-generated: a node that writes later can
> still create a column with an earlier timestamp.
>
> On Fri, Jun 24, 2011 at 11:33 AM, AJ <aj@dude.podzone.net> wrote:
>
>> Sorry, I know this is long-winded, but I just want to make sure before I
>> go through the trouble to implement this, since it's not something that
>> can be reliably tested and it requires in-depth knowledge of C* internals.
>> But this ultimately deals with concurrency control, so anyone interested
>> in that subject may want to try to answer this. Thanks!
>>
>> I would like to know how to do a series of writes and reads such that I
>> can tell definitively which process out of many was the first to create a
>> unique flag column.
>>
>> IOW, I would like to have multiple processes (clients) compete to see who
>> is first to write a token column. The tokens start with a known prefix,
>> such as "Token_", followed by the name of the process that created it and
>> a UUID, so that all columns are guaranteed unique and don't get
>> overwritten. For example, Process A could create:
>>
>> Token_ProcA_<UUID>
>>
>> and Process B would create:
>>
>> Token_ProcB_<UUID>
>>
>> These writes/reads are asynchronous between the two or more processes.
>> After the two processes write their respective tokens, each will read
>> back all columns named "Token_*" that exist (a slice). They each do this
>> in order to find out who "won". The process that wrote the column with
>> the lowest timestamp wins. The purpose is to implement a lock.
>>
>> I think all that is required is for the processes to use QUORUM
>> reads/writes to make sure the final read is consistent, so that each
>> process can rely on what's returned from the final read and can be sure
>> there isn't an earlier write floating around somewhere. This is where the
>> "happened-before" question comes in. Is it possible that Process A, which
>> writes its token with a lower timestamp (and should be the winner), has
>> that write go unseen by Process B when it does its read (which happens
>> after its own token write and after Process A wrote its token), so that
>> Process B incorrectly concludes that it (Process B) is the winner because
>> it does not see Process A's token? I'm 99% sure using QUORUM reads/writes
>> will make this work, because that's their whole purpose, but I just wanted
>> to double-check in case there's another detail about C* I'm forgetting
>> that would defeat this.
>>
>> Thanks!
>>
>> P.S. I realize this will cost me in performance, but this is only meant
>> to be used on occasion.
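
For concreteness, below is a minimal, runnable sketch (Python) of the election-style lock described above. It is only an illustration under assumptions: an in-memory dict stands in for the Cassandra column family, and the write_column/read_column helpers are hypothetical placeholders for what would be QUORUM writes and QUORUM reads against a real cluster; the lock() signature, the column naming, and the use of wall-clock time as the client-supplied timestamp are likewise illustrative, not a real client API.

    import threading
    import time

    # In-memory stand-in for a Cassandra column family: column name -> timestamp.
    # In a real deployment these two helpers would be QUORUM writes/reads against
    # the cluster; the names follow the pseudo-code above and are hypothetical.
    columns = {}

    def write_column(name, timestamp):
        columns[name] = timestamp

    def read_column(name):
        return columns.get(name)  # None until the owning client has written it

    def lock(lock_id, my_host, contenders):
        """Try to acquire lock_id. `contenders` must name every client that may
        ever compete for this lock, my_host included."""
        my_ts = time.time()
        write_column("token_%s_%s" % (my_host, lock_id), my_ts)

        # Block until every potential contender has voiced its timestamp; this
        # is required because timestamps are client-generated, so a client that
        # writes later may still carry an earlier timestamp.
        votes = {}
        for node in contenders:
            while True:
                ts = read_column("token_%s_%s" % (node, lock_id))
                if ts is not None:
                    votes[node] = ts
                    break
                time.sleep(0.01)

        # Lowest timestamp wins; break ties by name so the outcome is unambiguous.
        winner = min(votes, key=lambda n: (votes[n], n))
        return "LOCKED" if winner == my_host else "LOST"

    def contend(host, hosts, results):
        results[host] = lock("resource-1", host, hosts)

    if __name__ == "__main__":
        hosts = ["ProcA", "ProcB"]
        results = {}
        threads = [threading.Thread(target=contend, args=(h, hosts, results)) for h in hosts]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        print(results)  # exactly one contender reports "LOCKED"

Exactly one contender reports "LOCKED" only because every client named in contenders eventually votes; if any listed client never writes its token, all contenders block, which is the participation requirement described above.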