From: Dwight Merriman
To: user@cassandra.apache.org
Date: Sun, 14 Mar 2010 19:37:42 -0400
Subject: Re: serialized vector clock as global counter?

yes - take a look at this app engine blog post:

http://googleappengine.blogspot.com/2009/09/migration-to-better-datastore.html

if i read this correctly, app engine data store is pretty much in the
"strongly consistent" camp while cassandra is more eventually consistent
-- so really quite different.  you would get higher availability on an EC
system but atomic updates become quite hard (at least when fully
generalized)

On Sun, Mar 14, 2010 at 6:29 PM, Fred Wulff <frew@stanford.edu> wrote:
> Hey Toby,
>
> I'm not an expert on Cassandra's infrastructure, but I believe the
> thing the AppEngine datastore has that Cassandra doesn't is a
> transaction between the read and write of a sharded counter. That
> means that while the read of the various counters may be inconsistent,
> the actual update of the shard is always consistent and the read of
> that shard is always consistent with the previous write.
>
> -Fred
>
> On Sun, Mar 14, 2010 at 9:46 AM, Toby DiPasquale <toby@cbcg.net> wrote:
> > Hi all,
> >
> > I'm trying to write an application using Cassandra which requires the
> > use of a global, monotonically-increasing counter. I've seen the
> > previous threads on this subject which basically say that this can't
> > be done in Cassandra as is, but I think I've come up with a method
> > that might work. I wanted to get the list's feedback on whether or not
> > this method is workable:
> >
> > * Each client maintains its own monotonically-increasing counter as a
> > row in Cassandra
> > * When a client wants to increment the counter, it will:
> >  * increment its own counter key using a quorum write
> >  * read all keys in the CF using a quorum read
> >  * the sum of the values is then the value of the counter
> >
> > This method is robust against nodes coming and going (new nodes just
> > get a new counter and dead nodes never increase their counter again).
> > It also doesn't matter for my application if some possible values for
> > the counter are skipped over, as long as every value is greater than
> > the last. I believe this scheme to be commensurate to a vector clock,
> > no?
> >
> > My question would be: assuming we're using both quorum reads and
> > writes, is it possible that clients A and B could race in the
> > following manner:
> >
> > * A updates its counter
> > * B updates its counter
> > * A reads the keys to get sum X
> > * B reads the keys to get the same sum X
> >
> > ...thus violating the ever-increasing constraint?
> >
> > Google App Engine suggests a similar method for doing global counters
> > on Datastore: http://code.google.com/appengine/articles/sharding_counters.html.
> > I'm troubled by their implementation, though, because the reads on the
> > list of counters are not transactional and are potentially subject to
> > the same race that I've described above.
> >
> > Any thoughts/ideas?
> >
> > --
> > Toby DiPasquale
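
To make the race in Toby's quoted message concrete, here is a minimal
in-memory sketch of the per-client counter scheme. It does not talk to
Cassandra at all: a plain dict stands in for the column family, and it
assumes an idealized store in which every read sees every earlier write,
which quorum reads and writes only approximate. The names ShardedCounter,
increment, read_sum, racy_interleaving and serialized_interleaving are
hypothetical and exist only for this illustration.

```python
# In-memory stand-in for the per-client counter rows; no Cassandra involved.
# All names below are hypothetical and exist only for this illustration.


class ShardedCounter:
    def __init__(self):
        # client id -> that client's own monotonically increasing value
        self.rows = {}

    def increment(self, client_id):
        # Step 1 of the scheme: a client bumps only its own row.
        self.rows[client_id] = self.rows.get(client_id, 0) + 1

    def read_sum(self):
        # Steps 2 and 3: read every row and add the values up.
        return sum(self.rows.values())


def racy_interleaving():
    """Replay the four steps from the question, in the order given there."""
    counter = ShardedCounter()
    counter.increment("A")      # A updates its counter
    counter.increment("B")      # B updates its counter
    x_a = counter.read_sum()    # A reads the keys to get sum X
    x_b = counter.read_sum()    # B reads the keys to get the same sum X
    return x_a, x_b


def serialized_interleaving():
    """Same operations, but each client's increment and read run back to back."""
    counter = ShardedCounter()
    counter.increment("A")
    x_a = counter.read_sum()
    counter.increment("B")
    x_b = counter.read_sum()
    return x_a, x_b


if __name__ == "__main__":
    print(racy_interleaving())        # (2, 2)
    print(serialized_interleaving())  # (1, 2)
```

In this sketch the sum never goes backwards, but in the interleaving from
the question both clients end up with the same value, even though the
store itself is perfectly consistent. That suggests the duplicate comes
from the increment and the read not being one atomic step, rather than
from replica lag, which fits Fred's point about the transaction that
Datastore wraps around a shard's read and write.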