incubator-cassandra-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Sylvain Lebresne <sylv...@yakaz.com>
Subject Re: Atomic Compare and Swap
Date Tue, 22 Jun 2010 08:21:51 GMT
On Mon, Jun 21, 2010 at 11:19 PM, Rishi Bhardwaj <khichrishi@yahoo.com> wrote:
> I have read the post on cages and it is definitely very interesting. But
> cages seems to be too coarse grained compared to an Atomic Compare and Swap
> on Cassandra column value. Cages would makes sense when one wants to do
> multiple atomic row, column updates. Also, I am not so sure about the
> scalability when it comes to using zookeeper for keeping locks on Cassandra
> columns... there would also be performance hit with an added RPC for every
> write. I feel Cages maybe fine for systems when one has few locks but I feel
> an atomic CAS in Cassandra would help us avoid distributed locking systems
> and zookeeper in many other simpler scenarios. For more complicated
> (transaction like) things, using Cages may be fine. Then again doing a read
> before write for CAS in cassandra will make CAS at least as slow as a read,
> which I believe will still be better than taking a single column lock from
> zookeeper.
>
> What do other folks think in this regard? From whatever I have read, I
> believe CAS is feasible in Cassandra without hurting the normal write path
> performance. Only for CAS writes would we have to pay for the read before
> write penalty. I am going to do feasibility study for this and would love
> any pointers from others about this.

Making a (non atomic) CAS is easy (doing it client side is fine, and there
has been some discussion about 'callbacks' that may or may not someday allow
to do that server-side).

An *atomic* CAS is another beast and I see at least two difficulties:

1) making it atomic locally: Cassandra's implementation is very much
multi-threaded. On a given node, while you're reading-comparing-and-swapping
on some column c, no other thread should be allowed to write c (even 'normal'
write). You would probably need to have specific column families where CAS is
allowed and for which all writes would be slower (since some locking would be
involved). Even then, making such locking efficient and right is not easy. But
in the end, local atomicity is quite probably the easy part.

2) making it atomic cluster-wide: data is replicated and an atomic CAS would
need to apply on the exact same column version in every node. Which, with
eventual consistency especially, is pretty hard to accomplish unless you're
locking the cluster (but that's what Cages/ZK do).

That being said, if you have a neat solution for efficient and distributed
atomic CAS that doesn't require rewriting 80% of Cassandra, I'm sure there
will be interest in that.

--
Sylvain

>
> Thanks,
> Rishi
>
>
>
> ________________________________
> From: Rauan Maemirov <rauan@maemirov.com>
> To: dev@cassandra.apache.org
> Sent: Mon, June 21, 2010 11:27:02 AM
> Subject: Re: Atomic Compare and Swap
>
> Have you read this post?
> http://ria101.wordpress.com/2010/05/12/locking-and-transactions-over-cassandra-using-cages/
> I guess, you will like it.
>
> 2010/6/22 Rishi Bhardwaj <khichrishi@yahoo.com>
>
>> I am definitely interested in taking this work up. I believe the CAS
>> functionality would help in a lot of different scenarios and could help
>> avoid use of other external services (like zookeeper) to provide similar
>> functionality. I am new at Cassandra development and would really appreciate
>> pointers from the dev. community about how to approach/start on this
>> project. Also how feasible is the approach mentioned below to implement the
>> CAS functionality? It would be great if we could have a discussion on the
>> pros and cons.
>>
>> Thanks,
>> Rishi
>>
>>
>>
>> ________________________________
>> From: Sriram Srinivasan <sriram@malhar.net>
>> To: dev@cassandra.apache.org
>> Sent: Sun, June 20, 2010 9:47:37 PM
>> Subject: Re: Atomic Compare and Swap
>>
>>
>> I too am interested in a CAS facility.
>>
>> I like Rishi's proposal. One could simply use a version number as the
>> logical timestamp. If we promote CAS to a consistency level, it would rate
>> higher than a quorum. One pays the price for a more complex write path to
>> obtain the requisite guarantee.
>>
>>
>> On Jun 21, 2010, at 4:03 AM, Rishi Bhardwaj wrote:
>>
>> >
>> > Heres another thought I had, if say the user always wrote with quorum (or
>> to all) nodes then can't we implement CAS (compare and swap) assuming that
>> user employs logical timestamp and Cassandra doesn't allow writes to a
>> column with same or older timestamp. Here's the scenario I am thinking
>> about:
>> > Say we use logical timestamp for a column value and lets assume the
>> current timestamp is t. Now say two clients read this column and generate
>> concurrent CAS (compare and swap) operations on timestamp t and for both the
>> writes the resulting new timestamp would become (t+1). Now if we don't allow
>> writes to a column with same timestamp then only one of these writes would
>> succeed. Of course another assumption is that if a third CAS write with
>> compare on logical timestamp (t - 1) came in, that would be denied as I
>> believe Cassandra doesn't allow "older" writes to win over "newer" writes.
>> Do you think such a thing can be accomplished?
>>
>>
>>
>>

Mime
View raw message