incubator-cassandra-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mike Malone <m...@simplegeo.com>
Subject Re: Atomic Compare and Swap
Date Wed, 23 Jun 2010 04:27:44 GMT
I'd be interested in what the folks who want CAS implementations think about
vector clocks. Can you use them to fulfill your use cases? If not, why not?

I ask because I have found myself wanting CAS in Cassandra too, but I think
that's only because I'm pretty familiar with HTTP. I think vector clocks
with client merge give you essentially the same functionality, but in a way
that fits much more nicely with the rest of the Cassandra architecture. CAS
really exacerbates Cassandra's weaknesses.

Mike

On Tue, Jun 22, 2010 at 4:52 PM, Rishi Bhardwaj <khichrishi@yahoo.com>wrote:

>
>
> S>: An *atomic* CAS is another beast and I see at least two difficulties:
>
> S>: 1) making it atomic locally: Cassandra's implementation is very much
> >multi-threaded. On a given node, while you're
> reading-comparing-and-swapping
> >on some column c, no other thread should be allowed to write c (even
> 'normal'
> >write). You would probably need to have specific column families where CAS
> is
> >allowed and for which all writes would be slower (since some locking would
> be
> >involved). Even then, making such locking efficient and right is not easy.
> But
> >in the end, local atomicity is quite probably the easy part.
>
> R: I am curious as to how does Cassandra handle two concurrent writes to
> the same column right now? Is there any locking on the write path to
> serialize two writes to the same column? If there is any locking then CAS
> can build on that. If there is no such locking then we could exclude normal
> writes from the synchronization/locking required for CAS. So the normal
> write path remains the same, and we let the client know that atomic CAS
> wouldn't work if normal writes are also happening on the same column values.
> In short a client should not mix normal writes with Atomic CAS for writing
> some column value. This will hopefully make things simpler.
>
> S:>2) making it atomic cluster-wide: data is replicated and an atomic CAS
> would
> >need to apply on the exact same column version in every node. Which, with
> >eventual consistency especially, is pretty hard to accomplish unless
> you're
> >locking the cluster (but that's what Cages/ZK do).
>
> R: For starters it would be great if atomic CAS could work for consistency
> level Quorum and ALL and not be supported for other consistency levels. Even
> for other consistency levels what would stop CAS to work? Why would one
> require cluster wide locking? I might be mistaken here but the atomic CAS
> operation would happen individually at all the replica nodes (either
> directly or through hinted writes) and would succeed or fail depending on
> the timestamp/version of the column at the replica. If we do Quorum reads
> and CAS writes then we can also be sure about consistency.
>
> S:>That being said, if you have a neat solution for efficient and
> distributed
> >atomic CAS that doesn't require rewriting 80% of Cassandra, I'm sure there
> >will be interest in that.
>
>
> R: That sounds great. I am definitely going to look into this and report
> back if I have a good solution.
>
>
> Thanks,
> Rishi
>
>
>
>
> ________________________________
> From: Sylvain Lebresne <sylvain@yakaz.com>
> To: dev@cassandra.apache.org
> Sent: Tue, June 22, 2010 1:21:51 AM
> Subject: Re: Atomic Compare and Swap
>
> On Mon, Jun 21, 2010 at 11:19 PM, Rishi Bhardwaj <khichrishi@yahoo.com>
> wrote:
> > I have read the post on cages and it is definitely very interesting. But
> > cages seems to be too coarse grained compared to an Atomic Compare and
> Swap
> > on Cassandra column value. Cages would makes sense when one wants to do
> > multiple atomic row, column updates. Also, I am not so sure about the
> > scalability when it comes to using zookeeper for keeping locks on
> Cassandra
> > columns... there would also be performance hit with an added RPC for
> every
> > write. I feel Cages maybe fine for systems when one has few locks but I
> feel
> > an atomic CAS in Cassandra would help us avoid distributed locking
> systems
> > and zookeeper in many other simpler scenarios. For more complicated
> > (transaction like) things, using Cages may be fine. Then again doing a
> read
> > before write for CAS in cassandra will make CAS at least as slow as a
> read,
> > which I believe will still be better than taking a single column lock
> from
> > zookeeper.
> >
> > What do other folks think in this regard? From whatever I have read, I
> > believe CAS is feasible in Cassandra without hurting the normal write
> path
> > performance. Only for CAS writes would we have to pay for the read before
> > write penalty. I am going to do feasibility study for this and would love
> > any pointers from others about this.
>
> Making a (non atomic) CAS is easy (doing it client side is fine, and there
> has been some discussion about 'callbacks' that may or may not someday
> allow
> to do that server-side).
>
> An *atomic* CAS is another beast and I see at least two difficulties:
>
> 1) making it atomic locally: Cassandra's implementation is very much
> multi-threaded. On a given node, while you're
> reading-comparing-and-swapping
> on some column c, no other thread should be allowed to write c (even
> 'normal'
> write). You would probably need to have specific column families where CAS
> is
> allowed and for which all writes would be slower (since some locking would
> be
> involved). Even then, making such locking efficient and right is not easy.
> But
> in the end, local atomicity is quite probably the easy part.
>
> 2) making it atomic cluster-wide: data is replicated and an atomic CAS
> would
> need to apply on the exact same column version in every node. Which, with
> eventual consistency especially, is pretty hard to accomplish unless you're
> locking the cluster (but that's what Cages/ZK do).
>
> That being said, if you have a neat solution for efficient and distributed
> atomic CAS that doesn't require rewriting 80% of Cassandra, I'm sure there
> will be interest in that.
>
> --
> Sylvain
>
> >
> > Thanks,
> > Rishi
> >
> >
> >
> > ________________________________
> > From: Rauan Maemirov <rauan@maemirov.com>
> > To: dev@cassandra.apache.org
> > Sent: Mon, June 21, 2010 11:27:02 AM
> > Subject: Re: Atomic Compare and Swap
> >
> > Have you read this post?
> >
> http://ria101.wordpress.com/2010/05/12/locking-and-transactions-over-cassandra-using-cages/
> > I guess, you will like it.
> >
> > 2010/6/22 Rishi Bhardwaj <khichrishi@yahoo.com>
> >
> >> I am definitely interested in taking this work up. I believe the CAS
> >> functionality would help in a lot of different scenarios and could help
> >> avoid use of other external services (like zookeeper) to provide similar
> >> functionality. I am new at Cassandra development and would really
> appreciate
> >> pointers from the dev. community about how to approach/start on this
> >> project. Also how feasible is the approach mentioned below to implement
> the
> >> CAS functionality? It would be great if we could have a discussion on
> the
> >> pros and cons.
> >>
> >> Thanks,
> >> Rishi
> >>
> >>
> >>
> >> ________________________________
> >> From: Sriram Srinivasan <sriram@malhar.net>
> >> To: dev@cassandra.apache.org
> >> Sent: Sun, June 20, 2010 9:47:37 PM
> >> Subject: Re: Atomic Compare and Swap
> >>
> >>
> >> I too am interested in a CAS facility.
> >>
> >> I like Rishi's proposal. One could simply use a version number as the
> >> logical timestamp. If we promote CAS to a consistency level, it would
> rate
> >> higher than a quorum. One pays the price for a more complex write path
> to
> >> obtain the requisite guarantee.
> >>
> >>
> >> On Jun 21, 2010, at 4:03 AM, Rishi Bhardwaj wrote:
> >>
> >> >
> >> > Heres another thought I had, if say the user always wrote with quorum
> (or
> >> to all) nodes then can't we implement CAS (compare and swap) assuming
> that
> >> user employs logical timestamp and Cassandra doesn't allow writes to a
> >> column with same or older timestamp. Here's the scenario I am thinking
> >> about:
> >> > Say we use logical timestamp for a column value and lets assume the
> >> current timestamp is t. Now say two clients read this column and
> generate
> >> concurrent CAS (compare and swap) operations on timestamp t and for both
> the
> >> writes the resulting new timestamp would become (t+1). Now if we don't
> allow
> >> writes to a column with same timestamp then only one of these writes
> would
> >> succeed. Of course another assumption is that if a third CAS write with
> >> compare on logical timestamp (t - 1) came in, that would be denied as I
> >> believe Cassandra doesn't allow "older" writes to win over "newer"
> writes.
> >> Do you think such a thing can be accomplished?
> >>
> >>
> >>
> >>
>
>
>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message