cassandra-client-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Oleg Dulin <oleg.du...@liquidanalytics.com>
Subject Re: Implementing a locking mechanism - RFC
Date Thu, 31 Jan 2013 14:23:53 GMT
The problem with using Cassandra for locking is that if you ever have read consistency issues
(which you will unless you use ALL) you will have inconsistent values.

In general I would avoid doing a read-before-write with Cassandra. I would come up with another
way to update my data. 

Regards,
Oleg Dulin
Please note my new office #: 732-917-0159

On Jan 31, 2013, at 9:19 AM, Daniel Godás <dgodas@gmail.com> wrote:

> Ok, I've done some reading. If I understood it correctly the idea
> would be to send messages to the queue that contain a transaction i.e.
> a list of CQL commands to be run atomically. When one of the consumers
> gets the message it can run the transaction atomically before allowing
> another consumer to get the next message. If this is correct then in
> order to handle cases in which I need to interleave code with the CQL
> statements e.g. to check retrieved values, I need to implement a
> protocol that uses the message queue as a locking mechanism. How is
> this better than using cassandra for locking? (using the algorithm I
> proposed or another one).
> 
> 2013/1/31 Oleg Dulin <oleg.dulin@liquidanalytics.com>:
>> This may help:
>> 
>> http://activemq.apache.org/how-do-distributed-queues-work.html
>> 
>> http://activemq.apache.org/topologies.html
>> 
>> http://activemq.apache.org/how-do-i-embed-a-broker-inside-a-connection.html
>> 
>> Although I would use ActiveMQ spring configuration, not write code. But the point
is -- you can have multiple processes participating in an ActiveMQ federation; you can configure
AMQ's fault tolerance profiles to your liking without having to set up a yet another server
with a single point of failure.
>> 
>> You have a single distributed queue. Each process has a writer consumer on that queue.
AMQ knows to load balance, only one consumer at a time gets to write. Instead of writing to
cassandra, you send your data item to the queue. The next available consumer gets the message
and writes it -- all in the order of messages on the queue, and only one consumer writer at
a time.
>> 
>> Regards,
>> Oleg Dulin
>> Please note my new office #: 732-917-0159
>> 
>> On Jan 31, 2013, at 8:11 AM, Daniel Godás <dgodas@gmail.com> wrote:
>> 
>>> Sounds good, I'll try it out. Thanks for the help.
>>> 
>>> 2013/1/31 Oleg Dulin <oleg.dulin@liquidanalytics.com>:
>>>> Use embedded amq brokers , no need set up any servers  . It literally is
>>>> one line of code to turn it on, and 5 lines of code to implement what you
>>>> want.
>>>> 
>>>> We have a cluster of servers writing to Cassandra this way and we are not
>>>> using any j2ee containers.
>>>> 
>>>> On Thursday, January 31, 2013, Daniel Godás wrote:
>>>> 
>>>>> Doesn't that require you to set up a server for the message queue and
>>>>> know it's address? That sort of defeats the purpose of having a
>>>>> database like cassandra in which all nodes are equal and there's no
>>>>> single point of failure.
>>>>> 
>>>>> 2013/1/31 Oleg Dulin <oleg.dulin@liquidanalytics.com <javascript:;>>:
>>>>>> Use a JMS message queue to send objects you want to write. Your writer
>>>>> process then will listen on this queue and write to Cassandra. This ensures
>>>>> that all writes happen in an orderly fashion, one batch at a time.
>>>>>> 
>>>>>> I suggest ActiveMQ. It is easy to set up. This is what we use for
this
>>>>> type of a use case. No need to overcomplicate this with Cassandra.
>>>>>> 
>>>>>> 
>>>>>> Regards,
>>>>>> Oleg Dulin
>>>>>> Please note my new office #: 732-917-0159
>>>>>> 
>>>>>> On Jan 31, 2013, at 6:35 AM, Daniel Godás <dgodas@gmail.com<javascript:;>>
>>>>> wrote:
>>>>>> 
>>>>>>> Hi all,
>>>>>>> 
>>>>>>> I need a locking mechanism on top of cassandra so that multiple
>>>>>>> clients can protect a critical section. I've seen some attempts,
>>>>>>> including Dominic Williams' wait chain algorithm but I think
it can be
>>>>>>> simplified. This is the procedure I wrote to implement a simple
mutex.
>>>>>>> Note that it hasn't been thoroughly tested and I have been using
>>>>>>> cassandra for a very short time so I'd appreciate any comments
on
>>>>>>> obvious errors or things I'm doing plain wrong and will never
work.
>>>>>>> 
>>>>>>> The assumptions and requirements for the algorithm are the same
as
>>>>>>> Dominic Williams'
>>>>>>> (
>>>>> http://media.fightmymonster.com/Shared/docs/Wait%20Chain%20Algorithm.pdf).
>>>>>>> 
>>>>>>> We will create a column family for the locks referred to as "locks"
>>>>>>> throughout this procedure. The column family contains two columns;
an
>>>>>>> identifier for the lock  which will also be the column key ("id")
and
>>>>>>> a counter ("c"). Throughout the procedure "my_lock_id" will be
used as
>>>>>>> the lock identifier. An arbitrary time-to-live value is required
by
>>>>>>> the algorithm. This value will be referred to as "t". Choosing
an
>>>>>>> appropriate value for "t" will be postponed until the algorithm
is
>>>>>>> deemed good.
>>>>>>> 
>>>>>>> === begin procedure ===
>>>>>>> 
>>>>>>> (A) When a client needs to access the critical section the following
>>>>>>> steps are taken:
>>>>>>> 
>>>>>>> --- begin ---
>>>>>>> 
>>>>>>> 1) SELECT c FROM locks WHERE id = my_lock_id
>>>>>>> 2) if c = 0 try to acquire the lock (B), else don't try (C)
>>>>>>> 
>>>>>>> --- end ---
>>>>>>> 
>>>>>>> (B) Try to acquire the lock:
>>>>>>> 
>>>>>>> --- begin ---
>>>>>>> 
>>>>>>> 1) UPDATE locks USING TTL t SET c = c + 1 WHERE id = my_lock_id
>>>>>>> 2) SELECT c FROM locks WHERE id = my_lock_id
>>>>>>> 3) if c = 1 we acquired the lock (D), else we didn't (C)
>>>>>>> 
>>>>>>> --- end ---
>>>>>>> 
>>>>>>> (C) Wait before re-trying:
>>>>>>> 
>>>>>>> --- begin ---
>>>>>>> 
>>>>>>> 1) sleep for a random time higher than t and start at (A) again
>>>>>>> 
>>>>>>> --- end ---
>>>>>>> 
>>>>>>> (D) Execute the critical section and release the lock:
>>>>>>> 
>>>>>>> --- begin ---
>>>>>>> 
>>>>>>> 1) start background thread that increments c with TTL = t every
t / 2
>>>>>>> interval (UPDATE locks USING TTL t SET c = c + 1 WHERE id =
>>>>>>> my_lock_id)
>>>>>>> 2) execute the critical section
>>>>>>> 3) kill background thread
>>>>>>> 4) DELETE * FROM locks WHERE id = my_lock_id
>>>>>>> 
>>>>>>> --- end ---
>>>>>>> 
>>>>>>> === end procedure ===
>>>>>>> 
>>>>>>> Looking forward to read your comments,
>>>>>>> Dan
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>>> --
>>>> Sent from Gmail Mobile
>> 


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message