cassandra-client-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Daniel Godás <dgo...@gmail.com>
Subject Re: Implementing a locking mechanism - RFC
Date Thu, 31 Jan 2013 14:19:58 GMT
Ok, I've done some reading. If I understood it correctly the idea
would be to send messages to the queue that contain a transaction i.e.
a list of CQL commands to be run atomically. When one of the consumers
gets the message it can run the transaction atomically before allowing
another consumer to get the next message. If this is correct then in
order to handle cases in which I need to interleave code with the CQL
statements e.g. to check retrieved values, I need to implement a
protocol that uses the message queue as a locking mechanism. How is
this better than using cassandra for locking? (using the algorithm I
proposed or another one).

2013/1/31 Oleg Dulin <oleg.dulin@liquidanalytics.com>:
> This may help:
>
> http://activemq.apache.org/how-do-distributed-queues-work.html
>
> http://activemq.apache.org/topologies.html
>
> http://activemq.apache.org/how-do-i-embed-a-broker-inside-a-connection.html
>
> Although I would use ActiveMQ spring configuration, not write code. But the point is
-- you can have multiple processes participating in an ActiveMQ federation; you can configure
AMQ's fault tolerance profiles to your liking without having to set up a yet another server
with a single point of failure.
>
> You have a single distributed queue. Each process has a writer consumer on that queue.
AMQ knows to load balance, only one consumer at a time gets to write. Instead of writing to
cassandra, you send your data item to the queue. The next available consumer gets the message
and writes it -- all in the order of messages on the queue, and only one consumer writer at
a time.
>
> Regards,
> Oleg Dulin
> Please note my new office #: 732-917-0159
>
> On Jan 31, 2013, at 8:11 AM, Daniel Godás <dgodas@gmail.com> wrote:
>
>> Sounds good, I'll try it out. Thanks for the help.
>>
>> 2013/1/31 Oleg Dulin <oleg.dulin@liquidanalytics.com>:
>>> Use embedded amq brokers , no need set up any servers  . It literally is
>>> one line of code to turn it on, and 5 lines of code to implement what you
>>> want.
>>>
>>> We have a cluster of servers writing to Cassandra this way and we are not
>>> using any j2ee containers.
>>>
>>> On Thursday, January 31, 2013, Daniel Godás wrote:
>>>
>>>> Doesn't that require you to set up a server for the message queue and
>>>> know it's address? That sort of defeats the purpose of having a
>>>> database like cassandra in which all nodes are equal and there's no
>>>> single point of failure.
>>>>
>>>> 2013/1/31 Oleg Dulin <oleg.dulin@liquidanalytics.com <javascript:;>>:
>>>>> Use a JMS message queue to send objects you want to write. Your writer
>>>> process then will listen on this queue and write to Cassandra. This ensures
>>>> that all writes happen in an orderly fashion, one batch at a time.
>>>>>
>>>>> I suggest ActiveMQ. It is easy to set up. This is what we use for this
>>>> type of a use case. No need to overcomplicate this with Cassandra.
>>>>>
>>>>>
>>>>> Regards,
>>>>> Oleg Dulin
>>>>> Please note my new office #: 732-917-0159
>>>>>
>>>>> On Jan 31, 2013, at 6:35 AM, Daniel Godás <dgodas@gmail.com<javascript:;>>
>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I need a locking mechanism on top of cassandra so that multiple
>>>>>> clients can protect a critical section. I've seen some attempts,
>>>>>> including Dominic Williams' wait chain algorithm but I think it can
be
>>>>>> simplified. This is the procedure I wrote to implement a simple mutex.
>>>>>> Note that it hasn't been thoroughly tested and I have been using
>>>>>> cassandra for a very short time so I'd appreciate any comments on
>>>>>> obvious errors or things I'm doing plain wrong and will never work.
>>>>>>
>>>>>> The assumptions and requirements for the algorithm are the same as
>>>>>> Dominic Williams'
>>>>>> (
>>>> http://media.fightmymonster.com/Shared/docs/Wait%20Chain%20Algorithm.pdf).
>>>>>>
>>>>>> We will create a column family for the locks referred to as "locks"
>>>>>> throughout this procedure. The column family contains two columns;
an
>>>>>> identifier for the lock  which will also be the column key ("id")
and
>>>>>> a counter ("c"). Throughout the procedure "my_lock_id" will be used
as
>>>>>> the lock identifier. An arbitrary time-to-live value is required
by
>>>>>> the algorithm. This value will be referred to as "t". Choosing an
>>>>>> appropriate value for "t" will be postponed until the algorithm is
>>>>>> deemed good.
>>>>>>
>>>>>> === begin procedure ===
>>>>>>
>>>>>> (A) When a client needs to access the critical section the following
>>>>>> steps are taken:
>>>>>>
>>>>>> --- begin ---
>>>>>>
>>>>>> 1) SELECT c FROM locks WHERE id = my_lock_id
>>>>>> 2) if c = 0 try to acquire the lock (B), else don't try (C)
>>>>>>
>>>>>> --- end ---
>>>>>>
>>>>>> (B) Try to acquire the lock:
>>>>>>
>>>>>> --- begin ---
>>>>>>
>>>>>> 1) UPDATE locks USING TTL t SET c = c + 1 WHERE id = my_lock_id
>>>>>> 2) SELECT c FROM locks WHERE id = my_lock_id
>>>>>> 3) if c = 1 we acquired the lock (D), else we didn't (C)
>>>>>>
>>>>>> --- end ---
>>>>>>
>>>>>> (C) Wait before re-trying:
>>>>>>
>>>>>> --- begin ---
>>>>>>
>>>>>> 1) sleep for a random time higher than t and start at (A) again
>>>>>>
>>>>>> --- end ---
>>>>>>
>>>>>> (D) Execute the critical section and release the lock:
>>>>>>
>>>>>> --- begin ---
>>>>>>
>>>>>> 1) start background thread that increments c with TTL = t every t
/ 2
>>>>>> interval (UPDATE locks USING TTL t SET c = c + 1 WHERE id =
>>>>>> my_lock_id)
>>>>>> 2) execute the critical section
>>>>>> 3) kill background thread
>>>>>> 4) DELETE * FROM locks WHERE id = my_lock_id
>>>>>>
>>>>>> --- end ---
>>>>>>
>>>>>> === end procedure ===
>>>>>>
>>>>>> Looking forward to read your comments,
>>>>>> Dan
>>>>>
>>>>
>>>
>>>
>>> --
>>> Sent from Gmail Mobile
>

Mime
View raw message