Hi All,

Can someone please validate the design below and recommend a solution?

Problem statement: We need to dequeue data from Cassandra (from a standard ColumnFamily) using a job. Multiple instances of the job can run simultaneously (much like multiple threads) and may try to access the same row, but only one instance may access a given row at a time. For example, while job A is accessing Row #1, job B must not be able to access Row #1.

Possible solutions:

Solution #1: Use Cages (with ZooKeeper) to make sure that only one job at a time can access a row in the CF. How do we make sure that Cages (a transaction coordinator built on ZooKeeper) is not a single point of failure? What is the performance impact on reads and writes on the nodes? There is a blog post on building a distributed concurrent queue with ZooKeeper at http://www.cloudera.com/blog/2009/05/building-a-distributed-concurrent-queue-with-apache-zookeeper/
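
To make Solution #1 more concrete, here is a rough sketch of the ordering idea behind ZooKeeper's documented lock recipe: each contender creates an ephemeral sequential node under a per-row path, and the contender with the lowest sequence number holds the lock. This is only an in-memory model in Python, not ZooKeeper or Cages code, and I have not checked how Cages implements its locks. The names RowLockQueue, enqueue, holds_lock and leave are made up for illustration.

```python
import itertools
import threading


class RowLockQueue:
    """In-memory stand-in for per-row ZooKeeper lock nodes (hypothetical)."""

    def __init__(self):
        self._guard = threading.Lock()
        self._seq = itertools.count()
        self._waiters = {}  # row_key -> {sequence number: job_id}

    def enqueue(self, row_key, job_id):
        """Register a contender, like creating an ephemeral sequential node."""
        with self._guard:
            seq = next(self._seq)
            self._waiters.setdefault(row_key, {})[seq] = job_id
            return seq

    def holds_lock(self, row_key, seq):
        """The contender with the lowest sequence number owns the row."""
        with self._guard:
            waiters = self._waiters.get(row_key, {})
            return bool(waiters) and min(waiters) == seq

    def leave(self, row_key, seq):
        """Drop a contender, like session expiry deleting an ephemeral node."""
        with self._guard:
            waiters = self._waiters.get(row_key, {})
            waiters.pop(seq, None)
            if not waiters:
                self._waiters.pop(row_key, None)
```

In this model, job A and job B would both call enqueue("row1", ...), only the one with the lower sequence number would see holds_lock return True, and the other would take over after the first calls leave. In real ZooKeeper the ephemeral node disappears when the owning session ends, so a crashed job does not hold the row forever.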

Solution #2: Use a home-grown approach to record which job is accessing which row.
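
A minimal sketch of what I mean by Solution #2, in Python and purely in memory: a registry that maps each row key to the job that has claimed it, with a lease so that a crashed job's claim eventually expires. All names here (RowClaimRegistry, try_claim, release, lease_seconds) are hypothetical. The big assumption is that the check-and-set in try_claim is atomic; a dict guarded by a lock gives that inside one process, but a separate read followed by a write against a shared store would be a race between jobs on different machines.

```python
import threading
import time


class RowClaimRegistry:
    """Hypothetical registry of which job currently owns which row."""

    def __init__(self, lease_seconds=30.0, clock=time.monotonic):
        self._lease_seconds = lease_seconds
        self._clock = clock
        self._guard = threading.Lock()
        self._claims = {}  # row_key -> (job_id, expiry time)

    def try_claim(self, row_key, job_id):
        """Return True if job_id now owns row_key, False if another job does."""
        now = self._clock()
        with self._guard:
            owner = self._claims.get(row_key)
            if owner is not None and owner[0] != job_id and owner[1] > now:
                return False  # another job holds an unexpired claim
            # Free, expired, or already ours: (re)claim and refresh the lease.
            self._claims[row_key] = (job_id, now + self._lease_seconds)
            return True

    def release(self, row_key, job_id):
        """Give up the claim, but only if job_id is still the owner."""
        with self._guard:
            owner = self._claims.get(row_key)
            if owner is not None and owner[0] == job_id:
                del self._claims[row_key]
```

A job would call try_claim(row_key, job_id) before touching a row, skip the row if it returns False, and call release when done. The lease length is a guess that would need tuning against how long a job normally spends on one row.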

Are there any other solutions to the above problem?  

Can someone please help me validate the design?

Mubarak Seyed.