However, with your design there are still corner cases where 2 consumers read from the same queue. Reading and writing at QUORUM does not prevent race conditions. I believe the new CAS feature of C* 2.0 might be useful here, but at the expense of reduced throughput (because of the Paxos round).
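To make the CAS idea concrete, here is a minimal sketch of a "claim only if nobody has claimed yet" operation. It uses an in-memory stand-in, not a real Cassandra client: FakeRow, claim_if_absent and race are hypothetical names invented for illustration, and the lock merely plays the role that the Paxos round would play in C* 2.0.

```python
import threading


class FakeRow:
    """In-memory stand-in for one queue row; NOT a Cassandra client."""

    def __init__(self):
        self._lock = threading.Lock()
        self.owner = None

    def claim_if_absent(self, consumer_id):
        # Atomic check-and-set: the claim succeeds only if no owner is set yet.
        with self._lock:
            if self.owner is None:
                self.owner = consumer_id
                return True
            return False


def race(row, consumer_ids):
    """Let every consumer attempt the claim concurrently; return the winners."""
    winners = []

    def attempt(consumer_id):
        if row.claim_if_absent(consumer_id):
            winners.append(consumer_id)

    threads = [threading.Thread(target=attempt, args=(cid,)) for cid in consumer_ids]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return winners
```

However many consumers race, exactly one claim succeeds. That is the guarantee plain QUORUM reads and writes do not give you, and the serialization point is also where the throughput cost comes from.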

I have actually been building something similar in my spare time. You can hang around and wait for it or build your own. Here are the basics. Not perfect, but it will work.
Create column family queue with gc_grace_period=[1 day]

set queue [timeuuid()] ["z"+timeuuid()] = [work to do]

The producer can decide how it wants to roll over the row key and the column key; it does not matter.

Suppose there are N consumers. We need a way for the consumers to not do the same work. We can use something like the bakery algorithm. Remember that a read at QUORUM sees writes made at QUORUM.

A consumer needs an identifier (it could be another uuid or an ip address).
A consumer calls get_range_slice on the queue; the slice is from new byte[] to new byte[], limit 100.

The consumer sees data like this.

[1234] [z-$timeuuid] = data

Now we register that this consumer wants to consume this queue
set [1234] [a-$ip] at quorum

Now we do a slice
get_slice [1234] from new byte[] to 'b'

There are a few possible returns.
1) 1 bidder...
[1234] [a-$myip]
You won, start consuming.

2) 2 bidders
[1234] [a-$myip]
[1234] [a-$otherip]
Compare $myip vs $otherip; the higher one wins.

Whoever wins can then start consuming the columns in the queue and delete them when done.
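The steps above can be sketched roughly as follows. This is an in-memory simulation, not client code: FakeQueueStore, bid, won and consume are made-up names, the empty string stands in for the empty byte array as an unbounded slice edge, and ids are compared as plain strings.

```python
class FakeQueueStore:
    """In-memory stand-in for the queue column family (row -> named columns)."""

    def __init__(self):
        self.rows = {}

    def set(self, row_key, column, value=b""):
        self.rows.setdefault(row_key, {})[column] = value

    def get_slice(self, row_key, start, end):
        # Sorted columns with start <= name <= end; "" means unbounded on that side.
        cols = self.rows.get(row_key, {})
        return [
            (name, cols[name])
            for name in sorted(cols)
            if name >= start and (end == "" or name <= end)
        ]

    def delete(self, row_key, column):
        self.rows.get(row_key, {}).pop(column, None)


def bid(store, row_key, my_id):
    """Register interest in the row; 'a-' columns sort before the 'z' work columns."""
    store.set(row_key, "a-" + my_id)


def won(store, row_key, my_id):
    """Read all bids back (slice from empty to 'b'); the highest id wins."""
    bidders = [name[2:] for name, _ in store.get_slice(row_key, "", "b")]
    return max(bidders) == my_id


def consume(store, row_key):
    """Winner only: read the work columns, delete them, and return their values."""
    work = store.get_slice(row_key, "z", "")
    for name, _ in work:
        store.delete(row_key, name)
    return [value for _, value in work]
```

A consumer would call bid, then won, and only call consume if won returned True. The sketch does not handle a bidder with a higher id showing up after the winner has already started, which is one of the ways this scheme is not perfect.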

> Thanks Nat for your ideas.
>> This could be as simple as adding year and month to the primary key (in the form 'yyyymm'). Alternatively, you could add this in the partition in the definition. Either way, it then becomes pretty easy to re-generate these based on the query parameters.
>
> The thing is that it's not that simple. My customer has a very BAD idea, using Cassandra as a queue (the perfect anti-pattern ever).
> Before trying to tell them to redesign their entire architecture and put in some queueing system like ActiveMQ or something similar, I would like to see how I can use wide rows to meet the requirements.
> The functional need is quite simple:
> 1) A process A loads users into Cassandra and sets the status on this user to be 'TODO'. When using the bucketing technique, we can limit a row width to, let's say 100 000 columns. So at the end of the current row, process A knows that it should move to next bucket. Bucket is coded using composite partition key, in our example it would be 'TODO:1', 'TODO:2' .... etc
>
> 2) A process B reads the wide row for 'TODO' status. It starts at bucket 1 so it will read row with partition key 'TODO:1'. The users are processed and inserted in a new row 'PROCESSED:1' for example to keep track of the status. After retrieving 100 000 columns, it will switch automatically to the next bucket. Simple. Fair enough
>
> 3) Now what sucks is that sometimes process B does not have enough data to perform functional logic on the user it fetched from the wide row, so it has to REPUT some users back into the 'TODO' status rather than transitioning to 'PROCESSED' status. That's exactly a queue behavior.
> A simplistic idea would be to insert again those m users with 'TODO:n', with n higher than the current bucket number so it can be processed later. But then it screws up all the counting system. Process A which inserts data will not know that there are already m users in row n, so will happily add 100 000 columns, making the row size grow to 100 000 + m. When process B reads back again this row, it will stop at the first 100 000 columns and skip the trailing m elements.
> That's the main reason for which I dropped the idea of bucketing (which is quite smart in normal case) to trade for ultra wide row.
> Anyway, I'll follow your advice and play around with the parameters of SizeTiered
> Regards
> Duy Hai DOAN
>
>>>
>>> The only drawback for ultra wide row I can see is point 1). But if I use leveled compaction with a sufficiently large value for "sstable_size_in_mb" (let's say 200Mb), will my read performance be impacted as the row grows ?
>>
>> For this use case, you would want to use SizeTieredCompaction and play around with the configuration a bit to keep a small number of large SSTables. Specifically: keep min|max_threshold really low, set bucket_low and bucket_high closer together maybe even both to 1.0, and maybe a larger min_sstable_size.
>> YMMV though - per Rob's suggestion, take the time to run some tests tweaking these options.
>>
>>>
>>> Of course, splitting wide row into several rows using bucketing technique is one solution but it forces us to keep track of the bucket number and it's not convenient. We have one process (jvm) that inserts data and another process (jvm) that reads data. Using bucketing, we need to synchronize the bucket number between the 2 processes.
>>
>> This could be as simple as adding year and month to the primary key (in the form 'yyyymm'). Alternatively, you could add this in the partition in the definition. Either way, it then becomes pretty easy to re-generate these based on the query parameters.
>>
>>
>

