cassandra-user mailing list archives

From Anuj Wadehra <anujw_2...@yahoo.co.in>
Subject Re: Best Practise for Updating Index and Reporting Tables
Date Sat, 25 Jul 2015 11:07:31 GMT
Hi Robert,


I think I have a fair understanding of atomic batches. By "synchronously", I meant the client making
a blocking atomic batch call that executes all table updates in one go. I understand that the individual
statements will be executed at the configured consistency level.


I want to know how people generally handle the scenario where there is just one transaction table,
but there is the overhead of updating multiple manually created index tables and reporting tables.


Do they go with atomic batches which have some performance cost 

OR

They just update the transaction table, and the responsibility for keeping the transaction table
consistent with the index and reporting tables lies with the client, so transaction table updates
do not block until the other index/reporting tables are updated?

OR 

There are better ways to deal with the scenario where you have one transaction table, 3 index tables
and 2 reporting tables?


I think atomic batches move the headache of "atomicity" to the server, so data consistency
can be maintained. I am concerned about the cost and any negatives, especially when 5 extra writes are
required for every transaction table write.
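For concreteness, the batched option might look like the sketch below: one logged (atomic) batch covering the transaction table, the 3 index tables and the 2 reporting tables. All table and column names here are invented for illustration. The performance cost comes from the coordinator first writing the batch to the batchlog before applying any of the statements.

```cql
-- Hypothetical sketch: a single logged batch over all six tables.
BEGIN BATCH
  INSERT INTO txn           (txn_id, user_id, txn_ts, amount)  VALUES (?, ?, ?, ?);
  INSERT INTO txn_by_user   (user_id, txn_id)                  VALUES (?, ?);
  INSERT INTO txn_by_date   (txn_day, txn_id)                  VALUES (?, ?);
  INSERT INTO txn_by_status (status, txn_id)                   VALUES (?, ?);
  INSERT INTO daily_report   (txn_day, txn_id, amount)         VALUES (?, ?, ?);
  INSERT INTO monthly_report (txn_month, txn_id, amount)       VALUES (?, ?, ?);
APPLY BATCH;
```

Note that a logged batch guarantees all six writes eventually apply, but not that a reader sees them at the same moment.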


Thanks

Anuj


Sent from Yahoo Mail on Android

From:"Robert Wille" <rwille@fold3.com>
Date:Fri, 24 Jul, 2015 at 12:03 am
Subject:Re: Best Practise for Updating Index and Reporting Tables

My guess is that you don't understand what an atomic batch is, given that you used the phrase
"updated synchronously". Atomic batches do not provide isolation, and do not guarantee
immediate consistency. The only thing an atomic batch guarantees is that all of the statements
in the batch will eventually be executed. Both approaches are eventually consistent, so you
have to deal with inconsistency either way. 


On Jul 23, 2015, at 11:46 AM, Anuj Wadehra <anujw_2003@yahoo.co.in> wrote:


We have a transaction table, 3 manually created index tables and a few tables for reporting.



One option is to go for atomic batch mutations so that for each transaction, every index table
and the other reporting tables are updated synchronously.


The other option is to update the other tables asynchronously; there may be consistency issues if some
mutations drop under load or a node goes down. The logic for rolling back or retrying idempotent updates
would then live at the client.
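The asynchronous option relies on the updates being idempotent, so a dropped mutation can simply be retried. A minimal sketch of that client-side logic, using a hypothetical `FakeSession` in place of a real Cassandra driver session (table names are invented):

```python
class FakeSession:
    """Stands in for a driver session; records writes per table and can
    simulate one dropped mutation per named table."""
    def __init__(self, fail_first=()):
        self.tables = {}
        self._fail_once = set(fail_first)

    def execute(self, table, key, value):
        if table in self._fail_once:
            self._fail_once.discard(table)   # fail only the first attempt
            raise TimeoutError(f"write to {table} timed out")
        self.tables.setdefault(table, {})[key] = value


def write_transaction(session, key, value, other_tables, max_retries=3):
    # 1. Blocking write to the transaction table: this must succeed first.
    session.execute("txn", key, value)
    # 2. Best-effort writes to index/reporting tables. Retrying is safe
    #    because each write is idempotent (same key, same value).
    failed = []
    for table in other_tables:
        for _attempt in range(max_retries):
            try:
                session.execute(table, key, value)
                break
            except TimeoutError:
                continue
        else:
            failed.append(table)
    return failed   # anything left here needs repair or an alert


session = FakeSession(fail_first={"txn_by_user"})
failed = write_transaction(session, "txn-1", "data",
                           ["txn_by_user", "txn_by_date", "daily_report"])
```

In this run the first write to `txn_by_user` times out, the retry succeeds, and `failed` comes back empty; the transaction table and all secondary tables end up consistent.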


We don't have a persistent queue in the system yet. Even if we introduce one, so that the transaction
table is updated synchronously and the other updates are done asynchronously via the queue, we are concerned
about its throughput, as we see around 1000 tps in large clusters. We value consistency, but a small delay
in updating the index and reporting tables is acceptable.
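The queue-based variant could be sketched as below, with an in-memory `queue.Queue` standing in for a real persistent queue (all names are hypothetical). The transaction write returns immediately; a background worker drains the index/reporting updates, which is exactly the "small delay is acceptable" trade-off:

```python
import queue
import threading

txn_table = {}
index_tables = {"txn_by_user": {}, "txn_by_date": {}}
updates = queue.Queue()   # stand-in for a persistent queue


def write_transaction(key, value):
    txn_table[key] = value        # synchronous: caller sees this at once
    updates.put((key, value))     # index/reporting updates happen later


def index_worker():
    while True:
        item = updates.get()
        if item is None:          # sentinel: shut down the worker
            break
        key, value = item
        for table in index_tables.values():
            table[key] = value    # eventually consistent with txn_table
        updates.task_done()


worker = threading.Thread(target=index_worker)
worker.start()
write_transaction("txn-1", "data")
updates.join()                    # wait until the indexes have caught up
updates.put(None)
worker.join()
```

At ~1000 tps the real concern is whether the persistent queue itself can sustain that rate; the sketch only shows the decoupling, not the durability a real queue would add.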


Which design seems more appropriate?


Thanks

Anuj

Sent from Yahoo Mail on Android


