incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Multi Master replication : rejoining a node after split network
Date Thu, 19 Apr 2012 10:18:25 GMT
For background:
http://www.datastax.com/docs/1.0/cluster_architecture/index
http://thelastpickle.com/2011/02/07/Introduction-to-Cassandra/

> Which mechanism is used to replicate the changes from one system to another: statement
> distribution or recording the changeset via triggers or storing the changeset in transaction
> log?
Statement distribution is the closest to the truth, but we do not distribute statements.
Check the links above: the coordinator processes each request and sends messages to all the
replicas at the same time. In RDBMS terms it's akin to mirroring in SQL Server or Oracle.
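
To make the coordinator behaviour concrete, here is a rough toy model in Python. It is only
a sketch: the Replica class and coordinator_write function are made-up names for
illustration and are not Cassandra APIs. It shows the coordinator sending one mutation to
every replica in parallel and reporting success if enough of them acknowledge it.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


class Replica:
    """Toy in-memory replica; a hypothetical stand-in for a Cassandra node."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.data = {}

    def apply(self, key, value, timestamp):
        if not self.up:
            raise ConnectionError(f"{self.name} is unreachable")
        self.data[key] = (value, timestamp)
        return self.name


def coordinator_write(replicas, key, value, timestamp, required_acks):
    """Send the mutation to every replica at the same time.

    Returns (success, acked_names); success is True when at least
    required_acks replicas acknowledged the write.
    """
    acks = []
    with ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        futures = [pool.submit(r.apply, key, value, timestamp) for r in replicas]
        for future in as_completed(futures):
            try:
                acks.append(future.result())
            except ConnectionError:
                pass  # an unreachable replica simply does not acknowledge
    return len(acks) >= required_acks, sorted(acks)
```

With three replicas and one of them down, a write that needs two acknowledgements succeeds
and one that needs three does not. That is roughly the effect the Consistency Level has on
a request.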

> Since replication is continuous copying of changes from one node to another, these changes
> would have to be snapshotted in order to sustain temporary network failures so that replication
> can resume after the network problem is healed. Is there a mechanism to define how long we
> can store/archive the snapshotted changes before we discard them and would demand a recreation
> of the node from scratch rather than a rejoin?
Snapshotting is not used. Look at Consistency Level, Read Repair, Hinted Handoff and Repair,
which together bring a replica that missed writes back in sync.
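
As a rough illustration of the Hinted Handoff idea, the sketch below keeps a "hint" for a
replica that is down and replays it when the replica returns. Hints are only recorded
while the replica has been down for less than a configurable window; after that the node
has to be brought back in sync with repair. HintStore and its methods are hypothetical
names for this sketch, not Cassandra code. I believe the real setting is
max_hint_window_in_ms in cassandra.yaml, but check that against your version.

```python
class HintStore:
    """Toy hint store kept by a coordinator (hypothetical, for illustration)."""

    def __init__(self, max_hint_window_s):
        self.max_hint_window_s = max_hint_window_s
        self.hints = {}  # replica name -> list of (key, value, timestamp)

    def record(self, replica_name, down_for_s, mutation):
        """Store a hint unless the replica has been down longer than the window."""
        if down_for_s >= self.max_hint_window_s:
            return False  # too old: this write must be recovered by repair
        self.hints.setdefault(replica_name, []).append(mutation)
        return True

    def replay(self, replica_name, replica_data):
        """Deliver stored hints to a recovered replica; newer timestamps win."""
        delivered = 0
        for key, value, ts in self.hints.pop(replica_name, []):
            current = replica_data.get(key)
            if current is None or ts > current[1]:
                replica_data[key] = (value, ts)
            delivered += 1
        return delivered
```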

 
> What options are available for conflict resolution since we are talking about master-master
> replication across tens of nodes?
An int64 timestamp, specified by the client or by the server (when using CQL). By convention
it is microseconds since the epoch. When the same column has been written more than once, the
value with the highest timestamp wins.
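
A minimal sketch of timestamp-based resolution in Python follows. The function names are
made up for illustration. Each column is modelled as a (value, timestamp) pair, and two
copies of a row are merged column by column. The tie-breaking rule used here (comparing
values when timestamps are equal) is a simplifying assumption for the sketch.

```python
import time


def now_micros():
    """Microseconds since the Unix epoch, the conventional timestamp unit."""
    return time.time_ns() // 1000


def resolve(a, b):
    """Pick the winning (value, timestamp) pair: the highest timestamp wins.

    Ties are broken by comparing the values, which is an assumption made
    for this sketch.
    """
    return max(a, b, key=lambda col: (col[1], col[0]))


def merge_rows(row_a, row_b):
    """Merge two replicas' copies of a row, resolving each column separately."""
    merged = dict(row_a)
    for name, column in row_b.items():
        merged[name] = resolve(merged[name], column) if name in merged else column
    return merged
```

For example, merging {"name": ("a", 10), "email": ("e1", 5)} with
{"name": ("b", 7), "email": ("e2", 9)} keeps name "a" from the first copy and email "e2"
from the second, because resolution happens per column, not per row.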
 
> If a node is rejoined after a split network where the same records would have been modified
> on multiple nodes, is there a mechanism to merge the data, resolve conflicts and eventually
> reach a consistent state?
See above. It's all part of the Eventual Consistency world. nodetool repair is the final word
in repairing data, but the Consistency Level is what specifies the guarantee per request.
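
The per-request guarantee comes down to simple arithmetic: if the number of replicas that
acknowledge a write (W) plus the number consulted on a read (R) is greater than the
replication factor (RF), every read overlaps at least one replica that saw the latest
write. The helper names below are made up for illustration.

```python
def quorum(replication_factor):
    """Number of replicas in a quorum: a strict majority of the replicas."""
    return replication_factor // 2 + 1


def read_sees_latest_write(replication_factor, write_acks, read_replicas):
    """True when the read and write replica sets must overlap (R + W > RF)."""
    return write_acks + read_replicas > replication_factor
```

With RF = 3 a quorum is 2, so writing and reading at quorum gives 2 + 2 > 3 and the read
set always overlaps the write set. Reading and writing with a single replica each
(1 + 1 <= 3) does not, and such requests rely on Read Repair, Hinted Handoff and
nodetool repair to converge.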


The best way to learn is to jump in and play with it. 

Cheers


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 18/04/2012, at 3:36 AM, Samba wrote:

> Hi all,
> We are evaluating Cassandra for a geographically distributed deployment that requires
> multi-master replication.
> 
> We have a few questions regarding how replication is handled in Cassandra, like:
> 
> Which mechanism is used to replicate the changes from one system to another: statement
> distribution or recording the changeset via triggers or storing the changeset in transaction
> log?
> Since replication is continuous copying of changes from one node to another, these changes
> would have to be snapshotted in order to sustain temporary network failures so that replication
> can resume after the network problem is healed. Is there a mechanism to define how long we
> can store/archive the snapshotted changes before we discard them and would demand a recreation
> of the node from scratch rather than a rejoin?
> What options are available for conflict resolution since we are talking about master-master
> replication across tens of nodes?
> If a node is rejoined after a split network where the same records would have been modified
> on multiple nodes, is there a mechanism to merge the data, resolve conflicts and eventually
> reach a consistent state?
> Thanks and Regards,
> Samba

