cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Cluster temporarily split into segments
Date Sun, 26 Aug 2012 22:05:55 GMT
> using CL=ONE (read) and CL=ALL(write).
Using this setting you are saying the application should fail in the case of a network partition:
you are valuing Consistency over Availability when a partition occurs. Mixing the CL levels
in response to a partition will make it difficult to reason about the consistency of the data.
Consider other approaches, such as CL QUORUM.

While using CL ONE for reads looks good, if you require strong consistency you have to use
ALL for writes. QUORUM for reads and writes may be a better choice: using RF 3 and 6 nodes
would give you pretty good availability in the face of node failures (for background see http://thelastpickle.com/2011/06/13/Down-For-Me/
)

Or relax the consistency and use CL ONE for reads and CL QUORUM for writes. The writes are
still sent to RF nodes, but we can no longer guarantee reads will see them.
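The rule behind these choices can be sketched with a little arithmetic: a read is guaranteed to see the latest write only when the read and write replica sets must overlap, i.e. R + W > RF. A minimal sketch (the CL-to-replica mapping is standard Cassandra arithmetic for a single DC; the helper names are mine):

```python
# Replicas touched by each consistency level, assuming a single DC
# and RF = 6 (the setup in the original question).
RF = 6

def replicas(cl, rf=RF):
    levels = {"ONE": 1, "TWO": 2, "QUORUM": rf // 2 + 1, "ALL": rf}
    return levels[cl]

def strongly_consistent(read_cl, write_cl, rf=RF):
    # Reads overlap writes on at least one replica when R + W > RF.
    return replicas(read_cl, rf) + replicas(write_cl, rf) > rf

# The original setup: reads see every write, but one down node blocks writes.
print(strongly_consistent("ONE", "ALL"))        # True
# QUORUM/QUORUM: 4 + 4 = 8 > 6, still strongly consistent.
print(strongly_consistent("QUORUM", "QUORUM"))  # True
# The relaxed option: 1 + 4 = 5 <= 6, so reads may miss recent writes.
print(strongly_consistent("ONE", "QUORUM"))     # False
```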

If the high RF is for 
> Suppose that connectivity breaks down (for whatever reason) causing two isolated segments:
> S1 = {A,B,C,D} and S2 = {E,F}.

Do clients still have access to the entire cluster? Normally we would expect clients to try
different nodes until they either fail or find a partition with enough UP nodes to service
the request.

> to be able to write at all, the CL strategy definitely needs to be changed.
> In S1, for instance change to CL=QUORUM for both reads/writes
> In S2, CL(write) change to TWO/ONE/ANY. CL(read) may be changed to TWO
Whatever the choice, you can imagine a partition where the only thing that works for writes
is CL ONE, e.g. if the cluster splits 3/3, QUORUM (4 of the 6 replicas) would not work.
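To see why, count the replicas reachable on each side of the split. A sketch, assuming RF = 6 so every node holds every row (the helper name is mine):

```python
RF = 6
QUORUM = RF // 2 + 1  # 4 of 6 replicas

def writable_at(cl_replicas, up_nodes):
    # With RF = 6 every node is a replica, so a write at a given CL
    # succeeds only if that many replicas are reachable on this side.
    return up_nodes >= cl_replicas

# S1 = {A,B,C,D}, S2 = {E,F} from the question: only S1 can do QUORUM.
print(writable_at(QUORUM, 4), writable_at(QUORUM, 2))  # True False
# An even 3/3 split: QUORUM fails on both sides; only CL ONE still works.
print(writable_at(QUORUM, 3), writable_at(1, 3))       # False True
```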

> So now to the interesting question, what happens when S1 and S2 reestablish full connectivity
again ?
If you are using CL ALL for writes, the easiest thing to do is stop writing when the cluster
partitions and resume when it comes back.

If you drop the CL for writes during the partition, reads will be inconsistent until either
Hinted Handoff (HH) has finished or you run repair.
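Whether HH alone can cover the outage depends on max_hint_window_in_ms in cassandra.yaml (3 hours by default): hints stop being stored for a node that has been down longer than the window, so a longer outage means a repair is required. A sketch of that decision (the function name is mine):

```python
# Default max_hint_window_in_ms in cassandra.yaml is 3 hours.
MAX_HINT_WINDOW_MS = 3 * 60 * 60 * 1000

def repair_needed(outage_ms, hint_window_ms=MAX_HINT_WINDOW_MS):
    # Hints stop accumulating once a node has been down longer than
    # the window, so HH alone cannot restore consistency after that.
    return outage_ms > hint_window_ms

print(repair_needed(30 * 60 * 1000))      # False: 30 min outage, HH can catch up
print(repair_needed(4 * 60 * 60 * 1000))  # True: 4 h outage, run repair
```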


>  It is extremely important that reads will continue to operate in both S1 and S2
If it's important that reads continue and are consistent, I would look at RF 3 with QUORUM
/ QUORUM.

If it's important that reads continue and consistency can be relaxed, I would look at RF 3
(or 6) and read ONE / write QUORUM.

Hope that helps. 
  
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 24/08/2012, at 11:00 PM, Robert Hellmans <robert.hellmans@aastra.com> wrote:

> Hi !
>  
> I'm preparing the test below. I've found a lot of information about deadnode replacements
and adding extra nodes to increase capacity, but didn't find anything about this segmentation
issue. Anyone that can share experience/ideas?
>  
>  
> Setup:
> Cluster with 6 nodes {A,B,C,D,E,F}, RF=6, using CL=ONE (read) and CL=ALL(write).
>  
>  
> Suppose that connectivity breaks down (for whatever reason) causing two isolated segments:
> S1 = {A,B,C,D} and S2 = {E,F}.
>  
> Cluster connectivity anomalies will be detected by all nodes in this setup, so clients
in S1 and S2 can be advised
> to change their CL strategy. It is extremely important that reads will continue to operate
in both S1 and S2
> and I don't see any reason why they shouldn't. It is almost that important that writes
in each segment can continue, but
> to be able to write at all, the CL strategy definitely needs to be changed.
> In S1, for instance change to CL=QUORUM for both reads/writes
> In S2, CL(write) change to TWO/ONE/ANY. CL(read) may be changed to TWO
>  
> During the connectivity breakdown, clients in both S1 and S2 simultaneously change/add/delete
data.
>  
>  
>  
> So now to the interesting question, what happens when S1 and S2 reestablish full connectivity
again ?
> Again, the re-connectivity event will be detected, so should I trigger some special repair
sequence?
> Or should I have taken some action already when the connectivity broke?
> What about connectivity dropout time, longer/shorter than max_hint_window ?
>  
>  
>  
>  
> Rds /Robert
>  
>  
>  

