cassandra-user mailing list archives

From Sergey Tryuber <>
Subject Re: Replication factor 2, consistency and failover
Date Wed, 12 Sep 2012 08:26:40 GMT
Aaron, thank you! Your message was exactly what we wanted to see: that we
hadn't missed anything critical. We'll share our Astyanax patch in the

On 10 September 2012 03:44, aaron morton <> wrote:

> > In general we want to achieve strong consistency.
> You need to have R + W > N.
> > ...writes with LOCAL_QUORUM and reads with ONE.
> That gives you 2 + 1 > 2 when you use it. When you drop back to ONE / ONE
> you no longer have strong consistency.
> > ...maybe get advice on how to improve it.
> Sounds like you know how to improve it :)
> Things you could play with:
> * hinted_handoff_throttle_delay_in_ms in the YAML, to reduce the time it
> takes for hinted handoff to deliver the messages.
> * increase the read_repair_chance for the CFs. This will increase the
> chance of RR repairing an inconsistency behind the scenes, so the next read
> is consistent. It will also increase the IO load on the system.
> With the RF 2 restriction you are probably doing the best you can. You are
> giving up consistency for availability and partition tolerance. The best
> thing to do is to get peeps to agree that "we will accept reduced
> consistency for high availability" rather than say "in general we want to
> achieve strong consistency".
> Hope that helps.
>   -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> On 9/09/2012, at 9:09 PM, Sergey Tryuber <> wrote:
> Hi
> We have to use Cassandra with RF=2 (don't ask why...). There are two
> datacenters (RF=2 in each datacenter). Also we use Astyanax as a client
> library. In general we want to achieve strong consistency. Read performance
> is important for us, that's why we perform writes with LOCAL_QUORUM and
> reads with ONE. If one server is down, we automatically switch to
> Writes.ONE, Reads.ONE only for that replica which has failed node (we
> modified Astyanax to achieve that). When the server comes back, we switch
> back to Writes.LOCAL_QUORUM and Reads.ONE, but, of course, we see some
> inconsistencies during the switching process and for some time after
> (while hinted handoff delivers the missed writes).
> Basically I don't have any questions; I just want to share our "ugly"
> failover algorithm, hear your criticism, and maybe get advice on how to
> improve it. Unfortunately we can't change the replication factor, and most
> of the time we have to read with consistency level ONE (because we have
> strict requirements on read performance).
> Thank you!
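Aaron's R + W > N rule from the thread can be checked mechanically. The sketch below is illustrative only: the helper names are ours, not part of Cassandra or Astyanax, and it covers just the consistency levels discussed here (RF is the per-datacenter replica count).

```python
# Sketch of the R + W > N strong-consistency rule for the RF=2 case
# discussed in the thread. Helper names are hypothetical, not a real API.

def replicas_for(cl, rf):
    """Replicas that must respond for a given consistency level."""
    if cl == "ONE":
        return 1
    if cl in ("QUORUM", "LOCAL_QUORUM"):
        return rf // 2 + 1  # majority of replicas
    if cl == "ALL":
        return rf
    raise ValueError("unsupported consistency level: %s" % cl)

def is_strongly_consistent(write_cl, read_cl, rf):
    """True when every read overlaps at least one up-to-date replica."""
    return replicas_for(write_cl, rf) + replicas_for(read_cl, rf) > rf

# Normal mode: LOCAL_QUORUM writes (2) + ONE reads (1) > RF 2 -> consistent
print(is_strongly_consistent("LOCAL_QUORUM", "ONE", 2))  # True
# Failover mode: ONE/ONE gives 1 + 1 = 2, which is not > 2 -> not consistent
print(is_strongly_consistent("ONE", "ONE", 2))           # False
```

This is exactly the trade-off Aaron points out: the ONE/ONE fallback keeps the cluster available but gives up the read/write overlap that strong consistency requires.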
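The failover behaviour Sergey describes (dropping only the affected replica set to ONE/ONE while a node is down, then restoring LOCAL_QUORUM writes) might be sketched roughly as below. This is illustrative pseudologic under our own assumptions, not the actual Astyanax patch mentioned in the thread.

```python
# Rough sketch of the per-replica-set failover policy from the thread:
# degrade writes to ONE only for the replica set containing a failed node,
# restore LOCAL_QUORUM when it recovers. Class and method names are
# hypothetical, not part of Astyanax.

class FailoverPolicy:
    def __init__(self):
        self.down_replica_sets = set()

    def node_down(self, replica_set):
        self.down_replica_sets.add(replica_set)

    def node_up(self, replica_set):
        self.down_replica_sets.discard(replica_set)

    def write_cl(self, replica_set):
        # Degrade only the replica set that currently has a failed node.
        if replica_set in self.down_replica_sets:
            return "ONE"
        return "LOCAL_QUORUM"

    def read_cl(self, replica_set):
        return "ONE"  # reads stay at ONE for performance, as in the thread

policy = FailoverPolicy()
policy.node_down("rs1")
print(policy.write_cl("rs1"))  # ONE            (degraded)
print(policy.write_cl("rs2"))  # LOCAL_QUORUM   (unaffected)
policy.node_up("rs1")
print(policy.write_cl("rs1"))  # LOCAL_QUORUM   (restored)
```

The window of inconsistency the thread mentions corresponds to the transitions here: writes accepted at ONE during the outage only become visible everywhere once hinted handoff (or read repair) catches the other replica up.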
