cassandra-commits mailing list archives

From "T Jake Luciani (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-13442) Support a means of strongly consistent highly available replication with storage requirements approximating RF=2
Date Tue, 18 Apr 2017 02:25:41 GMT


T Jake Luciani commented on CASSANDRA-13442:

bq. This ticket is not about changing consistency levels and doesn't require applications
to change their usage of consistency levels to benefit.

7168 added a new CL as a way to opt in to this new feature. Once it's fully vetted, it would
be trivial to make Cassandra use it automatically when appropriate.

bq.  7168 also does not have reducing storage requirements as a goal.

This idea seems risky to me. At the least it should be opt-in. From an operator's perspective,
you would need to consider how to handle bootstrapping or replacing a node, as well as how to
handle backups and restores, etc.

The topology of the cluster would also have a new dimension that the drivers would need to
consider, since for CL.ONE queries you would need to use only one of the replicas that has
all the data on it.
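
To illustrate the driver-side concern above, here is a minimal Python sketch. It assumes the
driver can learn from cluster metadata which replicas hold the full data set. The Replica type,
its is_full flag, and candidates_for_cl_one are hypothetical names for illustration only, not
part of any existing driver API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Replica:
    address: str
    is_full: bool  # hypothetical flag: True if the replica keeps all data for the range


def candidates_for_cl_one(replicas: List[Replica]) -> List[Replica]:
    """Return the replicas a CL.ONE read could safely be routed to.

    Assumption: a replica that has dropped repaired data cannot answer
    a single-replica read on its own, so only full replicas qualify.
    """
    return [r for r in replicas if r.is_full]


# Hypothetical replica set for one token range: two full replicas, one partial.
replicas = [
    Replica("10.0.0.1", is_full=True),
    Replica("10.0.0.2", is_full=True),
    Replica("10.0.0.3", is_full=False),
]
eligible = candidates_for_cl_one(replicas)
```

In this sketch a driver would route CL.ONE reads only to 10.0.0.1 or 10.0.0.2, which is the
extra topology dimension the drivers would have to track.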

> Support a means of strongly consistent highly available replication with storage requirements
approximating RF=2
> ----------------------------------------------------------------------------------------------------------------
>                 Key: CASSANDRA-13442
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Compaction, Coordination, Distributed Metadata, Local Write-Read
>            Reporter: Ariel Weisberg
> Replication factors like RF=2 can't provide strong consistency and availability because
if a single node is lost it's impossible to reach a quorum of replicas. Stepping up to RF=3
will allow you to lose a node and still achieve quorum for reads and writes, but requires
committing additional storage.
> The requirement of a quorum for writes/reads doesn't seem to be something that can be
relaxed without additional constraints on queries, but it seems like it should be possible
to relax the requirement that 3 full copies of the entire data set are kept. What is actually
required is a covering data set for the range and we should be able to achieve a covering
data set and high availability without having three full copies. 
> After a repair we know that some subset of the data set is fully replicated. At that
point we don't have to read from a quorum of nodes for the repaired data. It is sufficient
to read from a single node for the repaired data and a quorum of nodes for the unrepaired data.
> One way to exploit this would be to have N replicas, say the last N replicas (where N
varies with RF) in the preference list, delete all repaired data after a repair completes.
Subsequent quorum reads will be able to retrieve the repaired data from any of the two full
replicas and the unrepaired data from a quorum read of any replica including the "transient" replicas.
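
The quorum arithmetic behind the first paragraph of the description can be checked with a short
Python sketch. It assumes the usual majority quorum of floor(RF/2) + 1 replicas. The function
names are illustrative and do not come from the Cassandra code base.

```python
def quorum(rf: int) -> int:
    """Smallest majority of rf replicas: floor(rf / 2) + 1."""
    return rf // 2 + 1


def tolerable_failures(rf: int) -> int:
    """Number of replicas that can be down while a quorum is still reachable."""
    return rf - quorum(rf)


# Compare the two replication factors discussed in the ticket.
for rf in (2, 3):
    print(f"RF={rf}: quorum={quorum(rf)}, tolerable failures={tolerable_failures(rf)}")
```

With RF=2 the quorum is 2, so no node can be lost. With RF=3 the quorum is still 2, so one node
can be lost, at the cost of a third full copy of the data. That third copy is the storage the
ticket proposes to avoid.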

This message was sent by Atlassian JIRA