cassandra-commits mailing list archives

From "sankalp kohli (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-7168) Add repair aware consistency levels
Date Tue, 21 Apr 2015 21:16:02 GMT


sankalp kohli commented on CASSANDRA-7168:

cc [~krummas]
I agree with [~slebresne] that we first need to make sure the last repair time is consistent across
replicas (CASSANDRA-9143).
There is a lot of overlap between this ticket and CASSANDRA-6434, but I chose this ticket
to comment on since there is a lot of discussion here :).
CASSANDRA-6434 will only drop tombstones from repaired data. The problem with this is
that if the repair time could not be sent to one replica (CASSANDRA-9143), that replica will
not drop tombstones for data which the other replicas will.
Now, during a normal read or a repair consistency read, the replica which did not get the repair
time will include some tombstones which the other replicas won't, because the replicas have
different views of what is repaired and what is not. This will cause a digest mismatch, leading to a spike in latency.
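To make the failure mode above concrete, here is a minimal Python sketch (not Cassandra's actual digest algorithm or read path) of how a leftover tombstone on one replica changes its response digest and forces the coordinator into a full data read:

```python
import hashlib

def result_digest(rows):
    """Hash a replica's result set, tombstones included, the way a digest
    read summarizes a response. Illustrative only; Cassandra's real
    digests are computed differently."""
    h = hashlib.md5()
    for key, value in sorted(rows.items()):
        h.update(f"{key}:{value}".encode())
    return h.hexdigest()

# Replica A dropped a tombstone after learning the repair time; replica B
# never received the repair time, so it still carries the tombstone.
replica_a = {"k1": "v1"}
replica_b = {"k1": "v1", "k2": "TOMBSTONE"}

# Both replicas agree on the live data, but the digests differ, so the
# coordinator must fall back to a full data read (the latency spike).
mismatch = result_digest(replica_a) != result_digest(replica_b)
```

The point is that digests cover the whole response, tombstones and all, so replicas only have to disagree on *deleted* data to pay the mismatch penalty.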

We also cannot use Benedict's approach of finding the last common repair time, since replicas
which are ahead would have already compacted away their tombstones, leading to the same digest
mismatch problem.

I think we need to do CASSANDRA-9143 and also only drop tombstones when we are sure all replicas
have that repair time.
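The gating rule proposed above can be sketched as a single predicate (a hypothetical helper in Python; Cassandra's real check would live in the compaction path and be per-sstable):

```python
def can_drop_tombstones(local_repair_time, replica_repair_times):
    """Return True only if every replica has acknowledged a repair time
    at least as recent as ours. A None entry means the replica never
    received the repair time (the CASSANDRA-9143 failure case), so
    purging must be held back. Hypothetical helper, not Cassandra code."""
    return all(t is not None and t >= local_repair_time
               for t in replica_repair_times)

# All replicas caught up: safe to purge.
safe = can_drop_tombstones(100, [100, 120, 100])

# One replica never got the repair time: purging would reintroduce the
# digest-mismatch problem, so hold the tombstones.
unsafe = can_drop_tombstones(100, [100, None, 100])
```

This is conservative by design: a single lagging replica blocks purging everywhere, which trades some disk space for digest stability.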

Also, replicas will not all learn at the same time that a given set of sstables is repaired.
During the window when some replicas have stopped including those tombstones in reads and
started dropping them when eligible, but others have not, reads will see digest mismatches,
which is not ideal.

I have not yet thought through how this could be avoided. 

> Add repair aware consistency levels
> -----------------------------------
>                 Key: CASSANDRA-7168
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: T Jake Luciani
>              Labels: performance
>             Fix For: 3.1
> With CASSANDRA-5351 and CASSANDRA-2424 I think there is an opportunity to avoid a lot
> of extra disk I/O when running queries with higher consistency levels.
> Since repaired data is by definition consistent and we know which sstables are repaired,
> we can optimize the read path by having a REPAIRED_QUORUM which breaks reads into two phases:
>   1) Read from one replica the result from the repaired sstables. 
>   2) Read from a quorum only the un-repaired data.
> For the node performing 1) we can pipeline the call so it's a single hop.
> In the long run (assuming data is repaired regularly) we will end up with much closer
> to CL.ONE performance while maintaining consistency.
> Some things to figure out:
>   - If repairs fail on some nodes we can have a situation where we don't have a consistent
>     repaired state across the replicas.
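The two-phase read proposed in the ticket can be sketched as follows. This is a toy Python model under stated assumptions: the `Replica` class, the per-key `repaired`/`unrepaired` cell maps, and the newest-timestamp merge rule are all illustrative, not Cassandra's actual storage engine or API.

```python
from dataclasses import dataclass

@dataclass
class Cell:
    value: str
    timestamp: int

class Replica:
    """Toy replica holding repaired and unrepaired cells per key."""
    def __init__(self, repaired, unrepaired):
        self.repaired = repaired      # dict: key -> Cell
        self.unrepaired = unrepaired  # dict: key -> Cell

def repaired_quorum_read(key, replicas, quorum):
    # Phase 1: read the repaired data from a single replica; repaired
    # sstables are consistent across replicas by definition, so one
    # answer suffices.
    repaired = replicas[0].repaired.get(key)
    # Phase 2: read only the unrepaired data from a quorum of replicas
    # and reconcile (here: newest write wins).
    candidates = [r.unrepaired.get(key) for r in replicas[:quorum]]
    candidates = [c for c in candidates if c is not None]
    unrepaired = max(candidates, key=lambda c: c.timestamp, default=None)
    # Merge the two phases: the newer cell wins overall.
    results = [c for c in (repaired, unrepaired) if c is not None]
    return max(results, key=lambda c: c.timestamp, default=None)

# Three replicas share a repaired cell; only replica a has a newer
# unrepaired write. A quorum of 2 that includes a sees the new value.
shared = {"k": Cell("old", 1)}
a = Replica(dict(shared), {"k": Cell("new", 5)})
b = Replica(dict(shared), {})
c = Replica(dict(shared), {})
result = repaired_quorum_read("k", [a, b, c], quorum=2)
```

Note how this model also exposes the concern raised in the comments: phase 1 is only safe if every replica agrees on which sstables are repaired; otherwise the single-replica read and the quorum read partition the data differently.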

This message was sent by Atlassian JIRA
