cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-8272) 2ndary indexes can return stale data
Date Mon, 22 May 2017 19:32:04 GMT


Sylvain Lebresne commented on CASSANDRA-8272:

I can refer you to [my comment on 8273|]
on that aspect, since it's largely the same issue, but the short version is that you are right.
At the same time: 1) anyone caring about performance should probably use token-aware clients,
which makes such a CL.ONE optimization not really needed; 2) we don't actually ship the consistency
level with {{ReadCommand}} to replicas, so such an optimization cannot be done before 4.0 at best (it
requires a protocol change); and 3) the reason why this ticket is "easier" than CASSANDRA-8273
is that we will likely not transfer _that_ many additional rows: we'll only send rows that
used to be valid entries for the query but aren't anymore, and only if that changed less than
gc_grace ago, which, except maybe in very specific cases, is unlikely to represent a huge overhead for
any particular query.

Anyway, I'm not really opposed to such an optimization, but given it's not useful if you follow performance
best practices and given it's a tiny bit more involved than it sounds (and it adds a bit of
complexity to the code by creating a special case), I'd be happy focusing on correctness
here and leaving that to a follow-up.

> 2ndary indexes can return stale data
> ------------------------------------
>                 Key: CASSANDRA-8272
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sylvain Lebresne
>            Assignee: Andrés de la Peña
>             Fix For: 3.0.x
> When replicas return 2ndary index results, it's possible for a single replica to return
a stale result, and that result will be sent back to the user, potentially violating the CL contract.
> For instance, consider 3 replicas A, B and C, and the following situation:
> {noformat}
> CREATE TABLE test (k int PRIMARY KEY, v text);
> CREATE INDEX ON test(v);
> INSERT INTO test(k, v) VALUES (0, 'foo');
> {noformat}
> with every replica up to date. Now, suppose that the following queries are done at {{QUORUM}}:
> {noformat}
> UPDATE test SET v = 'bar' WHERE k = 0;
> SELECT * FROM test WHERE v = 'foo';
> {noformat}
> then, if A and B acknowledge the update but C responds to the read before having applied
the update, the now stale result will be returned (since C will return it and A or B
will return nothing).
> A potential solution would be that when we read a tombstone in the index (and provided
we make the index inherit the gcGrace of its parent CF), instead of skipping that tombstone,
we'd insert in the result a corresponding range tombstone.
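
To make the scenario above concrete, here is a minimal Python sketch of the three-replica example and of the proposed fix. It is a toy model, not Cassandra code: the {{Replica}} class, {{coordinator_read}} and {{run_scenario}} are hypothetical names, a per-key deletion marker stands in for the range tombstone, and gc_grace expiry is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Toy replica: a base table plus markers for deleted index entries."""
    name: str
    rows: dict = field(default_factory=dict)              # k -> (v, timestamp)
    index_tombstones: dict = field(default_factory=dict)  # (v, k) -> deletion timestamp

    def write(self, k, v, ts):
        old = self.rows.get(k)
        if old is not None and old[1] >= ts:
            return  # ignore writes older than what we already have
        if old is not None and old[0] != v:
            # The index entry (old value -> k) is no longer valid.
            self.index_tombstones[(old[0], k)] = ts
        self.rows[k] = (v, ts)

    def index_read(self, v, include_tombstones):
        """Return {k: (value_or_None, ts)}; a None value marks a deletion."""
        result = {k: (val, ts) for k, (val, ts) in self.rows.items() if val == v}
        if include_tombstones:
            for (tv, k), ts in self.index_tombstones.items():
                if tv == v and k not in result:
                    result[k] = (None, ts)
        return result


def coordinator_read(replicas, v, include_tombstones):
    """Merge replica responses per key; the highest timestamp wins."""
    merged = {}
    for replica in replicas:
        for k, (val, ts) in replica.index_read(v, include_tombstones).items():
            if k not in merged or ts > merged[k][1]:
                merged[k] = (val, ts)
    # Drop keys whose winning version is a deletion marker.
    return {k: val for k, (val, ts) in merged.items() if val is not None}


def run_scenario(include_tombstones):
    a, b, c = Replica("A"), Replica("B"), Replica("C")
    for replica in (a, b, c):
        replica.write(0, "foo", ts=1)   # INSERT, applied everywhere
    for replica in (a, b):
        replica.write(0, "bar", ts=2)   # UPDATE, not yet applied on C
    # QUORUM read of v = 'foo' answered by A and C.
    return coordinator_read([a, c], "foo", include_tombstones)


if __name__ == "__main__":
    print("without tombstones:", run_scenario(False))
    print("with tombstones:   ", run_scenario(True))
```

In this model, when replicas skip index tombstones, A returns nothing and C's stale row reaches the user. When A returns a deletion marker instead, that marker carries a newer timestamp than C's row, so the coordinator can discard the stale entry.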

This message was sent by Atlassian JIRA
