cassandra-commits mailing list archives

From Andrés de la Peña (JIRA) <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8272) 2ndary indexes can return stale data
Date Thu, 08 Jun 2017 15:46:18 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16042891#comment-16042891 ]

Andrés de la Peña commented on CASSANDRA-8272:
----------------------------------------------

It seems that some Thrift-related dtests were failing with the patch for 3.11. While fixing
them I realized that Thrift commands send only the fetched columns, so they may not send
the queried columns that are required to apply the index filter on the coordinator side. The
indexed column values are actually fetched, but they are filtered out [here|https://github.com/apache/cassandra/blob/cassandra-3.11/src/java/org/apache/cassandra/index/internal/keys/KeysSearcher.java#L182-L193]
since [CASSANDRA-11523|https://issues.apache.org/jira/browse/CASSANDRA-11523]. So we could
just move this filter to the coordinator side, probably to [ReadCommand#postReconciliationProcessing|https://github.com/adelapena/cassandra/blob/8272-3.11/src/java/org/apache/cassandra/db/ReadCommand.java#L447-L464].
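
To make the idea of filtering after reconciliation concrete, here is a minimal sketch. It is a toy model and not Cassandra code: {{reconcile}} and {{post_reconciliation_filter}} are hypothetical names, a cell is modelled as a (value, timestamp) pair, and the sketch assumes that every replica sends the indexed column to the coordinator.

```python
from typing import Dict, List, Optional, Tuple

# Toy model (assumption): a cell is (value, write_timestamp),
# and a row maps column name -> cell.
Cell = Tuple[str, int]
Row = Dict[str, Cell]


def reconcile(versions: List[Row]) -> Row:
    """Merge replica versions of one row, keeping the newest cell per column."""
    merged: Row = {}
    for row in versions:
        for column, cell in row.items():
            if column not in merged or cell[1] > merged[column][1]:
                merged[column] = cell
    return merged


def post_reconciliation_filter(row: Row, column: str, expected: str) -> Optional[Row]:
    """Re-apply the index expression `column = expected` to the merged row."""
    cell = row.get(column)
    if cell is None or cell[0] != expected:
        return None  # the newest value no longer matches, so drop the row
    return row


stale = {"v": ("foo", 1)}   # version from a replica that missed the update
fresh = {"v": ("bar", 2)}   # version from an up-to-date replica
merged = reconcile([stale, fresh])
print(post_reconciliation_filter(merged, "v", "foo"))  # None
```

The filter can only work if the indexed column reaches the coordinator, which is exactly what Thrift commands do not guarantee when that column is queried but not fetched.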


However, I think that if we make such a replica-side change we would end up having the same problem
with upgrades that prevents us from applying the full solution to 3.x. That is, non-upgraded replicas
could send rows without the indexed-but-not-fetched columns to an upgraded coordinator, which
would reject them. Conversely, upgraded replicas could send rows including the indexed-but-not-fetched
columns to non-upgraded coordinators, which would return them without applying the row filter.
We could probably also hit problematic scenarios during reconciliation.
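
The two mixed-version cases above can be sketched as follows. This is a toy model under stated assumptions, not the actual read path: {{replica_response}} and {{coordinator_result}} are hypothetical helpers, rows are plain dictionaries, and "upgraded" simply means "keeps the indexed column on the replica and filters on the coordinator".

```python
from typing import Dict, Optional, Set

Row = Dict[str, str]


def replica_response(row: Row, fetched: Set[str], indexed: str, upgraded: bool) -> Row:
    """Columns a replica sends back for a row that matched its index lookup."""
    columns = set(fetched)
    if upgraded:
        # Assumption: an upgraded replica keeps the indexed column
        # so that the coordinator can apply the row filter.
        columns.add(indexed)
    return {name: value for name, value in row.items() if name in columns}


def coordinator_result(response: Row, fetched: Set[str], indexed: str,
                       expected: str, upgraded: bool) -> Optional[Row]:
    """What the coordinator hands to the client for one replica response."""
    if not upgraded:
        return response  # no coordinator-side filter: rows pass through as received
    if response.get(indexed) != expected:
        return None  # filter fails, also when the indexed column is missing
    # Strip the indexed column again if the client did not ask for it.
    return {name: value for name, value in response.items() if name in fetched}


row = {"name": "x", "v": "foo"}
fetched = {"name"}

# Non-upgraded replica, upgraded coordinator: a valid row is rejected.
old_to_new = coordinator_result(replica_response(row, fetched, "v", False),
                                fetched, "v", "foo", True)
# Upgraded replica, non-upgraded coordinator: unfiltered row with an extra column.
new_to_old = coordinator_result(replica_response(row, fetched, "v", True),
                                fetched, "v", "foo", False)
print(old_to_new)  # None
print(new_to_old)  # {'name': 'x', 'v': 'foo'}
```

Only the combination where both sides are upgraded yields the expected {{ {'name': 'x'} }} in this model, which is why the change is hard to roll out within a minor version.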

Here is a fixed patch that just skips coordinator-side filtering of index results for Thrift
commands:

||[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...adelapena:82b122b1ce5b172e11b4be7f02fdb7581bd28291]|[utests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-8272-3.11-testall/]|[dtests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-8272-3.11-dtest/]|
||[trunk|https://github.com/apache/cassandra/compare/trunk...adelapena:1416d9b082d7f93b187cbf67abd9a917735c4804]|[utests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-8272-trunk-testall/]|[dtests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-8272-trunk-dtest/]|

This means that index implementations other than the included ones would not benefit from our consistency fix
when using Thrift. The included index implementations weren't going to benefit anyway, because we
are not going to apply the replica side of the fix in 3.x.

No news for trunk.

What do you think? Is it acceptable not to apply the changes to Thrift commands?

> 2ndary indexes can return stale data
> ------------------------------------
>
>                 Key: CASSANDRA-8272
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8272
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sylvain Lebresne
>            Assignee: Andrés de la Peña
>             Fix For: 3.0.x
>
>
> When replicas return 2ndary index results, it's possible for a single replica to return
a stale result, and that result will be sent back to the user, potentially failing the CL contract.
> For instance, consider 3 replicas A, B and C, and the following situation:
> {noformat}
> CREATE TABLE test (k int PRIMARY KEY, v text);
> CREATE INDEX ON test(v);
> INSERT INTO test(k, v) VALUES (0, 'foo');
> {noformat}
> with every replica up to date. Now, suppose that the following queries are done at {{QUORUM}}:
> {noformat}
> UPDATE test SET v = 'bar' WHERE k = 0;
> SELECT * FROM test WHERE v = 'foo';
> {noformat}
> then, if A and B acknowledge the update but C responds to the read before having applied
the update, the now stale result will be returned (since C will return it and A or B
will return nothing).
> A potential solution would be that when we read a tombstone in the index (and provided
we make the index inherit the gcGrace of its parent CF), instead of skipping that tombstone,
we'd insert in the result a corresponding range tombstone.
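
The scenario in the issue description can be reproduced with a small simulation. It is a simplified sketch, not Cassandra's read path: the {{Replica}} class and {{quorum_read}} function are hypothetical, each replica answers the index query from its local data only, and the coordinator merges whatever rows it receives by timestamp.

```python
from typing import Dict, List, Tuple

Cell = Tuple[str, int]  # (value of v, write timestamp)


class Replica:
    """Toy replica holding the `test` table as k -> (v, timestamp)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.rows: Dict[int, Cell] = {}

    def write(self, k: int, v: str, ts: int) -> None:
        current = self.rows.get(k)
        if current is None or ts > current[1]:
            self.rows[k] = (v, ts)

    def index_query(self, v: str) -> Dict[int, Cell]:
        """Rows whose local value matches; a non-matching row yields nothing."""
        return {k: cell for k, cell in self.rows.items() if cell[0] == v}


def quorum_read(contacted: List[Replica], v: str) -> Dict[int, Cell]:
    """Merge the index results of the contacted replicas, newest cell wins."""
    merged: Dict[int, Cell] = {}
    for replica in contacted:
        for k, cell in replica.index_query(v).items():
            if k not in merged or cell[1] > merged[k][1]:
                merged[k] = cell
    return merged


a, b, c = Replica("A"), Replica("B"), Replica("C")
for replica in (a, b, c):
    replica.write(0, "foo", 1)   # INSERT applied everywhere
for replica in (a, b):
    replica.write(0, "bar", 2)   # UPDATE acknowledged by A and B only

print(quorum_read([a, b], "foo"))  # {}
print(quorum_read([a, c], "foo"))  # {0: ('foo', 1)}
```

When C is one of the two contacted replicas, the up-to-date replica returns nothing for {{v = 'foo'}}, so there is no newer cell to supersede C's stale row during the merge.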



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org

