cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andrés de la Peña (JIRA) <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-8272) 2ndary indexes can return stale data
Date Wed, 07 Jun 2017 11:21:18 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16039285#comment-16039285
] 

Andrés de la Peña edited comment on CASSANDRA-8272 at 6/7/17 11:21 AM:
-----------------------------------------------------------------------

Here is a new version of the patch for 3.11 and trunk:

||[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...adelapena:454617607063bfb554b841f0d891798404faf0b1]|[utests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-8272-3.11-testall/]|[dtests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-8272-3.11-dtest/]|
||[trunk|https://github.com/apache/cassandra/compare/trunk...adelapena:1416d9b082d7f93b187cbf67abd9a917735c4804]|[utests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-8272-trunk-testall/]|[dtests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-8272-trunk-dtest/]|

bq. No, I don't think we have to return all the rows not satisfying the index. I believe only
returning those that are before the {{n}} th "valid" entry is enough. I don't think it's different
from how we handle tombstones here: we don't return all tombstones, just the ones before the
{{n}} th live results.
bq. Note that both with those new "invalid" entries and with tombstones, it's possible that
post-resolution on the coordinator we end up being short on results. That is, a "valid" result
from A is canceled by a tombstone/"invalid" result of B and vice-versa and we end up with
less results than requested. But that's where the short-read protection from {{DataResolver}}
kicks in.

Indeed, short-read protection solves the problem, so I have left the {{DataLimits.Counter}}
as a stopping transformation. I have added [some dtests|https://github.com/adelapena/cassandra-dtest/blob/CASSANDRA-8272/secondary_indexes_test.py#L1205-L1343]
checking these scenarios with indexes.

bq. As an aside, had a very very quick scan of the patch, and I'll also note that in {{StorageProxy}}
and {{SinglePartitionReadCommand.Group.executeInternal}}, using only the post-processor of
the 1st command would break if the index actually makes assumption based on the command it's
passed on, so it feels dodgy and I think we sould make sure it's applied to each command result
individually.

Yes, it is dodgy. I have changed it to apply the post processing to each command in the group.


As we said, the patch for 3.11 only contains the changes in the coordinator side. I have added
[a test|https://github.com/adelapena/cassandra/blob/8272-3.11/test/unit/org/apache/cassandra/index/CustomIndexTest.java#L804-L871]
in {{CustomIndexTest}} that uses [a custom index implementation|https://github.com/adelapena/cassandra/blob/8272-3.11/test/unit/org/apache/cassandra/index/CustomIndexTest.java#L1180-L1256]
to validate coordinator side filtering.

The patch for trunk also modifies regular secondary indexes to send stale rows. SASI don't
uses the mechanism because of the aforementioned problem with expressions evaluation and text
analysis, I think we should fix this in a separate ticket.

Please let me know what do you think.


was (Author: adelapena):
Here is a new version of the patch for 3.11 and trunk:

||[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...adelapena:454617607063bfb554b841f0d891798404faf0b1]||[trunk|https://github.com/apache/cassandra/compare/trunk...adelapena:1416d9b082d7f93b187cbf67abd9a917735c4804]||[dtests|https://github.com/riptano/cassandra-dtest/compare/master...adelapena:CASSANDRA-8272]||

Unfortunately cassci service is not working right now, I'll add the test results as soon as
it comes back to live.

bq. No, I don't think we have to return all the rows not satisfying the index. I believe only
returning those that are before the {{n}} th "valid" entry is enough. I don't think it's different
from how we handle tombstones here: we don't return all tombstones, just the ones before the
{{n}} th live results.
bq. Note that both with those new "invalid" entries and with tombstones, it's possible that
post-resolution on the coordinator we end up being short on results. That is, a "valid" result
from A is canceled by a tombstone/"invalid" result of B and vice-versa and we end up with
less results than requested. But that's where the short-read protection from {{DataResolver}}
kicks in.

Indeed, short-read protection solves the problem, so I have left the {{DataLimits.Counter}}
as a stopping transformation. I have added [some dtests|https://github.com/adelapena/cassandra-dtest/blob/CASSANDRA-8272/secondary_indexes_test.py#L1205-L1343]
checking these scenarios with indexes.

bq. As an aside, had a very very quick scan of the patch, and I'll also note that in {{StorageProxy}}
and {{SinglePartitionReadCommand.Group.executeInternal}}, using only the post-processor of
the 1st command would break if the index actually makes assumption based on the command it's
passed on, so it feels dodgy and I think we sould make sure it's applied to each command result
individually.

Yes, it is dodgy. I have changed it to apply the post processing to each command in the group.


As we said, the patch for 3.11 only contains the changes in the coordinator side. I have added
[a test|https://github.com/adelapena/cassandra/blob/8272-3.11/test/unit/org/apache/cassandra/index/CustomIndexTest.java#L804-L871]
in {{CustomIndexTest}} that uses [a custom index implementation|https://github.com/adelapena/cassandra/blob/8272-3.11/test/unit/org/apache/cassandra/index/CustomIndexTest.java#L1180-L1256]
to validate coordinator side filtering.

The patch for trunk also modifies regular secondary indexes to send stale rows. SASI don't
uses the mechanism because of the aforementioned problem with expressions evaluation and text
analysis, I think we should fix this in a separate ticket.

Please let me know what do you think.

> 2ndary indexes can return stale data
> ------------------------------------
>
>                 Key: CASSANDRA-8272
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8272
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sylvain Lebresne
>            Assignee: Andrés de la Peña
>             Fix For: 3.0.x
>
>
> When replica return 2ndary index results, it's possible for a single replica to return
a stale result and that result will be sent back to the user, potentially failing the CL contract.
> For instance, consider 3 replicas A, B and C, and the following situation:
> {noformat}
> CREATE TABLE test (k int PRIMARY KEY, v text);
> CREATE INDEX ON test(v);
> INSERT INTO test(k, v) VALUES (0, 'foo');
> {noformat}
> with every replica up to date. Now, suppose that the following queries are done at {{QUORUM}}:
> {noformat}
> UPDATE test SET v = 'bar' WHERE k = 0;
> SELECT * FROM test WHERE v = 'foo';
> {noformat}
> then, if A and B acknowledge the insert but C respond to the read before having applied
the insert, then the now stale result will be returned (since C will return it and A or B
will return nothing).
> A potential solution would be that when we read a tombstone in the index (and provided
we make the index inherit the gcGrace of it's parent CF), instead of skipping that tombstone,
we'd insert in the result a corresponding range tombstone.  



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org


Mime
View raw message