cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8273) Allow filtering queries can return stale data
Date Fri, 12 May 2017 11:05:05 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16007964#comment-16007964
] 

Sylvain Lebresne commented on CASSANDRA-8273:
---------------------------------------------

bq. Obviously, moving the filtering to the coordinator would remove that problem, but doing
so would, on top of not being trivial to implement, have serious performance impact since
we can't know in advance how much data will be filtered and we may have to redo the query to
replicas multiple times.

That comment (from the description) is pretty old and isn't entirely accurate anymore so I
want to amend it and expand on it.

While it's obviously still true that moving filtering coordinator-side has performance impacts,
it's now kind of trivial to do post-CASSANDRA-8099.

Basically, I believe we just need to move the {{RowFilter#filter}} call that is currently
in {{ReadCommand#executeLocally()}} to after coordinator reconciliation. Typically, that means the
{{postReconciliationProcessing()}} method that {{PartitionRangeReadCommand}} already has, which we would
just generalize to all {{ReadCommand}} (that is, add it to {{SinglePartitionReadCommand}}).

In particular, while it's still true that we'll have to redo queries when filtering makes
us fall short on a first try, the "short read protection" from {{DataResolver}} actually handles
this for us reasonably nicely.
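
To make the difference concrete, here is a toy Java model of the scenario from the description. It is only a sketch under stated assumptions: {{Row}}, {{reconcile}}, {{filterOnReplicas}} and {{filterOnCoordinator}} are made-up names for illustration, not Cassandra classes, and reconciliation is reduced to last-write-wins on a single row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Toy model only: none of these types exist in Cassandra.
final class FilterPlacementSketch {

    /** The version of row k=0 held by one replica, with its write timestamp. */
    static final class Row {
        final int k;
        final String v1;
        final int v2;
        final long writeTime;

        Row(int k, String v1, int v2, long writeTime) {
            this.k = k;
            this.v1 = v1;
            this.v2 = v2;
            this.writeTime = writeTime;
        }
    }

    /** Last-write-wins merge of the replica responses for a single row. */
    static Optional<Row> reconcile(List<Optional<Row>> responses) {
        Row newest = null;
        for (Optional<Row> response : responses) {
            if (response.isPresent()
                    && (newest == null || response.get().writeTime > newest.writeTime)) {
                newest = response.get();
            }
        }
        return Optional.ofNullable(newest);
    }

    /** Behaviour described in the ticket: each replica filters before answering. */
    static Optional<Row> filterOnReplicas(List<Row> replicas, Predicate<Row> filter) {
        List<Optional<Row>> responses = new ArrayList<>();
        for (Row row : replicas) {
            responses.add(Optional.of(row).filter(filter));
        }
        return reconcile(responses);
    }

    /** Proposed behaviour: replicas answer unfiltered, the filter runs after the merge. */
    static Optional<Row> filterOnCoordinator(List<Row> replicas, Predicate<Row> filter) {
        List<Optional<Row>> responses = new ArrayList<>();
        for (Row row : replicas) {
            responses.add(Optional.of(row));
        }
        return reconcile(responses).filter(filter);
    }

    public static void main(String[] args) {
        Row onB = new Row(0, "foo", 2, 2L); // B has applied the UPDATE
        Row onC = new Row(0, "foo", 1, 1L); // C has not applied it yet
        Predicate<Row> where = r -> r.v1.equals("foo") && r.v2 == 1;
        List<Row> quorumRead = Arrays.asList(onB, onC);

        // Replica-side: B drops its fresh row, so C's stale row is all the coordinator sees.
        System.out.println(filterOnReplicas(quorumRead, where).isPresent());    // true
        // Coordinator-side: the merge keeps B's newer row, which then fails the filter.
        System.out.println(filterOnCoordinator(quorumRead, where).isPresent()); // false
    }
}
```

The only thing that changes between the two methods is where the filter sits relative to reconciliation, which is the point of the proposal.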

Of course, there are the performance concerns, which concretely come in 2 flavors:
# we'll transfer everything that is filtered from the replica to the coordinator while we
don't today.
# as a consequence and as mentioned above, we'll have to (usually) do multiple coordinator<->replica
queries to get a particular count of final rows, when it's only one today.

I do want to note the following though:
* For CL.ONE, and as noted by Robert above, this is not really a big deal. There is actually
no impact if you use a token-aware client. If you don't, then we could theoretically push
the filtering to the replica in that specific case, but honestly, if you care about performance,
you should be using token-awareness so I'm not convinced it's even worth adding any complexity
for this (at the very least, for a v1, we don't currently ship the CL with queries to replicas,
and while I'm sure we'll want to change that for other reasons at some point, I don't think
we should bother here).
* For higher CL, it's definitely a bigger impact, but here's the thing: if you use a higher
CL, that implies that you actually care about and _rely on_ CL guarantees, so I think no kind
of performance matters if we don't fulfill those guarantees, and not fixing a known correctness
issue because it impacts performance is imo backward.

I'll also note that while the 2nd flavor will certainly have an impact, the short-read protection
from {{DataResolver}} is actually not too stupid about this and will "regulate" its 2nd query
based on how much was filtered on the 1st one to limit the impact somewhat. Not awesome, but
better than nothing.
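
As a rough illustration of that "regulation" idea (and of the 2nd flavor above), here is a small Java sketch. To be clear, this is not the actual {{DataResolver}} logic: {{ShortReadSketch}}, {{read}} and the sizing rule are assumptions made up for the example. The follow-up request is simply sized from the ratio of rows that survived the filter in the previous round.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical sketch: not Cassandra's short-read protection, just the general idea.
final class ShortReadSketch {

    /** Rows that passed the filter, plus how many replica round trips were needed. */
    static final class Result {
        final List<Integer> rows;
        final int roundTrips;

        Result(List<Integer> rows, int roundTrips) {
            this.rows = rows;
            this.roundTrips = roundTrips;
        }
    }

    /** Coordinator-side filtering over one replica's rows until 'limit' rows survive. */
    static Result read(int[] replicaRows, IntPredicate filter, int limit) {
        List<Integer> kept = new ArrayList<>();
        int offset = 0;
        int roundTrips = 0;
        int fetchSize = limit; // first try: ask for exactly what the user wants

        while (kept.size() < limit && offset < replicaRows.length) {
            int end = Math.min(replicaRows.length, offset + fetchSize);
            int keptBefore = kept.size();
            for (int i = offset; i < end && kept.size() < limit; i++) {
                if (filter.test(replicaRows[i])) {
                    kept.add(replicaRows[i]);
                }
            }
            roundTrips++;

            int fetched = end - offset;
            int survived = kept.size() - keptBefore;
            int missing = limit - kept.size();
            // Size the next request from the observed survival ratio (rounded up);
            // if nothing survived, just double the request.
            fetchSize = survived == 0
                    ? fetchSize * 2
                    : (missing * fetched + survived - 1) / survived;
            offset = end;
        }
        return new Result(kept, roundTrips);
    }

    public static void main(String[] args) {
        int[] rows = new int[100];
        for (int i = 0; i < rows.length; i++) {
            rows[i] = i;
        }
        // Filter keeps 1 row in 10; the user asks for 5 rows.
        Result filtered = read(rows, v -> v % 10 == 0, 5);
        System.out.println(filtered.rows + " in " + filtered.roundTrips + " round trips");
        // With a filter that keeps everything, a single round trip is enough.
        Result unfiltered = read(rows, v -> true, 5);
        System.out.println(unfiltered.rows + " in " + unfiltered.roundTrips + " round trips");
    }
}
```

With a selective filter the read takes several round trips instead of one, but the adaptive sizing keeps that number small rather than re-asking for the same small page each time.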

Anyway, I'm personally in favor of fixing this by moving filtering coordinator-side, as while this
has a performance impact, we shouldn't be fast at the expense of correctness. And I have no
clue how to fix this replica-side and no one has offered a proper option for that in ~3 years.
Let's make things correct now, and _then_ we can think about how to optimize.

I also want to remind everyone, for context, that {{ALLOW FILTERING}} is something we strongly advertise
as not-a-great-idea for anything performance sensitive in the first place, so that's imo all
the more reason to not agonize over performance too much and favor correctness first and foremost.

> Allow filtering queries can return stale data
> ---------------------------------------------
>
>                 Key: CASSANDRA-8273
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8273
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sylvain Lebresne
>
> Data filtering is done replica side. That means that a single replica with stale data
may make the whole query return that stale data.
> For instance, consider 3 replicas A, B and C, and the following situation:
> {noformat}
> CREATE TABLE test (k int PRIMARY KEY, v1 text, v2 int);
> CREATE INDEX ON test(v1);
> INSERT INTO test(k, v1, v2) VALUES (0, 'foo', 1);
> {noformat}
> with every replica up to date. Now, suppose that the following queries are done at {{QUORUM}}:
> {noformat}
> UPDATE test SET v2 = 2 WHERE k = 0;
> SELECT * FROM test WHERE v1 = 'foo' AND v2 = 1;
> {noformat}
> then, if A and B acknowledge the update but C responds to the read before having applied
the update, then the now stale result will be returned. Let's note that this is a problem
related to filtering, not 2ndary indexes.
> This issue shares similarities with CASSANDRA-8272 but, contrary to that former issue,
I'm not sure how to fix it. Obviously, moving the filtering to the coordinator would remove
that problem, but doing so would, on top of not being trivial to implement, have serious
performance impact since we can't know in advance how much data will be filtered and we may
have to redo the query to replicas multiple times.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
