cassandra-commits mailing list archives

From "Aleksey Yeschenko (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-13595) Short read protection doesn't work at the end of a partition
Date Fri, 22 Sep 2017 16:24:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-13595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16176664#comment-16176664 ]

Aleksey Yeschenko commented on CASSANDRA-13595:
-----------------------------------------------

Ok first cut, but there are some major issues with it.

1. You can't rely on the class of the command to determine if it's a single partition or a
range read command. We do use range read commands for indexed queries even when the partition
key is set. See CASSANDRA-11617 and CASSANDRA-11872 for some more context.

2. You cannot use the {{command.limits().forShortReadRetry()}} method for the new limits here.
It wasn't written with ranged reads in mind and, among other things, throws away the
per-partition limit, which is not what you want to happen.

3. The command created isn't passed the original {{command.isForThrift()}} value and hardcodes
it as {{false}}. This was an issue with row-level SRP as well, but I fixed it yesterday. It should
be using {{PartitionRangeReadCommand.withUpdatedLimitsAndDataRange()}} instead. As a bonus,
it preserves the correct {{indexMetadata}} so you don't have to do an extra lookup.

4. I don't think the new range calculation is correct, or that it accounts for collisions of
multiple partition keys mapping to the same token.

5. {{shouldDoPartitionShortReadProtection()}} can be written a lot more simply, and some of it is redundant.
{{if (mergedResultCounter.counted() >= command.limits().count())}} can't ever be true (but
also is equivalent to {{if (mergedResultCounter.isDone())}}). The only meaningful thing you
can do here is
{code}
            // if the returned partition doesn't have enough partitions/rows to satisfy even
            // the original limit, don't ask for more
            if (!singleResultCounter.isDone())
                return null;
{code}
, to be honest.

6. I'm not a huge fan of the way {{expectedRows}} is shared between two protections, and am
not sure it's correct.

7. The metric for SRP requests isn't being incremented.
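
The token-collision concern in point 4 can be illustrated with a small toy model. This is only a sketch, not Cassandra code: {{toy_token}}, {{resume_after_token}} and {{resume_after_key}} are hypothetical names, and the hash is deliberately collision-prone. It assumes partitions are ordered by (token, key), so resuming a range strictly after the last *token* seen would skip any other partition sharing that token, while resuming after the last *key position* would not.

```python
from typing import List, Tuple

# Hypothetical toy model: a partition position is (token, key bytes).
Position = Tuple[int, bytes]


def toy_token(key: bytes) -> int:
    """Deliberately collision-prone stand-in for a partitioner hash."""
    return sum(key) % 4


# b"a" and b"e" both hash to token 1 under toy_token.
PARTITIONS: List[Position] = sorted((toy_token(k), k) for k in [b"a", b"e", b"b", b"c"])


def resume_after_token(last: Position) -> List[Position]:
    """Resume strictly after the last token: drops partitions sharing that token."""
    return [p for p in PARTITIONS if p[0] > last[0]]


def resume_after_key(last: Position) -> List[Position]:
    """Resume strictly after the last (token, key) position: keeps colliding partitions."""
    return [p for p in PARTITIONS if p > last]


last_seen = (toy_token(b"a"), b"a")
print(resume_after_token(last_seen))  # partition b"e" is missing here
print(resume_after_key(last_seen))    # partition b"e" is included here
```

In this toy, a follow-up range computed from the token alone silently loses partition {{b"e"}}, which is the kind of gap a retry range would need to avoid.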

I have a version of this that fixes most of these. Needs some more work and manual testing,
and eventually approval. It's a bit urgent for me, so do you mind if I take it over from this
point? Thanks.

> Short read protection doesn't work at the end of a partition
> ------------------------------------------------------------
>
>                 Key: CASSANDRA-13595
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13595
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Coordination
>            Reporter: Andrés de la Peña
>            Assignee: ZhaoYang
>              Labels: Correctness
>
> It seems that short read protection doesn't work when the short read is done at the end
> of a partition in a range query. The final assertion of this dtest fails:
> {code}
> def short_read_partitions_delete_test(self):
>         cluster = self.cluster
>         cluster.set_configuration_options(values={'hinted_handoff_enabled': False})
>         cluster.set_batch_commitlog(enabled=True)
>         cluster.populate(2).start(wait_other_notice=True)
>         node1, node2 = self.cluster.nodelist()
>         session = self.patient_cql_connection(node1)
>         create_ks(session, 'ks', 2)
>         session.execute("CREATE TABLE t (k int, c int, PRIMARY KEY(k, c)) WITH read_repair_chance = 0.0")
>         # we write one row to each of partitions 1 and 2: all nodes get them.
>         session.execute(SimpleStatement("INSERT INTO t (k, c) VALUES (1, 1)", consistency_level=ConsistencyLevel.ALL))
>         session.execute(SimpleStatement("INSERT INTO t (k, c) VALUES (2, 1)", consistency_level=ConsistencyLevel.ALL))
>         # we delete partition 1: only node 1 gets it.
>         node2.flush()
>         node2.stop(wait_other_notice=True)
>         session = self.patient_cql_connection(node1, 'ks', consistency_level=ConsistencyLevel.ONE)
>         session.execute(SimpleStatement("DELETE FROM t WHERE k = 1"))
>         node2.start(wait_other_notice=True)
>         # we delete partition 2: only node 2 gets it.
>         node1.flush()
>         node1.stop(wait_other_notice=True)
>         session = self.patient_cql_connection(node2, 'ks', consistency_level=ConsistencyLevel.ONE)
>         session.execute(SimpleStatement("DELETE FROM t WHERE k = 2"))
>         node1.start(wait_other_notice=True)
>         # read from both nodes
>         session = self.patient_cql_connection(node1, 'ks', consistency_level=ConsistencyLevel.ALL)
>         assert_none(session, "SELECT * FROM t LIMIT 1")
> {code}
> However, the dtest passes if we remove the {{LIMIT 1}}.
> Short read protection [uses a {{SinglePartitionReadCommand}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/DataResolver.java#L484],
> maybe it should use a {{PartitionRangeReadCommand}} instead?
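
The scenario in the dtest above can be reproduced with a small toy simulation. This is a sketch under loud assumptions, not Cassandra's implementation: all names ({{Partition}}, {{replica_read}}, {{merge}}, {{read_with_srp}}) are hypothetical, key order stands in for token order, each partition holds a single row, and a tombstone from any replica always wins the merge. It shows why {{LIMIT 1}} returns a deleted row when replicas are not asked for more partitions, and how a partition-level retry would fix it.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Partition:
    key: int    # toy assumption: key order doubles as token order
    live: bool  # False means this replica holds a partition tombstone


def replica_read(data: List[Partition], limit: int,
                 after_key: Optional[int] = None) -> List[Partition]:
    """Scan in key order and stop once `limit` live partitions were returned."""
    out: List[Partition] = []
    live = 0
    for p in data:
        if after_key is not None and p.key <= after_key:
            continue
        out.append(p)  # tombstones are returned too, but do not count
        if p.live:
            live += 1
            if live == limit:
                break
    return out


def merge(responses: List[List[Partition]]) -> Dict[int, bool]:
    """Toy reconciliation: a tombstone from any replica wins."""
    merged: Dict[int, bool] = {}
    for resp in responses:
        for p in resp:
            merged[p.key] = merged.get(p.key, True) and p.live
    return merged


def live_keys(merged: Dict[int, bool], limit: int) -> List[int]:
    return sorted(k for k, alive in merged.items() if alive)[:limit]


def read_without_srp(replicas: List[List[Partition]], limit: int) -> List[int]:
    return live_keys(merge([replica_read(r, limit) for r in replicas]), limit)


def read_with_srp(replicas: List[List[Partition]], limit: int) -> List[int]:
    responses = [replica_read(r, limit) for r in replicas]
    progressed = True
    while progressed:
        progressed = False
        merged = merge(responses)
        for i, resp in enumerate(responses):
            # A replica that did not reach the limit has nothing more to give.
            if not resp or sum(p.live for p in resp) < limit:
                continue
            last_key = resp[-1].key
            surviving = sum(1 for k, alive in merged.items() if alive and k <= last_key)
            if surviving < limit:
                # Ask this replica for partitions after the last one it returned.
                more = replica_read(replicas[i], limit - surviving, after_key=last_key)
                if more:
                    responses[i] = resp + more
                    progressed = True
    return live_keys(merge(responses), limit)


# State at the end of the dtest: each node holds one of the two deletions.
node1 = [Partition(1, live=False), Partition(2, live=True)]
node2 = [Partition(1, live=True), Partition(2, live=False)]

print(read_without_srp([node1, node2], limit=1))  # returns the deleted partition 2
print(read_with_srp([node1, node2], limit=1))     # returns nothing
```

In this toy, node2 stops after partition 1 because it has already reached the limit, so its tombstone for partition 2 never reaches the coordinator unless a follow-up request is made. With a limit large enough to cover both partitions, {{read_without_srp}} also returns nothing, which matches the observation that the dtest passes without {{LIMIT 1}}.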



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


