cassandra-commits mailing list archives

From "Stefania (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-11223) Queries with LIMIT filtering on clustering columns can return less rows than expected
Date Thu, 20 Jul 2017 09:13:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-11223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16094408#comment-16094408 ]

Stefania commented on CASSANDRA-11223:
--------------------------------------

I don't think it's correct to always return false in [ClusteringIndexNamesFilter.selectsAllPartition()|https://github.com/stef1927/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/db/filter/ClusteringIndexNamesFilter.java#L75].


It's existing code, but with this patch applied we are no longer able to count rows for tables
of the form {{CREATE TABLE %s (k int, v int, PRIMARY KEY (k) ) WITH COMPACT STORAGE}}. We
don't notice in the tests because we trim the results in {{SelectStatement}}, but it does
mean that the replicas return too much data in these cases. I noticed because of timeouts
with large range queries on tables created by cassandra-stress.

Here is a [test|https://github.com/apache/cassandra/compare/trunk...stef1927:11223-3.0] for
3.0 that reproduces the problem: 

{code}
    @Test
    public void testLimitInStaticTable() throws Throwable
    {
        createTable("CREATE TABLE %s (k int, v int, PRIMARY KEY (k) ) WITH COMPACT STORAGE");

        for (int i = 0; i < 10; i++)
            execute("INSERT INTO %s(k, v) VALUES (?, ?)", i, i);

        assertRows(execute("SELECT * FROM %s LIMIT 5"),
                   row(0, 0),
                   row(1, 1),
                   row(2, 2),
                   row(3, 3),
                   row(4, 4));
    }
{code}

If we temporarily comment out {{cqlRows.trim(userLimit);}} in {{SelectStatement.process()}},
then the test only passes if we return {{clusterings.isEmpty()}} from {{ClusteringIndexNamesFilter.selectsAllPartition}}.
However, note that I am not 100% sure this approach is correct.
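
To make the difference between the two behaviours concrete, here is a minimal standalone sketch. It is not the Cassandra code: {{NamesFilterSketch}}, its field and both method names are hypothetical stand-ins for {{ClusteringIndexNamesFilter}}, clusterings are simplified to strings, and it only assumes that the filter can end up holding an empty set of clustering names.

```java
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical stand-in for ClusteringIndexNamesFilter; not the Cassandra class.
final class NamesFilterSketch
{
    // Clustering names requested by the query (simplified to strings for this sketch).
    private final NavigableSet<String> clusterings;

    NamesFilterSketch(NavigableSet<String> clusterings)
    {
        this.clusterings = clusterings;
    }

    // Existing behaviour as described above: never reports that the whole partition is selected.
    boolean selectsAllPartitionAlwaysFalse()
    {
        return false;
    }

    // Alternative tried above: with no clustering names to restrict on,
    // report that the whole partition is selected.
    boolean selectsAllPartitionProposed()
    {
        return clusterings.isEmpty();
    }
}
```

With an empty set the two variants disagree, which is the only case where the change would alter behaviour; with at least one clustering name both return false.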

Once you are back from holiday, could you take a look [~blerer]?

> Queries with LIMIT filtering on clustering columns can return less rows than expected
> -------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-11223
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11223
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Local Write-Read Paths
>            Reporter: Benjamin Lerer
>            Assignee: Benjamin Lerer
>             Fix For: 2.2.11, 3.0.15, 3.11.1, 4.0
>
>
> A query like {{SELECT * FROM %s WHERE b = 1 LIMIT 2 ALLOW FILTERING}} can return fewer
> rows than expected if the table has some static columns and some of the partitions have no rows
> matching b = 1.
> The problem can be reproduced with the following unit test:
> {code}
>     public void testFilteringOnClusteringColumnsWithLimitAndStaticColumns() throws Throwable
>     {
>         createTable("CREATE TABLE %s (a int, b int, s int static, c int, primary key (a, b))");
>         for (int i = 0; i < 3; i++)
>         {
>             execute("INSERT INTO %s (a, s) VALUES (?, ?)", i, i);
>             for (int j = 0; j < 3; j++)
>                 if (!(i == 0 && j == 1))
>                     execute("INSERT INTO %s (a, b, c) VALUES (?, ?, ?)", i, j, i + j);
>         }
>         assertRows(execute("SELECT * FROM %s"),
>                    row(1, 0, 1, 1),
>                    row(1, 1, 1, 2),
>                    row(1, 2, 1, 3),
>                    row(0, 0, 0, 0),
>                    row(0, 2, 0, 2),
>                    row(2, 0, 2, 2),
>                    row(2, 1, 2, 3),
>                    row(2, 2, 2, 4));
>         assertRows(execute("SELECT * FROM %s WHERE b = 1 ALLOW FILTERING"),
>                    row(1, 1, 1, 2),
>                    row(2, 1, 2, 3));
>         assertRows(execute("SELECT * FROM %s WHERE b = 1 LIMIT 2 ALLOW FILTERING"),
>                    row(1, 1, 1, 2),
>                    row(2, 1, 2, 3)); // <-------- FAIL It returns only one row because the static row of
>                                      // partition 0 is counted and filtered out in SELECT statement
>     }
> {code}
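
The counting problem in the description above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: {{LimitCountingSketch}}, {{Partition}}, {{query}} and {{sampleData}} are hypothetical names, not Cassandra classes, and the model reduces the behaviour to "a partition with no matching row still uses up one slot of the limit, and that entry is dropped afterwards".

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the reported behaviour; none of these names exist in Cassandra.
final class LimitCountingSketch
{
    // One partition: its key a and the b values of its regular rows.
    static final class Partition
    {
        final int a;
        final int[] bs;

        Partition(int a, int... bs)
        {
            this.a = a;
            this.bs = bs;
        }
    }

    // Returns "a:b" entries for rows with b == wanted, stopping once `limit` rows were counted.
    // When countStaticOnly is true, a partition without a matching row still consumes one
    // slot of the limit (its static row) but contributes nothing to the result.
    static List<String> query(List<Partition> partitions, int wanted, int limit, boolean countStaticOnly)
    {
        List<String> result = new ArrayList<>();
        int counted = 0;
        for (Partition p : partitions)
        {
            boolean matched = false;
            for (int b : p.bs)
            {
                if (b != wanted)
                    continue;
                if (counted == limit)
                    return result;
                matched = true;
                counted++;
                result.add(p.a + ":" + b);
            }
            if (!matched && countStaticOnly)
            {
                if (counted == limit)
                    return result;
                counted++; // static row counted against the limit, then filtered out
            }
        }
        return result;
    }

    // Same data as the unit test above, in the partition order of its first assertRows (1, 0, 2).
    static List<Partition> sampleData()
    {
        List<Partition> data = new ArrayList<>();
        data.add(new Partition(1, 0, 1, 2));
        data.add(new Partition(0, 0, 2)); // no row with b = 1
        data.add(new Partition(2, 0, 1, 2));
        return data;
    }
}
```

In this model, {{query(sampleData(), 1, 2, true)}} yields a single entry because partition 0 uses up the second slot of the limit, while passing {{false}} yields the two entries the test expects.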



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org

