cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8087) Multiple non-DISTINCT rows returned when page_size set
Date Thu, 18 Dec 2014 16:27:13 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14251871#comment-14251871 ]

Sylvain Lebresne commented on CASSANDRA-8087:
---------------------------------------------

bq. I believe this logic is safe, but I'm not 100% sure.

I think it works, but it can be simplified. That is, I don't think the fact that the table
has some statics plays a role in deciding whether the query is a distinct one. Instead, I believe we
can just have {{countCQL3Rows}} be
{noformat}
public boolean countCQL3Rows()
{
    return ((SliceQueryFilter)predicate).count != 1;
}
{noformat}
because:
* unless it's a {{DISTINCT}} query, the slice filter count is the {{LIMIT}} of the query
* we don't modify that count in that code path. We do update the slice count in {{SliceQueryPager}}
but not in {{RangeSliceQueryPager}}. I'm not sure why, though, and I think that can make us fetch
more than we should, so it might be something of a bug.
* we don't page queries in the first place if their {{LIMIT <= pageSize}} and so we'll
never page a query with a limit of 1.
* it follows that only a {{DISTINCT}} query can have a slice count of 1 in that method.
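
To make the argument above concrete, here is a minimal, self-contained sketch of the reasoning. It is not Cassandra's actual code: {{CountRowsSketch}}, {{SliceFilter}} and {{isPaged}} are hypothetical stand-ins, and the only thing taken from the comment is the {{count != 1}} check and the rule that a query is paged only when its {{LIMIT}} exceeds the page size.

```java
// Hypothetical, simplified model of the reasoning; none of these types are Cassandra classes.
public class CountRowsSketch
{
    /** Stand-in for SliceQueryFilter: only the slice count matters for this argument. */
    static final class SliceFilter
    {
        final int count;

        SliceFilter(int count)
        {
            this.count = count;
        }
    }

    /** Assumed paging rule from the comment: queries with LIMIT <= pageSize are not paged. */
    static boolean isPaged(int limit, int pageSize)
    {
        return limit > pageSize;
    }

    /** The proposed check: a slice count of 1 is taken to mean DISTINCT, so CQL3 rows are not counted. */
    static boolean countCQL3Rows(SliceFilter filter)
    {
        return filter.count != 1;
    }
}
```

Under these assumptions, a non-DISTINCT query with {{LIMIT 1}} never reaches the pager for any page size of at least 1, so a slice count of 1 on that path can only come from a {{DISTINCT}} query.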


> Multiple non-DISTINCT rows returned when page_size set
> ------------------------------------------------------
>
>                 Key: CASSANDRA-8087
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8087
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Adam Holmberg
>            Assignee: Tyler Hobbs
>            Priority: Minor
>             Fix For: 2.0.12
>
>         Attachments: 8087-2.0.txt
>
>
> Using the following statements to reproduce:
> {code}
> CREATE TABLE test (
>                 k int,
>                 p int,
>                 s int static,
>                 PRIMARY KEY (k, p)
>             );
> INSERT INTO test (k, p) VALUES (1, 1);
> INSERT INTO test (k, p) VALUES (1, 2);
> SELECT DISTINCT k, s FROM test ;
> {code}
> Native clients that set result_page_size in the query message receive multiple non-distinct
> rows back (one per clustered value p in row k).
> This is only reproduced on 2.0.10. It does not appear in 2.1.0.
> It does not appear in cqlsh for 2.0.10 because cqlsh uses Thrift there.
> See https://datastax-oss.atlassian.net/browse/PYTHON-164 for background



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
