cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-6706) Duplicate rows returned when in clause has repeated values
Date Fri, 14 Feb 2014 13:20:20 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13901413#comment-13901413 ]

Sylvain Lebresne commented on CASSANDRA-6706:
---------------------------------------------

That is kind of the intended behavior. Is it the best behavior? I don't know, though I'm not
sure it matters much in practice tbh. But when there is an IN, we do order the resulting rows
following the order of the values in the IN (unless there is an explicit ordering that takes
precedence, of course), which kind of suggests we consider the IN values as a list rather than
a set, and from that perspective, it's probably not entirely crazy to return duplicate results
in that case. In particular, if you use a prepared marker for an IN, the server will expect
a list, not a set, for the values (and changing that now would really break users). It's easy
enough to avoid the duplication client side if you don't want duplicates.
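
A minimal sketch of what avoiding the duplication client side could look like, assuming the
application builds the list of IN values itself before binding it to the query. The helper
name dedupe_preserving_order is hypothetical and not part of any Cassandra driver; it drops
repeated values while keeping first-occurrence order, so the rows still come back in the
order of the remaining IN values.

```python
def dedupe_preserving_order(values):
    """Return the values with repeats dropped, keeping first-occurrence order.

    Hypothetical helper for illustration only; not part of any driver API.
    """
    # dict keys are unique and keep insertion order (Python 3.7+), so this
    # removes repeats without reordering the remaining values.
    return list(dict.fromkeys(values))


# Values that would otherwise be bound to the IN marker as-is.
in_values = ["A", "B", "A", "C", "B"]
unique_values = dedupe_preserving_order(in_values)
```

The deduplicated list can then be bound to the IN marker in place of the original one.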

Don't get me wrong, I'm not saying that not returning duplicates in that case would be inferior,
but rather that I don't see a big problem with the current behavior, and so I'd rather
not introduce a breaking change, even a small one, for no good reason.

> Duplicate rows returned when in clause has repeated values
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-6706
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6706
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Found on 
> [cqlsh 4.1.0 | Cassandra 2.0.3-SNAPSHOT | CQL spec 3.0.0 | Thrift protocol 19.38.0]
>            Reporter: Gavin Casey
>
> If a value is repeated within an IN clause, then duplicate rows are returned for the repeats:
> cqlsh> create table t1(c1 text primary key);
> cqlsh> insert into t1(c1) values ('A');
> cqlsh> select * from t1;
>  c1
> ----
>   A
> cqlsh> select * from t1 where c1 = 'A';
>  c1
> ----
>   A
> cqlsh> select * from t1 where c1 in( 'A');
>  c1
> ----
>   A
> cqlsh:dslog> select * from t1 where c1 in( 'A','A');
>  c1
> ----
>   A
>   A



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
