cassandra-user mailing list archives

From Edouard COLE
Subject RE: Duplicated key with an IN statement
Date Thu, 04 Feb 2016 14:48:01 GMT

When running that kind of query with TRACING ON, I noticed that the coordinator also performs
the same query multiple times.

Because the elements in the IN statement can involve many nodes, it makes sense to map/reduce
the query, but running the same sub-query multiple times should not happen. What if the result
set changes? Imagine this query: SELECT * FROM t WHERE key IN (123, 123, …. X1000,
123), and while it runs, the data for 123 changes:

 key | value
 123 |   456
 123 |   456
 123 |   456
 123 |   789 <-- Change here ☹
 123 |   789
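
To make the concern above concrete, here is a toy Python model. It is only a sketch: it assumes one independent lookup per element of the IN list and is not Cassandra's actual coordinator code. The names select_in and concurrent_write are hypothetical, made up for this illustration.

```python
def select_in(store, keys, on_lookup=None):
    """Toy model (assumption): one independent lookup per IN element."""
    rows = []
    for i, key in enumerate(keys):
        if on_lookup is not None:
            on_lookup(i)  # hook used to simulate a write landing mid-query
        if key in store:
            rows.append((key, store[key]))
    return rows


store = {123: 456}


def concurrent_write(i):
    # Simulate another client updating key 123 just before the 4th lookup.
    if i == 3:
        store[123] = 789


# Five copies of the same key: the result mixes old and new values.
rows = select_in(store, [123] * 5, concurrent_write)
print(rows)

# De-duplicating the keys first means a single lookup and a single row.
deduped = select_in(store, list(dict.fromkeys([123] * 5)))
print(deduped)
```

In this model the duplicated IN list yields three rows with 456 followed by two rows with 789, while the de-duplicated list yields a single row.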

There’s also something very important: when your table defines a tuple as unique for a
specific key, it is a real problem to get a result set that contains the same key, which
should be unique, multiple times. This is why on every SQL implementation, this is not

I think this is a bug.

Edouard COLE

From: Alain RODRIGUEZ
Sent: Thursday, February 04, 2016 11:55 AM
To: Edouard COLE
Cc:
Subject: Re: Duplicated key with an IN statement


This is interesting.

It seems rational that if you are looking at 2 keys and both exist (which is the case), it
returns 2 keys. Yet I just checked this kind of command on MySQL and it gives a one-line
result. So here CQL differs from SQL (at least MySQL). I know we are trying to stay as close
as possible to SQL to avoid losing people, so we might want to change this.
Not sure if this behavior is intentional or known. Not even sure anyone has ever tried to run
this kind of query, actually :).
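
For comparison, the SQL side of this can be reproduced without a server. The sketch below uses SQLite through Python's standard sqlite3 module as a stand-in for MySQL (an assumption on my part: the check above was done on MySQL, not SQLite), with the same table and data as in the original report.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (key INTEGER PRIMARY KEY, value INTEGER)")
conn.execute("INSERT INTO t (key, value) VALUES (123, 456)")

# Same query shape as in the report: the key appears twice in the IN list.
sql_rows = conn.execute(
    "SELECT key, value FROM t WHERE key IN (123, 123)"
).fetchall()
print(sql_rows)

conn.close()
```

Here IN acts as a filter on existing rows, so the single stored row comes back once regardless of how often its key is listed.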

Does anyone know about that? Should we raise a ticket?

Alain Rodriguez

The Last Pickle

2016-02-04 8:36 GMT+00:00 Edouard COLE:

I just discovered this, and I think this is weird:

ed@debian:~$ cqlsh
Connected to _CLUSTER_ at<>.
[cqlsh 4.0.1 | Cassandra | CQL spec 3.1.1 | Thrift protocol 19.39.0]
Use HELP for help.
cqlsh> USE ks-test ;
cqlsh:ks-test> CREATE TABLE t (
            ...     key int,
            ...     value int,
            ...     PRIMARY KEY (key)
            ... );
cqlsh:ks-test> INSERT INTO t (key, value) VALUES (123, 456) ;
cqlsh:ks-test> SELECT * FROM t ;

 key | value
 123 |   456

(1 rows)

cqlsh:ks-test> SELECT * FROM t WHERE key IN (123, 123);

 key | value
 123 |   456
 123 |   456 <----- WTF?

(2 rows)

Adding the same key multiple times to an IN statement makes the query return the tuple
multiple times.

This looks weird to me, can anyone give me some feedback on such a behavior?
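
Until the behavior is clarified, one possible client-side workaround is to de-duplicate the keys before building the IN list. The sketch below only builds the statement text and the parameter list; build_in_query is a hypothetical helper, and the %s placeholder style is an assumption that has to match whatever driver is in use.

```python
def build_in_query(table, keys):
    """Hypothetical helper: drop repeated keys, keep first-seen order."""
    unique_keys = list(dict.fromkeys(keys))
    # One placeholder per distinct key (placeholder style is an assumption).
    placeholders = ", ".join(["%s"] * len(unique_keys))
    query = f"SELECT * FROM {table} WHERE key IN ({placeholders})"
    return query, unique_keys


query, params = build_in_query("t", [123, 123, 456, 123])
print(query)
print(params)
```

The table name is interpolated directly into the string, so it should only ever come from trusted code and never from user input.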

Edouard COLE
