cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From aaron morton <aa...@thelastpickle.com>
Subject Re: Secondary index and/or row key in the read path ?
Date Tue, 21 Aug 2012 08:48:33 GMT
> - do we need to post-process (filter) the result of the query in our application ?
Thats the one :)

Right now the code paths don't exist to select a row using a row key *and* apply a column
level filter. The RPC API does not work that way and I'm not sure if this is something that
is planned for CQL. 

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 20/08/2012, at 6:33 PM, Jean-Armel Luce <jaluce06@gmail.com> wrote:

> 
> Hello,
> 
> I am using Cassandra 1.1.1 and CQL3.
> 
> Could you tell me what is the best strategy for retrieving a row using a condition on
a row key (operator =) and also filter on a 2nd column?
> 
> For example, I create a  table named "testwhere" with a row key on column "mykey" and
2 other columns "col1" and "col2".
>   
> I would like to retrieve the row with the key 'key1' only if col1 = 'abcd'  
> I send the request  SELECT mykey, col1 from testwhere where mykey = 'key1' and col1 =
'abcd';
> As you can see, the 1st condition in the WHERE clause is based on the row key.
> However the request doesn't work if no secondary index is created on the column used
in the 2nd condition of the WHERE clause. It works only if a secondary indexed is created
on this 2nd column (see below). 
> Does that mean that the secondary index is used in the read path instead of the row key,
even if there is a condition on the row key in the WHERE clause ?
> 
> Here is an example :
> 
> jal@jal-VirtualBox:~/cassandra/apache-cassandra-1.1.1/bin$ ./cqlsh -3
> Connected to Test Cluster at localhost:9160.
> [cqlsh 2.2.0 | Cassandra 1.1.1 | CQL spec 3.0.0 | Thrift protocol 19.32.0]
> Use HELP for help.
> cqlsh> use test1;
> cqlsh:test1> CREATE TABLE testwhere (mykey varchar PRIMARY KEY,
>          ...  col1 varchar,
>          ...  col2 varchar);
> cqlsh:test1> INSERT INTO testwhere (mykey, col1, col2) VALUES ('key1', 'abcd', 'efgh');
> 
> cqlsh:test1>  SELECT mykey, col1 from testwhere where mykey = 'key1';
>  mykey | col1
> -------+------
>   key1 | abcd
> 
> cqlsh:test1>  SELECT mykey, col1 from testwhere where mykey = 'key1' and col1 = 'abcd';
> Bad Request: No indexed columns present in by-columns clause with Equal operator
> 
> cqlsh:test1> CREATE INDEX col1_idx ON testwhere (col1);
> cqlsh:test1>  SELECT mykey, col1 from testwhere where mykey = 'key1' and col1 = 'abcd';
>  mykey | col1
> -------+------
>   key1 | abcd
> 
> cqlsh:test1>
> 
> 
> My understanding is :
> The 1st SELECT is working because there is only the row key in the WHERE clause
> The 2nd SELECT is not working because the row key is in the WHERE clause, but there is
no index on col1
> The 3rd SELECT (which is the same as the 2nd SELECT) is working because the row key is
in the WHERE clause, and a secondary index is created on col1
> 
> 
> For this use case, what are the recommendations of the Cassandra community ?
> - do we need to create a secondary index for each column we want to filter ?
> - do we need to post-process (filter) the result of the query in our application ?
> - or is there another solution ?
> 
> 
> Thanks.
> 
> Jean-Armel


Mime
View raw message