Return-Path: X-Original-To: apmail-cassandra-user-archive@www.apache.org Delivered-To: apmail-cassandra-user-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id EFE65D774 for ; Tue, 21 Aug 2012 09:15:23 +0000 (UTC) Received: (qmail 21099 invoked by uid 500); 21 Aug 2012 09:15:21 -0000 Delivered-To: apmail-cassandra-user-archive@cassandra.apache.org Received: (qmail 21033 invoked by uid 500); 21 Aug 2012 09:15:21 -0000 Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@cassandra.apache.org Delivered-To: mailing list user@cassandra.apache.org Received: (qmail 21013 invoked by uid 99); 21 Aug 2012 09:15:21 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 21 Aug 2012 09:15:21 +0000 X-ASF-Spam-Status: No, hits=1.7 required=5.0 tests=FREEMAIL_ENVFROM_END_DIGIT,HTML_MESSAGE,RCVD_IN_DNSWL_LOW,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of jaluce06@gmail.com designates 209.85.210.172 as permitted sender) Received: from [209.85.210.172] (HELO mail-iy0-f172.google.com) (209.85.210.172) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 21 Aug 2012 09:15:14 +0000 Received: by iabz21 with SMTP id z21so3848868iab.31 for ; Tue, 21 Aug 2012 02:14:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=ol56gl+hl78k/0qoB1CJy4OLxC6+sbm6pPXdZd38nx8=; b=D3pzx2hFLWencmaq1YYGJ4xUo1GtwZ3wLrYxWVvxg4drSKlEQg1V/a6Tpa0qO/oSMw o3fv8JkSDCdtr7j/ntTtRl8yx7LzuYf0wDw6eHzxQS+nn/xbyeD+QDaX18XMsW8PoyWO qBPKD1HkDs6QfGnrIAb6F2O4EPfvotDnYT/HYpcgVxdf1dQb9EwmGgYnqL1boPs0P6BK yhzYDQtEI7s5HrTeZN0t+ASS/2BJBnNwHvsYur8BIgES7KSJ6A6CIar+Q62/G3vHNMVI Mp8q+RTAm9i6sQNdvVgCq7px0j0+sBT0Da1cyTuYA9R6UOHmUwDTFvDqJw5aUpYwANZM lqww== MIME-Version: 1.0 Received: by 10.50.76.202 with SMTP id m10mr12590279igw.52.1345540493019; Tue, 21 Aug 2012 02:14:53 -0700 (PDT) Received: by 10.64.63.4 with HTTP; Tue, 21 Aug 2012 02:14:52 -0700 (PDT) In-Reply-To: References: Date: Tue, 21 Aug 2012 11:14:52 +0200 Message-ID: Subject: Re: Secondary index and/or row key in the read path ? From: Jean-Armel Luce To: user@cassandra.apache.org Content-Type: multipart/alternative; boundary=e89a8f2343b51e093d04c7c30f31 --e89a8f2343b51e093d04c7c30f31 Content-Type: text/plain; charset=ISO-8859-1 Hi Aaron, Thank you for your answer. So, I shall do post-processing for selecting a row using a row key *and* applying a column level filter. Best Regards, Jean-Armel 2012/8/21 aaron morton > - do we need to post-process (filter) the result of the query in our > application ? > > Thats the one :) > > Right now the code paths don't exist to select a row using a row key *and* > apply a column level filter. The RPC API does not work that way and I'm not > sure if this is something that is planned for CQL. > > Cheers > > ----------------- > Aaron Morton > Freelance Developer > @aaronmorton > http://www.thelastpickle.com > > On 20/08/2012, at 6:33 PM, Jean-Armel Luce wrote: > > > Hello, > > I am using Cassandra 1.1.1 and CQL3. > > Could you tell me what is the best strategy for retrieving a row using a > condition on a row key (operator =) and also filter on a 2nd column? > > For example, I create a table named "testwhere" with a row key on column > "mykey" and 2 other columns "col1" and "col2". > > I would like to retrieve the row with the key 'key1' only if col1 = > 'abcd' > I send the request SELECT mykey, col1 from testwhere where mykey = 'key1' > and col1 = 'abcd'; > As you can see, the 1st condition in the WHERE clause is based on the row > key. > However the request doesn't work if no secondary index is created on the > column used in the 2nd condition of the WHERE clause. It works only if a > secondary indexed is created on this 2nd column (see below). > Does that mean that the secondary index is used in the read path instead > of the row key, even if there is a condition on the row key in the WHERE > clause ? > > Here is an example : > > jal@jal-VirtualBox:~/cassandra/apache-cassandra-1.1.1/bin$ ./cqlsh -3 > Connected to Test Cluster at localhost:9160. > [cqlsh 2.2.0 | Cassandra 1.1.1 | CQL spec 3.0.0 | Thrift protocol 19.32.0] > Use HELP for help. > cqlsh> use test1; > cqlsh:test1> CREATE TABLE testwhere (mykey varchar PRIMARY KEY, > ... col1 varchar, > ... col2 varchar); > cqlsh:test1> INSERT INTO testwhere (mykey, col1, col2) VALUES ('key1', > 'abcd', 'efgh'); > > cqlsh:test1> SELECT mykey, col1 from testwhere where mykey = 'key1'; > mykey | col1 > -------+------ > key1 | abcd > > cqlsh:test1> SELECT mykey, col1 from testwhere where mykey = 'key1' and > col1 = 'abcd'; > Bad Request: No indexed columns present in by-columns clause with Equal > operator > > cqlsh:test1> CREATE INDEX col1_idx ON testwhere (col1); > cqlsh:test1> SELECT mykey, col1 from testwhere where mykey = 'key1' and > col1 = 'abcd'; > mykey | col1 > -------+------ > key1 | abcd > > cqlsh:test1> > > > My understanding is : > The 1st SELECT is working because there is only the row key in the WHERE > clause > The 2nd SELECT is not working because the row key is in the WHERE clause, > but there is no index on col1 > The 3rd SELECT (which is the same as the 2nd SELECT) is working because > the row key is in the WHERE clause, and a secondary index is created on col1 > > > For this use case, what are the recommendations of the Cassandra community > ? > - do we need to create a secondary index for each column we want to filter > ? > - do we need to post-process (filter) the result of the query in our > application ? > - or is there another solution ? > > > Thanks. > > Jean-Armel > > > --e89a8f2343b51e093d04c7c30f31 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Hi Aaron,

Thank you for your answer.

So, I shall do post-proc= essing for selecting a row using a row key *and* applying a column level fi= lter.

Best Regards,
Jean-Armel

2012/8/21 aaron morton <aaron@thelastpickle.com>
- do we need to post-process (filter) the result of the query in our a= pplication ?
Thats the one :)

Righ= t now the code paths don't exist to select a row using a row key *and* = apply a column level filter. The RPC API does not work that way and I'm= not sure if this is something that is planned for CQL.=A0

Cheers

<= div style=3D"word-wrap:break-word">
-----------------
Aaron Morton
Freelance Deve= loper
@aaronmorton

On 20/08/2012, at 6:33 PM, Jean-Armel Luce <jaluce06@gmail.com> wrote:=


Hello,

I am using Cassandra = 1.1.1 and CQL3.

Could you tell me what is the best strategy for retrieving a row using = a condition on a row key (operator =3D) and also filter on a 2nd column?
For example, I create a=A0 table named "testwhere" with a ro= w key on column "mykey" and 2 other columns "col1" and = "col2".
=A0
I would like to retrieve the row with the key 'key1' only i= f col1 =3D 'abcd'=A0
I send the request=A0 SELECT mykey, col1 f= rom testwhere where mykey =3D 'key1' and col1 =3D 'abcd';As you can see, the 1st condition in the WHERE clause is based on the row= key.
However the request doesn't work if no secondary index is created on th= e column used in the 2nd condition of the WHERE clause. It works only if a = secondary indexed is created on this 2nd column (see below).
Does that = mean that the secondary index is used in the read path instead of the row k= ey, even if there is a condition on the row key in the WHERE clause ?

Here is an example :

jal@jal-VirtualBox:~/cassandra/apache-cassa= ndra-1.1.1/bin$ ./cqlsh -3
Connected to Test Cluster at localhost:9160.<= br>[cqlsh 2.2.0 | Cassandra 1.1.1 | CQL spec 3.0.0 | Thrift protocol 19.32.= 0]
Use HELP for help.
cqlsh> use test1;
cqlsh:test1> CREATE TABLE = testwhere (mykey varchar PRIMARY KEY,
=A0=A0=A0=A0=A0=A0=A0=A0 ...=A0 co= l1 varchar,
=A0=A0=A0=A0=A0=A0=A0=A0 ...=A0 col2 varchar);
cqlsh:test= 1> INSERT INTO testwhere (mykey, col1, col2) VALUES ('key1', = 9;abcd', 'efgh');

cqlsh:test1>=A0 SELECT mykey, col1 from testwhere where mykey =3D &#= 39;key1';
=A0mykey | col1
-------+------
=A0 key1 | abcd
cqlsh:test1>=A0 SELECT mykey, col1 from testwhere where mykey =3D '= ;key1' and col1 =3D 'abcd';
Bad Request: No indexed columns present in by-columns clause with Equal ope= rator

cqlsh:test1> CREATE INDEX col1_idx ON testwhere (col1);
= cqlsh:test1>=A0 SELECT mykey, col1 from testwhere where mykey =3D 'k= ey1' and col1 =3D 'abcd';
=A0mykey | col1
-------+------
=A0 key1 | abcd

cqlsh:test1>=


My understanding is :
The 1st SELECT is working because ther= e is only the row key in the WHERE clause
The 2nd SELECT is not working = because the row key is in the WHERE clause, but there is no index on col1 The 3rd SELECT (which is the same as the 2nd SELECT) is working because the= row key is in the WHERE clause, and a secondary index is created on col1

For this use case, what are the recommendations of the Cassandra = community ?
- do we need to create a secondary index for each column we want to filter = ?
- do we need to post-process (filter) the result of the query in our a= pplication ?
- or is there another solution ?


Thanks.

Jean-Armel


--e89a8f2343b51e093d04c7c30f31--