From: Hannu Kröger
To: user@cassandra.apache.org
Date: Mon, 4 Nov 2013 11:14:12 +0200
Subject: Re: Bad Request: No indexed columns present in by-columns clause with Equal operator?

I tested the same, and it seems that you cannot run such queries with indexed columns. You probably need at least one condition with an equals sign in the where clause. I am not sure.

You can achieve your goal by defining the primary key as follows:

create table test (
    employee_id text,
    employee_name text,
    value text,
    last_modified_date timeuuid,
    primary key (employee_id, last_modified_date)
);

and then querying like this:
select * from test where last_modified_date > mintimeuuid('2013-11-03 13:33:30') and last_modified_date < maxtimeuuid('2013-11-05 13:33:45') ALLOW FILTERING;

However, that will be slow because it has to scan. That is why you need to say "ALLOW FILTERING". Without it the query is rejected with this error:
"Bad Request: Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING"

Performance when using Cassandra like this is probably far from optimal.
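
One way to avoid the scan, sketched here under assumptions that are not from this thread: keep a second table partitioned by a coarse time bucket, so that the range on last_modified_date runs inside a single partition. The table name test_by_time, the time_bucket column and the hourly bucket format are all hypothetical, and the application would have to compute the bucket itself and write every change to this table as well.

```cql
-- Hypothetical table: one partition per hour of changes.
-- time_bucket is computed by the application, e.g. '2013-11-03 13' (assumed format).
create table test_by_time (
    time_bucket text,
    last_modified_date timeuuid,
    employee_id text,
    employee_name text,
    value text,
    primary key (time_bucket, last_modified_date, employee_id)
);

-- Equality on the partition key plus a range on the first clustering column,
-- so ALLOW FILTERING should not be needed here.
select * from test_by_time
 where time_bucket = '2013-11-03 13'
   and last_modified_date > mintimeuuid('2013-11-03 13:33:30')
   and last_modified_date < maxtimeuuid('2013-11-03 13:33:45');
```

A window that crosses a bucket boundary needs one query per bucket, and a very busy bucket becomes a wide partition, so the bucket size would need tuning to the write rate.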

Hannu




2013/11/3 Techy Teck <comptechgeeky@gmail.com>
Thanks Hannu, I got your point. But in my example `employee_id` won't be larger than `32767`, so I am thinking of creating an index on these two columns:

    create index employee_name_idx on test (employee_name);
    create index last_modified_date_idx on test (last_modified_date);

The chances of us executing queries on these columns are minimal. We will run such queries very rarely, but when we do, I want the system to be capable of it.

After creating the indexes, I can execute the queries below:

    select * from test where employee_name = 'e27';

    select employee_id from test where employee_name = 'e27';
    select * from test where employee_id = '1';

But I cannot execute the query for "Give me everything that has changed within 15 minutes". I wrote it like this:

    select * from test where last_modified_date > mintimeuuid('2013-11-03 13:33:30') and last_modified_date < maxtimeuuid('2013-11-03 13:33:45');

But it doesn't run, and I always get this error:

    Bad Request: No indexed columns present in by-columns clause with Equal operator


Any thoughts on what I am doing wrong here?



On Sun, Nov 3, 2013 at 12:43 PM, Hannu Kröger <hkroger@gmail.com> wrote:
Hi,

In CQL you cannot query on a field that is not indexed. You have to either create a secondary index, or create index tables, manage them yourself and query through them. Since those keys are of high cardinality, the usual recommendation for this kind of use case is to create several tables, each holding all the data.

1) A table with employee_id as the primary key.
2) A table with last_modified_date as the primary key (use case 2)
3) A table with employee_name as the primary key (your test query with employee_name 'e27' and use cases 1 & 3.)

Then you populate all those tables with your data and use whichever table fits the query.
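
As a rough sketch of what table 3 and its population could look like (the table name test_by_name, the clustering layout and the timeuuid literal are assumptions for illustration, not something tested in this thread):

```cql
-- Hypothetical lookup table for use cases 1 and 3.
-- Rows for one employee_name are stored newest first.
create table test_by_name (
    employee_name text,
    last_modified_date timeuuid,
    employee_id text,
    value text,
    primary key (employee_name, last_modified_date, employee_id)
) with clustering order by (last_modified_date desc, employee_id asc);

-- Write each change to both tables. The timeuuid is a placeholder value that the
-- application would generate once, so that both rows carry the same timestamp.
begin batch
    insert into test (employee_id, employee_name, value, last_modified_date)
        values ('1', 'e27', 'some_value', 6d1c6d30-448c-11e3-8f96-0800200c9a66);
    insert into test_by_name (employee_name, last_modified_date, employee_id, value)
        values ('e27', 6d1c6d30-448c-11e3-8f96-0800200c9a66, '1', 'some_value');
apply batch;

-- Use case 1: everything for one employee_name.
select * from test_by_name where employee_name = 'e27';

-- Use case 3: the most recently modified employee_id for one employee_name.
select employee_id from test_by_name where employee_name = 'e27' limit 1;
```

With this layout every modification adds a new row to test_by_name, so older rows for the same employee_id remain unless the application deletes them.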

Cheers,
Hannu


2013/11/3 Techy Teck <comptechgeeky@gmail.com>
I have the table below in CQL:

create table test (
    employee_id text,
    employee_name text,
    value text,
    last_modified_date timeuuid,
    primary key (employee_id)
);

I inserted a couple of records into the table like this, which is how we will insert them in our actual use case as well:

    insert into test (employee_id, employee_name, value, last_modified_date) values ('1', 'e27', 'some_value', now());
    insert into test (employee_id, employee_name, value, last_modified_date) values ('2', 'e27', 'some_new_value', now());
    insert into test (employee_id, employee_name, value, last_modified_date) values ('3', 'e27', 'some_again_value', now());
    insert into test (employee_id, employee_name, value, last_modified_date) values ('4', 'e28', 'some_values', now());
    insert into test (employee_id, employee_name, value, last_modified_date) values ('5', 'e28', 'some_new_values', now());

Then I ran a select query for "give me all the employee_id values for employee_name `e27`":

    select employee_id from test where employee_name = 'e27';

And this is the error I am getting:

    Bad Request: No indexed columns present in by-columns clause with Equal operator
    Perhaps you meant to use CQL 2? Try using the -2 option when starting cqlsh.

Am I doing anything wrong here?

My use cases in general are:

1. Give me everything for a given employee_name.
2. Give me everything that has changed in the last 5 minutes.
3. Give me the latest employee_id for a given employee_name.

I am running Cassandra 1.2.11



