incubator-cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Hannu Kröger <hkro...@gmail.com>
Subject Re: Bad Request: No indexed columns present in by-columns clause with Equal operator?
Date Mon, 04 Nov 2013 09:14:12 GMT
I tested the same and it seems to be so that you cannot such queries with
indexed columns. Probably you need to have at least one condition with
equal sign in the where clause. I am not sure.

You can achieve your goal by defining the primary key as follows:

create table test (
    employee_id text,
    employee_name text,
    value text,
    last_modified_date timeuuid,
    primary key (employee_id, last_modified_date)
   );

and then querying like this:
select * from test where last_modified_date > mintimeuuid('2013-11-03
13:33:30') and last_modified_date < maxtimeuuid('2013-11-05 13:33:45')
ALLOW FILTERING;

However, that will be slow because it has to do scanning. Therefore you
need to say "ALLOW FILTERING". Without that you will get a warning:
"Bad Request: Cannot execute this query as it might involve data filtering
and thus may have unpredictable performance. If you want to execute this
query despite the performance unpredictability, use ALLOW FILTERING"

The performance by using Cassandra like this is probably far from optimal.

Hannu




2013/11/3 Techy Teck <comptechgeeky@gmail.com>

> Thanks Hannu. I got your point.. But in my example `employee_id` won't be
> larger than `32767`.. So I am thinking of creating an index on these two
> columns -
>
>     create index employee_name_idx on test (employee_name);
>     create index last_modified_date_idx on test (last_modified_date);
>
> As the chances of executing the queries on above is very minimal.. Very
> rarely, we will be executing the above query but if we do, I wanted system
> to be capable of doing it.
>
> Now I can execute the below queries after creating an index -
>
>     select * from test where employee_name = 'e27';
>
>     select employee_id from test where employee_name = 'e27';
>     select * from test where employee_id = '1';
>
> But I cannot execute the below query which is - "Give me everything that
> has changed within 15 minutes" . So I wrote the below query like this -
>
>     select * from test where last_modified_date > mintimeuuid('2013-11-03
> 13:33:30') and last_modified_date < maxtimeuuid('2013-11-03 13:33:45');
>
> But it doesn't run and I always get error as  -
>
>     Bad Request: No indexed columns present in by-columns clause with
> Equal operator
>
>
> Any thoughts what wrong I am doing here?
>
>
>
> On Sun, Nov 3, 2013 at 12:43 PM, Hannu Kröger <hkroger@gmail.com> wrote:
>
>> Hi,
>>
>> You cannot query using a field that is not indexed in CQL. You have to
>> create either secondary index or create index tables and manage those
>> indexes by yourself and query using those. Since those keys are of high
>> cardinality, usually the recommendation for this kind of use cases is that
>> you create several tables with all the data.
>>
>> 1) A table with employee_id as the primary key.
>> 2) A table with last_modified_at as the primary key (use case 2)
>> 3) A table with employee_name as the primary key (your test query with
>> employee_name 'e27' and use cases 1 & 3.)
>>
>> Then you populate all those tables with your data and then you use those
>> tables depending on the query.
>>
>> Cheers,
>> Hannu
>>
>>
>>
>> 2013/11/3 Techy Teck <comptechgeeky@gmail.com>
>>
>>> I have below table in CQL-
>>>
>>> create table test (
>>>     employee_id text,
>>>     employee_name text,
>>>     value text,
>>>     last_modified_date timeuuid,
>>>     primary key (employee_id)
>>>    );
>>>
>>>
>>> I inserted couple of records in the above table like this which I will
>>> be inserting in our actual use case scenario as well-
>>>
>>>     insert into test (employee_id, employee_name, value,
>>> last_modified_date) values ('1', 'e27',  'some_value', now());
>>>     insert into test (employee_id, employee_name, value,
>>> last_modified_date) values ('2', 'e27',  'some_new_value', now());
>>>     insert into test (employee_id, employee_name, value,
>>> last_modified_date) values ('3', 'e27',  'some_again_value', now());
>>>     insert into test (employee_id, employee_name, value,
>>> last_modified_date) values ('4', 'e28',  'some_values', now());
>>>     insert into test (employee_id, employee_name, value,
>>> last_modified_date) values ('5', 'e28',  'some_new_values', now());
>>>
>>>
>>>
>>> Now I was doing select query for -  give me all the employee_id for
>>> employee_name `e27`.
>>>
>>>     select employee_id from test where employee_name = 'e27';
>>>
>>> And this is the error I am getting -
>>>
>>>     Bad Request: No indexed columns present in by-columns clause with
>>> Equal operator
>>>     Perhaps you meant to use CQL 2? Try using the -2 option when
>>> starting cqlsh.
>>>
>>>
>>> Is there anything wrong I am doing here?
>>>
>>> My use cases are in general -
>>>
>>>  1. Give me everything for any of the employee_name?
>>>  2. Give me everything for what has changed in last 5 minutes?
>>>  3. Give me the latest employee_id for any of the employee_name?
>>>
>>> I am running Cassandra 1.2.11
>>>
>>>
>>
>

Mime
View raw message