cassandra-user mailing list archives

From Jonathan Lacefield <jlacefi...@datastax.com>
Subject Re: Cannot query secondary index
Date Mon, 09 Jun 2014 21:32:47 GMT
Hello,

  You are receiving this error because you are not passing in the Partition
Key as part of your query.  Cassandra is telling you it doesn't know which
node holds the data, and you haven't explicitly told it to search across
all your nodes.  The ALLOW FILTERING clause bypasses the need
to pass in a partition key in your query.
http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/select_r.html
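
  For example, with the table from your message, a query that names the
partition key should not need ALLOW FILTERING, since id is the partition key
and date is a clustering column.  This is only a sketch: the id value 42 is
made up, and it uses the audit table name from your CREATE TABLE rather than
asinauditing.asinaudit.

```cql
-- id is the partition key, so Cassandra can route the query to the replicas that own it.
-- date is a clustering column, so the range is read from within that single partition.
SELECT * FROM audit WHERE id = 42 AND date < '2014-05-01';
```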

  Big picture, for data modeling in Cassandra, it's advisable to model your
data based on the query access patterns and to duplicate data into tables
that represent your queries.  In this case, creating a table with a Partition
Key of date could benefit you.  Heavy use of ALLOW FILTERING could cause
performance issues within your cluster.
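
  As a rough sketch of that idea, you could partition by calendar day and
keep the full timestamp as a clustering column.  The audit_by_day table and
its day bucket column are hypothetical names for illustration, not part of
your schema, and the bucket size is an assumption you would tune to your
write volume.

```cql
-- Hypothetical query table: one partition per calendar day.
CREATE TABLE audit_by_day (
    day text,          -- assumed bucket format, e.g. '2014-04-30'
    date timestamp,    -- full event time, ordered within the day
    id bigint,         -- keeps rows with the same timestamp distinct
    PRIMARY KEY (day, date, id)
);

-- The partition key (day) is supplied, so no ALLOW FILTERING is needed.
SELECT * FROM audit_by_day
WHERE day = '2014-04-30' AND date < '2014-04-30 12:00:00+0000';
```

  With this layout, each write goes to both tables, and a range spanning
several days becomes one query per day bucket, each reading a single
partition.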

  Also, please be aware that Secondary Indexes are very different in
Cassandra-land compared to indexes in RDBMS-land.  They should be used only
when necessary, i.e. for an explicit use case.  Typically, modeling your data
so you can avoid Secondary Indexes will ensure a well-performing system and
queries.

  Here's a good intro to Cassandra data modeling:
https://www.youtube.com/watch?v=HdJlsOZVGwM

  Hope this helps.

Jonathan

Jonathan Lacefield
Solutions Architect, DataStax
(404) 822 3487
<http://www.linkedin.com/in/jlacefield>

<http://www.datastax.com/cassandrasummit14>



On Mon, Jun 9, 2014 at 5:18 PM, Redmumba <redmumba@gmail.com> wrote:

> I have a table with a timestamp column on it; however, when I try to query
> based on it, it fails saying that I must use ALLOW FILTERING--which to me,
> means it's not using the secondary index.  Table definition is (snipping out
> irrelevant parts)...
>
> CREATE TABLE audit (
>>     id bigint,
>>     date timestamp,
>> ...
>>     PRIMARY KEY (id, date)
>> );
>> CREATE INDEX date_idx ON audit (date);
>>
>
> There are other fields, but they are not relevant to this example.  The
> date is part of the primary key, and I have a secondary index on it.  When
> I run a SELECT against it, I get an error:
>
> cqlsh> SELECT * FROM asinauditing.asinaudit WHERE date < '2014-05-01';
>> Bad Request: Cannot execute this query as it might involve data filtering
>> and thus may have unpredictable performance. If you want to execute this
>> query despite the performance unpredictability, use ALLOW FILTERING
>> cqlsh> SELECT * FROM asinauditing.asinaudit WHERE date < '2014-05-01'
>> ALLOW FILTERING;
>> Request did not complete within rpc_timeout.
>>
>
> How can I force it to use the index?  I've seen rebuild_index tasks
> running, but can I verify the "health" of the index?
>
