cassandra-user mailing list archives

From Redmumba <redmu...@gmail.com>
Subject Re: Cannot query secondary index
Date Mon, 09 Jun 2014 23:51:37 GMT
I've been trying to work around using "date-based tables" because I'd like
to avoid the overhead.  It seems, however, that this is just not going to
work.

So here's a question--for these date-based tables (i.e., a table per
day/week/month/whatever), how are they queried?  If I keep 60 days' worth of
auditing data, for example, I'd need to query all 60 tables--can I do that
smoothly?  Or do I have to have 60 different select statements?  Is there a
way for me to run the same query against all the tables?
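
For illustration, here is a minimal sketch of what the client-side fan-out
could look like, assuming one table per day. CQL has no UNION or cross-table
SELECT, so the client would issue one statement per table. The "audit_"
prefix, the YYYYMMDD suffix, and the helper names are hypothetical; the
keyspace name is taken from the cqlsh session quoted below.

```python
from datetime import date, timedelta


def daily_table_names(end_day, days, prefix="audit_"):
    """Return one table name per day, newest first.

    The "audit_YYYYMMDD" naming scheme is an assumption for this sketch.
    """
    names = []
    for n in range(days):
        day = end_day - timedelta(days=n)
        names.append(prefix + day.strftime("%Y%m%d"))
    return names


def build_queries(end_day, days, keyspace="asinauditing"):
    """Build one SELECT per daily table.

    Each statement restricts on the partition key (id), so no secondary
    index or ALLOW FILTERING is needed. The "?" is a bind marker to be
    filled in when the statement is prepared and executed by a driver.
    """
    return [
        "SELECT * FROM {}.{} WHERE id = ?".format(keyspace, name)
        for name in daily_table_names(end_day, days)
    ]


queries = build_queries(date(2014, 6, 9), 60)
```

The 60 statements would then be run one by one (or concurrently) and the
results merged in the application.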


On Mon, Jun 9, 2014 at 3:42 PM, Redmumba <redmumba@gmail.com> wrote:

> Ah, so the secondary indices are really secondary against the primary
> key.  That makes sense.
>
> I'm beginning to see why the whole "date-based table" approach is the only
> one I've been able to find... thanks for the quick responses, guys!
>
>
> On Mon, Jun 9, 2014 at 2:45 PM, Michal Michalski <
> michal.michalski@boxever.com> wrote:
>
>> Secondary indexes internally are just CFs that map the indexed value to
>> the row key that value belongs to, so you can only query these indexes
>> using "=", not ">", ">=" etc.
>>
>> However, your query does not require an index *IF* you provide a row key -
>> you can use "<" or ">" like you did for the date column, as long as you
>> refer to a single row. However, if you don't provide it, it's not going to
>> work.
>>
>> M.
>>
>> Kind regards,
>> Michał Michalski,
>> michal.michalski@boxever.com
>>
>>
>> On 9 June 2014 21:18, Redmumba <redmumba@gmail.com> wrote:
>>
>>> I have a table with a timestamp column on it; however, when I try to
>>> query based on it, it fails saying that I must use ALLOW FILTERING--which
>>> to me, means it's not using the secondary index.  Table definition is
>>> (snipping out irrelevant parts)...
>>>
>>> CREATE TABLE audit (
>>>>     id bigint,
>>>>     date timestamp,
>>>> ...
>>>>     PRIMARY KEY (id, date)
>>>> );
>>>> CREATE INDEX date_idx ON audit (date);
>>>>
>>>
>>> There are other fields, but they are not relevant to this example.  The
>>> date is part of the primary key, and I have a secondary index on it.  When
>>> I run a SELECT against it, I get an error:
>>>
>>> cqlsh> SELECT * FROM asinauditing.asinaudit WHERE date < '2014-05-01';
>>>> Bad Request: Cannot execute this query as it might involve data
>>>> filtering and thus may have unpredictable performance. If you want to
>>>> execute this query despite the performance unpredictability, use ALLOW
>>>> FILTERING
>>>> cqlsh> SELECT * FROM asinauditing.asinaudit WHERE date < '2014-05-01'
>>>> ALLOW FILTERING;
>>>> Request did not complete within rpc_timeout.
>>>>
>>>
>>> How can I force it to use the index?  I've seen rebuild_index tasks
>>> running, but can I verify the "health" of the index?
>>>
>>
>>
>
