cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Matthew Allen <matthew.j.al...@gmail.com>
Subject Re: Multi-column range scans
Date Mon, 14 Jul 2014 23:58:41 GMT
Thanks for both your help, greatly appreciated.

We'll proceed down the path of putting the filtering into the application
logic for the time being.

Matt.


On Tue, Jul 15, 2014 at 1:20 AM, DuyHai Doan <doanduyhai@gmail.com> wrote:

> Exact Ken, I get bitten again by the semantics of composite tuples.
>
>  This kind of query won't be possible until something like wide row end
> slice predicate is available (
> https://issues.apache.org/jira/browse/CASSANDRA-6167), if it will one day
>
>
>
>
> On Mon, Jul 14, 2014 at 5:02 PM, Ken Hancock <ken.hancock@schange.com>
> wrote:
>
>> I don't think your query is doing what he wants.  Your query will
>> correctly set the starting point, but will also return larger interval_id's
>> but with lower skill_levels:
>>
>> cqlsh:test> select * from skill_count where skill='Complaints' and
>> (interval_id, skill_level) >= (1402359300000, 5);
>>
>>  skill      | interval_id   | skill_level | skill_count
>> ------------+---------------+-------------+-------------
>>  Complaints | 1402359300000 |           5 |          20
>>  Complaints | 1402359300000 |           8 |          30
>>  Complaints | 1402359300000 |          10 |           1
>>  Complaints | 1402359400000 |           2 |          10
>>  Complaints | 1402359400000 |           8 |          30
>>
>> (5 rows)
>>
>> cqlsh:test> select * from skill_count where skill='Complaints' and
>> (interval_id, skill_level) >= (1402359300000, 5) and (interval_id) <
>> (1402359900000);
>>
>>  skill      | interval_id   | skill_level | skill_count
>> ------------+---------------+-------------+-------------
>>  Complaints | 1402359300000 |           5 |          20  <- desired
>>  Complaints | 1402359300000 |           8 |          30  <- desired
>>  Complaints | 1402359300000 |          10 |           1  <- desired
>>  Complaints | 1402359400000 |           2 |          10  <- SKIP
>>  Complaints | 1402359400000 |           8 |          30  <- desired
>>
>> The query results in a discontinuous range slice so isn't supported --
>> Essentially, the client will have to read the entire range and perform
>> client-side filtering.  Whether this is efficient depends on the
>> cardinality of skill_level.
>>
>> I tried playing with the "allow filtering" cql clause, but it would
>> appear from the documentation it's very restrictive...
>>
>>
>>
>>
>>
>> On Mon, Jul 14, 2014 at 7:44 AM, DuyHai Doan <doanduyhai@gmail.com>
>> wrote:
>>
>>> or :
>>>
>>>
>>> select * from skill_count where skill='Complaints'
>>> and (interval_id,skill_level) >= (1402359300000,5)
>>> and (interval_id) < (1402359900000)
>>>
>>> Strange enough, when starting using tuple notation you'll need to stick
>>> to it even if there is only one element in the tuple
>>>
>>>
>>> On Mon, Jul 14, 2014 at 1:40 PM, DuyHai Doan <doanduyhai@gmail.com>
>>> wrote:
>>>
>>>> Sorry, I've just checked, the correct query should be:
>>>>
>>>> select * from skill_count where skill='Complaints' and
>>>> (interval_id,skill_level) >= (1402359300000,5) and
>>>> (interval_id,skill_level) < (1402359900000,11)
>>>>
>>>>
>>>> On Mon, Jul 14, 2014 at 9:45 AM, DuyHai Doan <doanduyhai@gmail.com>
>>>> wrote:
>>>>
>>>>> Hello Mathew
>>>>>
>>>>>  Since Cassandra 2.0.6 it is possible to query over composites:
>>>>> https://issues.apache.org/jira/browse/CASSANDRA-4851
>>>>>
>>>>> For your example:
>>>>>
>>>>> select * from skill_count where skill='Complaints' and
>>>>> (interval_id,skill_level) >= (1402359300000,5) and interval_id <
>>>>> 1402359900000;
>>>>>
>>>>>
>>>>> On Mon, Jul 14, 2014 at 6:09 AM, Matthew Allen <
>>>>> matthew.j.allen@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> We have a roll-up table that as follows.
>>>>>>
>>>>>> CREATE TABLE SKILL_COUNT (
>>>>>>   skill text,
>>>>>>   interval_id bigint,
>>>>>>   skill_level int,
>>>>>>   skill_count int,
>>>>>>   PRIMARY KEY (skill, interval_id, skill_level));
>>>>>>
>>>>>> Essentially,
>>>>>>   skill = a names skill i.e. "Complaints"
>>>>>>   interval_id = a rounded epoch time (15 minute intervals)
>>>>>>   skill_level = a number/rating from 1-10
>>>>>>   skill_count = the number of people with the specified skill, with
>>>>>> the specified skill level, logged in at the interval_id
>>>>>>
>>>>>> We'd like to run the following query against it
>>>>>>
>>>>>> select * from skill_count where skill='Complaints' and interval_id
>=
>>>>>> 1402359300000 and interval_id < 1402359900000 and skill_level
>= 5;
>>>>>>
>>>>>> to get a count of people with the relevant skill and level at the
>>>>>> appropriate time.  However I am getting the following message.
>>>>>>
>>>>>> Bad Request: PRIMARY KEY part skill_level cannot be restricted
>>>>>> (preceding part interval_id is either not restricted or by a non-EQ
>>>>>> relation)
>>>>>>
>>>>>> Looking at how the data is stored ...
>>>>>>
>>>>>> -------------------
>>>>>> RowKey: Complaints
>>>>>> => (name=1402359300000:2:, value=, timestamp=1405308260403000)
>>>>>> => (name=1402359300000:2:skill_count, value=0000000a,
>>>>>> timestamp=1405308260403000)
>>>>>> => (name=1402359300000:5:, value=, timestamp=1405308260403001)
>>>>>> => (name=1402359300000:5:skill_count, value=00000014,
>>>>>> timestamp=1405308260403001)
>>>>>> => (name=1402359300000:8:, value=, timestamp=1405308260419000)
>>>>>> => (name=1402359300000:8:skill_count, value=0000001e,
>>>>>> timestamp=1405308260419000)
>>>>>> => (name=1402359300000:10:, value=, timestamp=1405308260419001)
>>>>>> => (name=1402359300000:10:skill_count, value=00000001,
>>>>>> timestamp=1405308260419001)
>>>>>>
>>>>>> Should cassandra be able to allow for an extra level of filtering
?
>>>>>> or is this something that should be performed from within the application.
>>>>>>
>>>>>> We have a solution working in Oracle, but would like to store this
>>>>>> data in Cassandra, as all the other data that this solution relies
on
>>>>>> already sits within Cassandra.
>>>>>>
>>>>>> Appreciate any guidance on this matter.
>>>>>>
>>>>>> Matt
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>>
>> --
>> *Ken Hancock *| System Architect, Advanced Advertising
>> SeaChange International
>> 50 Nagog Park
>> Acton, Massachusetts 01720
>> ken.hancock@schange.com | www.schange.com | NASDAQ:SEAC
>> <http://www.schange.com/en-US/Company/InvestorRelations.aspx>
>> Office: +1 (978) 889-3329 | [image: Google Talk:] ken.hancock@schange.com
>>  | [image: Skype:]hancockks | [image: Yahoo IM:]hancockks [image:
>> LinkedIn] <http://www.linkedin.com/in/kenhancock>
>>
>> [image: SeaChange International]
>>  <http://www.schange.com/>This e-mail and any attachments may contain
>> information which is SeaChange International confidential. The information
>> enclosed is intended only for the addressees herein and may not be copied
>> or forwarded without permission from SeaChange International.
>>
>
>

Mime
View raw message