cassandra-user mailing list archives

From Dave Brosius <dbros...@mebigfatguy.com>
Subject Re: Storing bi-temporal data in Cassandra
Date Sat, 14 Feb 2015 23:29:01 GMT
As you point out, there's no real node-level performance problem with 
your query. This is a limitation of CQL: it wants to slice one contiguous 
section of a partition's row (no matter how big that section is). In your 
case, you are asking it to slice multiple sections of a partition's row, 
which currently isn't supported.
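
To illustrate, here is a rough Python model of one partition (not Cassandra 
code). It assumes the rows are kept sorted by (event_time, transaction_time), 
uses small integers in place of timestamps, and uses ascending order for 
simplicity; the helper names are made up for this example.

```python
# Hypothetical model of one partition: rows kept sorted by the clustering
# columns (event_time, transaction_time). Small integers stand in for timestamps.
partition = sorted(
    (event_time, transaction_time)
    for event_time in (1, 2, 3)
    for transaction_time in (1, 2, 3)
)


def matching_positions(rows, predicate):
    """Return the positions in the sorted partition that satisfy predicate."""
    return [i for i, row in enumerate(rows) if predicate(row)]


def is_single_slice(positions):
    """True when the positions form one unbroken run, i.e. a single slice."""
    return all(b - a == 1 for a, b in zip(positions, positions[1:]))


# Range on the first clustering column only: the matches sit next to each other.
event_only = matching_positions(partition, lambda r: 1 <= r[0] < 3)

# Ranges on both clustering columns: the matches are split into separate runs.
both_ranges = matching_positions(
    partition, lambda r: 1 <= r[0] < 3 and r[1] < 3
)

print(event_only, is_single_slice(event_only))
print(both_ranges, is_single_slice(both_ranges))
```

With the range on event_time alone the matching rows form one run; adding 
the range on transaction_time leaves a gap inside that run, so it is no 
longer one slice.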

It may seem silly that this is the case, since in your example it would 
certainly be useful and not too difficult. The problem is that, if range 
queries on clustering keys were allowed anywhere, an arbitrary query could 
wind up requiring n-depth slicing of that partition's row.

At present, you can either duplicate the data using the other clustering 
key (transaction_time) as the primary clusterer for this use case, or omit 
the 3rd criterion (transaction_time = 'xxxx') from the query, fetch all 
the range query results, and filter on the client.
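
A minimal sketch of the client-side filter in Python. It assumes the rows 
from the event_time range query have already been fetched into plain dicts 
(the sample rows below are made up, and no driver call is shown). Keeping 
only the newest version per event_time is my reading of "state as of", so 
adjust that if you want every version.

```python
from datetime import datetime


def state_as_of(rows, as_of):
    """Keep, per event_time, the newest row recorded strictly before as_of.

    Assumption: "state as of" means the latest version of each event that was
    known before the given transaction time.
    """
    latest = {}
    for row in rows:
        if row["transaction_time"] >= as_of:
            continue  # recorded at or after the cut-off, so not visible yet
        current = latest.get(row["event_time"])
        if current is None or row["transaction_time"] > current["transaction_time"]:
            latest[row["event_time"]] = row
    # Newest events first, matching the table's DESC clustering order.
    return [latest[event] for event in sorted(latest, reverse=True)]


# Made-up rows standing in for the result of the event_time range query.
rows = [
    {"event_time": datetime(2015, 1, 1, 12, 0),
     "transaction_time": datetime(2015, 1, 1, 12, 5), "temperature": "70"},
    {"event_time": datetime(2015, 1, 1, 12, 0),
     "transaction_time": datetime(2015, 1, 2, 9, 0), "temperature": "72"},
    {"event_time": datetime(2015, 1, 1, 6, 0),
     "transaction_time": datetime(2015, 1, 1, 6, 1), "temperature": "65"},
    {"event_time": datetime(2015, 1, 1, 6, 0),
     "transaction_time": datetime(2015, 1, 1, 8, 0), "temperature": "66"},
]

result = state_as_of(rows, as_of=datetime(2015, 1, 2, 0, 0))
for row in result:
    print(row["event_time"], row["temperature"])
```

The trade-off is that every version in the event_time range crosses the 
wire before it is filtered, which is fine for small ranges and less so for 
heavily corrected data.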

hth,
dave


On 02/14/2015 06:05 PM, Raj N wrote:
> I don't think that solves my problem. The question really is why 
> can't we use ranges for both time columns when they are part of the 
> primary key. They are on 1 row after all. Is this just a CQL limitation?
>
> -Raj
>
> On Sat, Feb 14, 2015 at 3:35 AM, DuyHai Doan <doanduyhai@gmail.com 
> <mailto:doanduyhai@gmail.com>> wrote:
>
>     "I am trying to get the state as of a particular transaction_time"
>
>      --> In that case you should probably define your primary key in
>     another order for clustering columns
>
>     PRIMARY KEY (weatherstation_id,transaction_time,event_time)
>
>     Then, select * from temperatures where weatherstation_id = 'foo'
>     and event_time >= '2015-01-01 00:00:00' and event_time <
>     '2015-01-02 00:00:00' and transaction_time = 'xxxx'
>
>
>
>     On Sat, Feb 14, 2015 at 3:06 AM, Raj N <raj.cassandra@gmail.com
>     <mailto:raj.cassandra@gmail.com>> wrote:
>
>         Has anyone designed a bi-temporal table in Cassandra? Doesn't
>         look like I can do this using CQL for now. Taking the time
>         series example from well known modeling tutorials in Cassandra -
>
>         CREATE TABLE temperatures (
>         weatherstation_id text,
>         event_time timestamp,
>         temperature text,
>         PRIMARY KEY (weatherstation_id,event_time),
>         ) WITH CLUSTERING ORDER BY (event_time DESC);
>
>         If I add another column transaction_time
>
>         CREATE TABLE temperatures (
>         weatherstation_id text,
>         event_time timestamp,
>         transaction_time timestamp,
>         temperature text,
>         PRIMARY KEY (weatherstation_id,event_time,transaction_time),
>         ) WITH CLUSTERING ORDER BY (event_time DESC, transaction_time
>         DESC);
>
>         If I try to run a query using the following CQL, it throws an
>         error -
>
>         select * from temperatures where weatherstation_id = 'foo' and
>         event_time >= '2015-01-01 00:00:00' and event_time <
>         '2015-01-02 00:00:00' and transaction_time < '2015-01-02 00:00:00'
>
>         It works if I use an equals clause for the event_time. I am
>         trying to get the state as of a particular transaction_time
>
>         -Raj
>
>
>

