cassandra-user mailing list archives

From Anuj Wadehra <anujw_2...@yahoo.co.in>
Subject Re: Range scans
Date Thu, 19 Nov 2015 17:16:28 GMT
Chandra,


I think you are trying to implement a time series data pattern using a secondary index and
one column per row. A much better solution would be to partition the data on some logical
row key. If that is not available, the partition key could be hour/day plus a bucket id (to
prevent hot spots). You should use a timeuuid as the clustering key; you will then be able to
query data by clustering key (time) most efficiently.
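
As a rough sketch of that layout (the table name events_by_day, the columns day, bucket and event_id, and the bucket value 0 are all hypothetical, chosen only for illustration; c is carried over from your test table), it could look something like this:

```cql
-- Hypothetical time series table: one partition per (day, bucket),
-- with rows inside each partition ordered by the timeuuid clustering key.
CREATE TABLE events_by_day (
    day      text,      -- e.g. '2015-11-19'; could be an hour instead
    bucket   int,       -- assumed bucket id, spreads one day over several partitions
    event_id timeuuid,  -- clustering key, carries the event time
    c        varint,
    PRIMARY KEY ((day, bucket), event_id)
);

-- The writer picks the day and the bucket; now() generates the timeuuid.
INSERT INTO events_by_day (day, bucket, event_id, c)
VALUES ('2015-11-19', 0, now(), 1000000);

-- Time range scan inside one partition, no secondary index needed.
SELECT event_id, c
  FROM events_by_day
 WHERE day = '2015-11-19'
   AND bucket = 0
   AND event_id >= minTimeuuid('2015-11-19 09:00+0000')
   AND event_id <  maxTimeuuid('2015-11-19 10:00+0000');
```

With this layout a reader has to run the SELECT once per bucket for the day in question and merge the results, so the number of buckets is a trade-off between write distribution and read fan-out.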


To understand data modelling of time series data using a clustering key, please visit https://academy.datastax.com/demos/getting-started-time-series-data-modeling


and


http://intellidzine.blogspot.in/2014/01/cassandra-data-modelling-primary-keys.html?m=1


Thanks

Anuj



From:"Anuj Wadehra" <anujw_2003@yahoo.co.in>
Date:Thu, 19 Nov, 2015 at 5:31 pm
Subject:Re: Range scans

Hi Chandra,


I will comment on some points. Someone else can take the remaining ones:


1. Secondary indexes are only useful when the index query returns rows in the hundreds. Fetching
large amounts of data through a secondary index would be very slow. Secondary indexes don't scale well.


2. The token query should be of the form:


Where token(key) > «some token»



You are again applying the token function on the right side of the comparison operator, and
that's why you are getting unexpected results.
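
A minimal sketch of that form against your test table, assuming the cluster uses Murmur3Partitioner (so tokens are 64-bit integers); the literal token values below are placeholders, not values taken from your data:

```cql
-- First page: start from the smallest Murmur3 token.
SELECT token(a), a, b, c
  FROM test
 WHERE token(a) > -9223372036854775808
 LIMIT 100;

-- Next page: continue from the last token(a) returned by the previous page.
-- 1234567890123456789 is a placeholder for that value.
SELECT token(a), a, b, c
  FROM test
 WHERE token(a) > 1234567890123456789
 LIMIT 100;
```

Keep in mind that rows come back in token order, which is the order of the hash of a and not the time order of the timeuuid values, so this is a way to walk the whole table rather than a way to select a time range.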



Thanks

Anuj




From:"Chandra Sekar KR" <chandrasekarkr@hotmail.com>
Date:Thu, 19 Nov, 2015 at 3:16 pm
Subject:Range scans

Hi,


I would like to run a range scan on timestamp column b with secondary indexes without passing
the partition key. I'm aware that Cassandra does not support range scans on secondary indexes
unless one more column (primary/secondary index) clause with an = operator is supplied.


CREATE TABLE test (
    a timeuuid PRIMARY KEY,
    b timestamp,
    c varint
);
CREATE INDEX indx_b ON test (b);


INSERT INTO TEST(a,b,c) VALUES (now(), unixTimestampOf(now()), 1000000);
INSERT INTO TEST(a,b,c) VALUES (now(), unixTimestampOf(now()), 1000001);
INSERT INTO TEST(a,b,c) VALUES (now(), unixTimestampOf(now()), 1000002);
INSERT INTO TEST(a,b,c) VALUES (now(), unixTimestampOf(now()), 1000003);
INSERT INTO TEST(a,b,c) VALUES (now(), unixTimestampOf(now()), 1000004);
INSERT INTO TEST(a,b,c) VALUES (now(), unixTimestampOf(now()), 1000005);
INSERT INTO TEST(a,b,c) VALUES (now(), unixTimestampOf(now()), 1000006);
INSERT INTO TEST(a,b,c) VALUES (now(), unixTimestampOf(now()), 1000007);
INSERT INTO TEST(a,b,c) VALUES (now(), unixTimestampOf(now()), 1000008);
INSERT INTO TEST(a,b,c) VALUES (now(), unixTimestampOf(now()), 1000009);
INSERT INTO TEST(a,b,c) VALUES (now(), unixTimestampOf(now()), 1000010);
INSERT INTO TEST(a,b,c) VALUES (now(), unixTimestampOf(now()), 1000011);
INSERT INTO TEST(a,b,c) VALUES (now(), unixTimestampOf(now()), 1000012);
INSERT INTO TEST(a,b,c) VALUES (now(), unixTimestampOf(now()), 1000013);
INSERT INTO TEST(a,b,c) VALUES (now(), unixTimestampOf(now()), 1000014);
INSERT INTO TEST(a,b,c) VALUES (now(), unixTimestampOf(now()), 1000015);
INSERT INTO TEST(a,b,c) VALUES (now(), unixTimestampOf(now()), 1000016);
INSERT INTO TEST(a,b,c) VALUES (now(), unixTimestampOf(now()), 1000017);
INSERT INTO TEST(a,b,c) VALUES (now(), unixTimestampOf(now()), 1000018);
INSERT INTO TEST(a,b,c) VALUES (now(), unixTimestampOf(now()), 1000019);
INSERT INTO TEST(a,b,c) VALUES (now(), unixTimestampOf(now()), 1000020);


Also, is there any alternate way of running range scans on column a using the TOKEN function,
similar to the query below? I tried running it, but got unexpected results.


SELECT * FROM test
  WHERE TOKEN(a) >= TOKEN(3b84a5b0-8e8d-11e5-b494-c9d29cfa4efd)
  AND TOKEN(a) <= TOKEN(3b8c94f0-8e8d-11e5-b494-c9d29cfa4efd);


Regards, Chandra Sekar KR

