incubator-cassandra-user mailing list archives

From Sylvain Lebresne <sylv...@datastax.com>
Subject Re: timeuuid and cql3 query
Date Wed, 19 Jun 2013 18:15:29 GMT
Part of this is a bug, namely
https://issues.apache.org/jira/browse/CASSANDRA-5666. In summary, when ts
is the partition key (as in count4), CQL3 should not accept: ts >
minTimeuuid('2013-06-17 22:36:16') and ts <
minTimeuuid('2013-06-20 22:44:02'), because it does not know how to handle
it properly. What it should support is token(ts) >
token(minTimeuuid('2013-06-17 22:36:16')) and token(ts) <
token(minTimeuuid('2013-06-20 22:44:02')). And that is different, because
tokens always sort by bytes, and comparing timeuuids by bytes does not
yield a time-based ordering.
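
To illustrate why byte order and time order disagree for a timeuuid, here is
a minimal Python sketch. It is not Cassandra code: timeuuid_from_datetime is
a hypothetical helper that builds a version 1 UUID with clock sequence and
node set to 0 (an assumption, and not the same values minTimeuuid uses), and
the dates are treated as UTC. It relies only on the standard version 1 UUID
layout, in which the low 32 bits of the timestamp are stored first.

```python
import uuid
from datetime import datetime, timedelta, timezone

# 100 ns intervals between the UUID epoch (1582-10-15) and the Unix epoch.
UUID_EPOCH_OFFSET = 0x01B21DD213814000
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def timeuuid_from_datetime(dt, clock_seq=0, node=0):
    """Hypothetical helper: build a version 1 UUID whose timestamp is dt."""
    # Exact integer microseconds since the Unix epoch (no float rounding).
    micros = (dt - UNIX_EPOCH) // timedelta(microseconds=1)
    ts = micros * 10 + UUID_EPOCH_OFFSET  # 60-bit count of 100 ns ticks
    time_low = ts & 0xFFFFFFFF            # low 32 bits come first in the bytes
    time_mid = (ts >> 32) & 0xFFFF
    time_hi_version = ((ts >> 48) & 0x0FFF) | 0x1000         # version 1
    clock_seq_hi_variant = ((clock_seq >> 8) & 0x3F) | 0x80  # RFC 4122 variant
    clock_seq_low = clock_seq & 0xFF
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version,
                             clock_seq_hi_variant, clock_seq_low, node))


# Twenty timeuuids, one minute apart, created in increasing time order.
start = datetime(2013, 6, 18, 22, 36, 16, tzinfo=timezone.utc)
ids = [timeuuid_from_datetime(start + timedelta(minutes=i)) for i in range(20)]

by_time = sorted(ids, key=lambda u: u.time)    # order by embedded timestamp
by_bytes = sorted(ids, key=lambda u: u.bytes)  # unsigned byte-wise order

print("time order == insertion order:", by_time == ids)
print("byte order == time order:     ", by_bytes == by_time)
```

The low 32 bits of the timestamp wrap roughly every 429.5 seconds
(2^32 x 100 ns), so over a 20 minute span a later timeuuid can start with
smaller bytes than an earlier one, and the second comparison comes out False.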

Long story short, using non-equality conditions on the partition key (i.e.
the first part of your primary key) is generally not advised. Or to put it
another way, the use of the byte-ordered partitioner is discouraged. But
if you still want to use the ordered partitioner and do range queries on
the partition key, do not use a timeuuid, because the ordering that the
partitioner enforces will not be a meaningful one (due to the timeuuid
layout).

--
Sylvain



On Wed, Jun 19, 2013 at 7:04 PM, Ryan, Brent <BRyan@cvent.com> wrote:

>  Note that it seems to work when you structure your schema in this
> example below, BUT this is a problem because all of my data will wind up
> hitting a single node in my cassandra cluster because the partitioning key
> is "counter" and that isn't unique enough.  I was hoping that I wasn't
> going to need to build up my own "sharding" scheme as this blog talks about
> (http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra)
> because this becomes much harder for other clients to integrate with
> because they now need to know how my data is structured in order to get it
> out.
>
>  CREATE TABLE count5 (
>   counter text,
>   ts timeuuid,
>   key1 text,
>   value int,
>   PRIMARY KEY (counter, ts)
> ) WITH
>   bloom_filter_fp_chance=0.010000 AND
>   caching='KEYS_ONLY' AND
>   comment='' AND
>   dclocal_read_repair_chance=0.000000 AND
>   gc_grace_seconds=864000 AND
>   read_repair_chance=0.100000 AND
>   replicate_on_write='true' AND
>   populate_io_cache_on_flush='false' AND
>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>   compression={'sstable_compression': 'SnappyCompressor'};
>
>  cqlsh:Test> select counter,dateof(ts),key1,value from count5 where
> counter = 'test' and ts > minTimeuuid('2013-06-17 22:36:16') and ts <
> minTimeuuid('2013-06-18 22:44:02');
>
>   counter | dateof(ts)               | key1 | value
> ---------+--------------------------+------+-------
>     test | 2013-06-18 22:43:53-0400 |    1 |     1
>     test | 2013-06-18 22:43:54-0400 |    1 |     1
>     test | 2013-06-18 22:43:55-0400 |    1 |     1
>     test | 2013-06-18 22:43:56-0400 |    1 |     1
>     test | 2013-06-18 22:43:58-0400 |    1 |     1
>     test | 2013-06-18 22:43:58-0400 |    1 |     1
>     test | 2013-06-18 22:43:59-0400 |    1 |     1
>     test | 2013-06-18 22:44:00-0400 |    1 |     1
>     test | 2013-06-18 22:44:01-0400 |    1 |     1
>
>  cqlsh:Test> select counter,dateof(ts),key1,value from count5 where
> counter = 'test' and ts > minTimeuuid('2013-06-17 22:36:16') and ts <
> minTimeuuid('2013-06-20 22:44:02');
>
>   counter | dateof(ts)               | key1 | value
> ---------+--------------------------+------+-------
>     test | 2013-06-18 22:43:53-0400 |    1 |     1
>     test | 2013-06-18 22:43:54-0400 |    1 |     1
>     test | 2013-06-18 22:43:55-0400 |    1 |     1
>     test | 2013-06-18 22:43:56-0400 |    1 |     1
>     test | 2013-06-18 22:43:58-0400 |    1 |     1
>     test | 2013-06-18 22:43:58-0400 |    1 |     1
>     test | 2013-06-18 22:43:59-0400 |    1 |     1
>     test | 2013-06-18 22:44:00-0400 |    1 |     1
>     test | 2013-06-18 22:44:01-0400 |    1 |     1
>     test | 2013-06-18 22:44:02-0400 |    1 |     1
>     test | 2013-06-18 22:44:02-0400 |    1 |     1
>     test | 2013-06-18 22:44:03-0400 |    1 |     1
>     test | 2013-06-18 22:44:04-0400 |    1 |     1
>     test | 2013-06-18 22:44:05-0400 |    1 |     1
>     test | 2013-06-18 22:44:06-0400 |    1 |     1
>
>
>   From: <Ryan>, Brent Ryan <bryan@cvent.com>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Wednesday, June 19, 2013 12:56 PM
>
> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Subject: Re: timeuuid and cql3 query
>
>   Here's an example of that not working:
>
>  cqlsh:Test> desc table count4;
>
>  CREATE TABLE count4 (
>   ts timeuuid,
>   counter text,
>   key1 text,
>   value int,
>   PRIMARY KEY (ts, counter)
> ) WITH
>   bloom_filter_fp_chance=0.010000 AND
>   caching='KEYS_ONLY' AND
>   comment='' AND
>   dclocal_read_repair_chance=0.000000 AND
>   gc_grace_seconds=864000 AND
>   read_repair_chance=0.100000 AND
>   replicate_on_write='true' AND
>   populate_io_cache_on_flush='false' AND
>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>   compression={'sstable_compression': 'SnappyCompressor'};
>
>  cqlsh:Test> select counter,dateof(ts),key1,value from count4;
>
>   counter | dateof(ts)               | key1 | value
> ---------+--------------------------+------+-------
>     test | 2013-06-18 22:36:16-0400 |    1 |     1
>     test | 2013-06-18 22:36:18-0400 |    1 |     1
>     test | 2013-06-18 22:36:18-0400 |    1 |     1
>     test | 2013-06-18 22:36:18-0400 |    1 |     1
>     test | 2013-06-18 22:36:19-0400 |    1 |     1
>     test | 2013-06-18 22:36:19-0400 |    1 |     1
>     test | 2013-06-18 22:36:20-0400 |    1 |     1
>     test | 2013-06-18 22:36:20-0400 |    1 |     1
>     test | 2013-06-18 22:36:21-0400 |    1 |     1
>     test | 2013-06-18 22:36:21-0400 |    1 |     1
>     test | 2013-06-18 22:36:22-0400 |    1 |     1
>     test | 2013-06-18 22:36:22-0400 |    1 |     1
>     test | 2013-06-18 22:36:23-0400 |    1 |     1
>     test | 2013-06-18 22:36:23-0400 |    1 |     1
>     test | 2013-06-18 22:36:25-0400 |    1 |     1
>     test | 2013-06-18 22:36:27-0400 |    1 |     1
>     test | 2013-06-18 22:36:28-0400 |    1 |     1
>
>  cqlsh:Statistics> select counter,dateof(ts),key1,value from count4 where
> ts > minTimeuuid('2013-06-17 22:36:16') and ts < minTimeuuid('2013-06-19
> 22:36:20');
> Bad Request: 2 Start key must sort before (or equal to) finish key in your
> partitioner!
>
>
>
>  Any ideas?  Seems like a bug to me, right?
>
>  Brent
>
>   From: <Ryan>, Brent Ryan <bryan@cvent.com>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Wednesday, June 19, 2013 12:47 PM
> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Subject: Re: timeuuid and cql3 query
>
>   Tyler,
>
>  You're recommending this schema instead, correct?
>
>  CREATE TABLE count3 (
>   counter text,
>   ts timeuuid,
>   key1 text,
>   value int,
>   PRIMARY KEY (ts, counter)
> )
>
>  I believe I tried this as well and ran into similar problems but I'll
> try it again.  I'm using the "ByteOrderedPartitioner" if that helps with
> the latest version of DSE community edition which I believe is Cassandra
> 1.2.3.
>
>
>  Thanks,
> Brent
>
>
>   From: Tyler Hobbs <tyler@datastax.com>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Wednesday, June 19, 2013 11:00 AM
> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Subject: Re: timeuuid and cql3 query
>
>
> On Wed, Jun 19, 2013 at 8:08 AM, Ryan, Brent <BRyan@cvent.com> wrote:
>
>>
>>  CREATE TABLE count3 (
>>   counter text,
>>   ts timeuuid,
>>   key1 text,
>>   value int,
>>   PRIMARY KEY ((counter, ts))
>> )
>>
>
> Instead of doing a composite partition key, remove a set of parens and let
> ts be your clustering key.  That will cause cql rows to be stored in sorted
> order by the ts column (for a given value of "counter") and allow you to do
> the kind of query you're looking for.
>
>
> --
> Tyler Hobbs
> DataStax <http://datastax.com/>
>
