incubator-cassandra-user mailing list archives

From "Jens Rantil" <jens.ran...@tink.se>
Subject Re: How to perform Range Queries in Cassandra
Date Sun, 06 Jul 2014 08:33:16 GMT
Rameez,


If you partition your data correctly, speed will scale roughly in proportion to the number
of nodes. But there's always an upper limit: a slow range query that executes on a single
node (using a clustering key) will always be slow.
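
As a rough sketch of what "partitioning correctly" could look like for Mike's table (the
table name testc21_by_d is hypothetical, I have not run this against his data, and picking
v1 as the clustering column is only an assumption about which range matters most):

```cql
-- Hypothetical remodel of testc21: d is the partition key, v1 the first
-- clustering column, and key comes last so that each row stays unique.
-- The remaining value columns (v5 to v10) are omitted for brevity.
CREATE TABLE testc21_by_d (
  d int,
  v1 int,
  key int,
  v2 int,
  v3 int,
  v4 int,
  PRIMARY KEY ((d), v1, key)
);

-- Read as one contiguous slice of a single partition; no ALLOW FILTERING needed.
SELECT key, v2, v3, v4 FROM testc21_by_d WHERE d = 1 AND v1 < 10;
```

Only the range on v1 is answered from the sort order here; the conditions on v2, v3 and v4
would still have to be checked afterwards, for example in the client. And if d only ever
takes one value, everything lands in one partition on one node, which is the upper limit
mentioned above.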




Cheers,

Jens
—
Sent from Mailbox

On Sun, Jul 6, 2014 at 8:04 AM, Rameez Thonnakkal <ssrameez@gmail.com>
wrote:

> Won't the performance improve significantly if you increase the number of
> nodes, even on a commodity hardware profile?
> On 5 Jul 2014 01:38, "Jens Rantil" <jens.rantil@tink.se> wrote:
>> Hi Mike,
>>
>> To get subsecond performance on your queries using _any_ database
>> you need to use proper indexing. Like Jeremy said, Solr will do this.
>>
>> If you'd like to try to solve this using Cassandra you need to learn the
>> difference between partition keys and clustering columns in your primary
>> key, and understand that you need a clustering column to do any kind of
>> range query.
>>
>> Also, COUNTs in Cassandra are generally fairly slow.
>>
>> Cheers,
>> Jens
>> —
>> Sent from Mailbox <https://www.dropbox.com/mailbox>
>>
>>
>> On Tue, Jun 24, 2014 at 10:09 AM, Mike Carter <jalooser2@gmail.com> wrote:
>>
>>> Hello!
>>>
>>>
>>> I'm a beginner in C* and I'm quite struggling with it.
>>>
>>> I’d like to measure the performance of some Cassandra range queries. The
>>> idea is to execute multidimensional range queries on Cassandra. E.g. there
>>> is a given table of 1 million rows with 10 columns and I'd like to execute
>>> some queries like “select count(*) from testable where d=1 and v1<10 and
>>> v2>20 and v3<45 and v4>70 … allow filtering”. This kind of query is very
>>> slow in C*, and as soon as the tables get bigger I get a read timeout,
>>> probably caused by long scan operations.
>>>
>>> In further tests I'd like to extend the dimensions to more than 200
>>> hundreds and the rows to 100 million, but at the moment I can’t even handle
>>> this small table. Should I reorganize the data, or is it impossible to
>>> perform such high-dimensional queries on Cassandra?
>>>
>>>
>>>
>>>
>>>
>>> The setup:
>>>
>>> Cassandra is installed on a single node with 2 TB disk space and 180 GB
>>> RAM.
>>>
>>> Connected to Test Cluster at localhost:9160.
>>>
>>> [cqlsh 4.1.1 | Cassandra 2.0.7 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
>>>
>>>
>>>
>>> Keyspace:
>>>
>>> CREATE KEYSPACE test WITH replication = {
>>>   'class': 'SimpleStrategy',
>>>   'replication_factor': '1'
>>> };
>>>
>>>
>>>
>>>
>>>
>>> Table:
>>>
>>> CREATE TABLE testc21 (
>>>   key int,
>>>   d int,
>>>   v1 int,
>>>   v10 int,
>>>   v2 int,
>>>   v3 int,
>>>   v4 int,
>>>   v5 int,
>>>   v6 int,
>>>   v7 int,
>>>   v8 int,
>>>   v9 int,
>>>   PRIMARY KEY (key)
>>> ) WITH
>>>   bloom_filter_fp_chance=0.010000 AND
>>>   caching='ROWS_ONLY' AND
>>>   comment='' AND
>>>   dclocal_read_repair_chance=0.000000 AND
>>>   gc_grace_seconds=864000 AND
>>>   index_interval=128 AND
>>>   read_repair_chance=0.100000 AND
>>>   replicate_on_write='true' AND
>>>   populate_io_cache_on_flush='false' AND
>>>   default_time_to_live=0 AND
>>>   speculative_retry='99.0PERCENTILE' AND
>>>   memtable_flush_period_in_ms=0 AND
>>>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>>>   compression={'sstable_compression': 'LZ4Compressor'};
>>>
>>>
>>>
>>> CREATE INDEX testc21_d_idx ON testc21 (d);
>>>
>>>
>>>
>>>  select * from testc21 limit 10;
>>>
>>>  key    | d | v1 | v10 | v2 | v3 | v4  | v5 | v6 | v7 | v8 | v9
>>> --------+---+----+-----+----+----+-----+----+----+----+----+-----
>>>  302602 | 1 | 56 |  55 | 26 | 45 |  67 | 75 | 25 | 50 | 26 |  54
>>>  531141 | 1 | 90 |  77 | 86 | 42 |  76 | 91 | 47 | 31 | 77 |  27
>>>  693077 | 1 | 67 |  71 | 14 | 59 | 100 | 90 | 11 | 15 |  6 |  19
>>>    4317 | 1 | 70 |  77 | 44 | 77 |  41 | 68 | 33 |  0 | 99 |  14
>>>  927961 | 1 | 15 |  97 | 95 | 80 |  35 | 36 | 45 |  8 | 11 | 100
>>>  313395 | 1 | 68 |  62 | 56 | 85 |  14 | 96 | 43 |  6 | 32 |   7
>>>  368168 | 1 |  3 |  63 | 55 | 32 |  18 | 95 | 67 | 78 | 83 |  52
>>>  671830 | 1 | 14 |  29 | 28 | 17 |  42 | 42 |  4 |  6 | 61 |  93
>>>   62693 | 1 | 26 |  48 | 15 | 22 |  73 | 94 | 86 |  4 | 66 |  63
>>>  488360 | 1 |  8 |  57 | 86 | 31 |  51 |  9 | 40 | 52 | 91 |  45
>>>
>>> Mike
>>>
>>
>>