cassandra-user mailing list archives

From Check Peck <comptechge...@gmail.com>
Subject Re: Cassandra Data Model design
Date Wed, 17 Sep 2014 22:35:32 GMT
It takes more than 50 seconds to return 500 records when I run the query from
cqlsh, not from application code, which is why I say it is pretty slow.

On Wed, Sep 17, 2014 at 3:17 PM, Hao Cheng <bryan@critica.io> wrote:

> How slow is slow? Regardless of the data model question, in my experience
> 500 rows of relatively light content should be lightning fast. Looking at
> my performance results on a test cluster of 3x r3.large AWS instances, we
> reach an op rate on Cassandra's stress test of at least 1000 operations per
> second and on average 7500 operations per second over the stress test data
> set.
>
> More broadly, it seems like you would benefit from either deltas (only
> retrieve new data) or something like paging (only retrieve currently
> relevant data), although it's really difficult to say without more
> information.
>
> On Wed, Sep 17, 2014 at 1:01 PM, Check Peck <comptechgeeky@gmail.com>
> wrote:
>
>> I have recently started working with Cassandra. We have a Cassandra cluster
>> running DSE 4.0 with vnodes enabled. We have tables like the ones below.
>>
>> Below is my first table:
>>
>>     CREATE TABLE customers (
>>       customer_id int PRIMARY KEY,
>>       last_modified_date timeuuid,
>>       customer_value text
>>     )
>>
>> This is the current read query on the above table, since we need to fetch
>> everything from it and load it into our application memory every x
>> minutes.
>>
>>     select customer_id, customer_value from datakeyspace.customers;
>>
>> We have a second table like this:
>>
>>     CREATE TABLE client_data (
>>       client_name text PRIMARY KEY,
>>       client_id text,
>>       creation_date timestamp,
>>       is_valid int,
>>       last_modified_date timestamp
>>     )
>>
>> Right now the above table has 500 records, and all of them have the
>> "is_valid" column set to 1. We need to fetch everything from this table
>> and load it into our application memory every x minutes, so the query
>> below returns all 500 records, since every record has is_valid set to 1.
>>
>>     select client_name, client_id from datakeyspace.client_data where is_valid=1;
>>
>> Since our cluster has vnodes enabled, the above query pattern is not
>> efficient at all, and it takes a long time to get the data from
>> Cassandra. We read from these tables with consistency level QUORUM.
>>
>> Is there any possibility of improving our data model?
>>
>> Any suggestions will be greatly appreciated.
>>
>
>
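
A minimal sketch of the paging idea quoted above (only retrieve currently
relevant data), applied to the customers table from the original message. It
assumes a page size of 100 and uses 1234 as a placeholder for the last
customer_id returned by the previous page; both values are illustrative and do
not come from the thread. Whether this helps with the latency described here
has not been measured.

    -- First page: rows are returned in token order of the partition key.
    SELECT customer_id, customer_value
      FROM datakeyspace.customers
     LIMIT 100;

    -- Next page: continue after the last customer_id seen on the previous
    -- page (1234 is a hypothetical value).
    SELECT customer_id, customer_value
      FROM datakeyspace.customers
     WHERE token(customer_id) > token(1234)
     LIMIT 100;

The second query would be repeated with the last key of each page until a page
comes back with fewer than 100 rows.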
