incubator-cassandra-user mailing list archives

From Batranut Bogdan <batra...@yahoo.com>
Subject Re: Cassandra slow on some reads
Date Fri, 14 Mar 2014 15:15:29 GMT
Hello,

I can't go this way: this CF (column family) will be used for time-range queries.



On Friday, March 14, 2014 5:10 PM, "Laing, Michael" <michael.laing@nytimes.com> wrote:
 
If you do not need to do range queries on your 'timestam' (ts) column - and if you can change
your schema (big if...), then you could move 'timestam' into the partition key like this (using
your notation):

PK((key String, timestam int), column1 string, col2 string), list1, list2, list3.


Now the select query you showed should execute more consistently.
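
To make the two layouts concrete, here is a rough CQL sketch of what the original and the suggested table definitions might look like. The keyspace and table names are hypothetical, `ts` stands in for the 'timestam' column, the list element types are assumed to be text (the thread does not say), and the literal values are made up.

```cql
-- Hypothetical names; list element types (text) are an assumption.

-- Original layout: key alone is the partition key;
-- ts, column1 and col2 are clustering columns.
CREATE TABLE demo.readings_by_key (
    key     text,
    ts      int,
    column1 text,
    col2    text,
    list1   list<text>,
    list2   list<text>,
    list3   list<text>,
    PRIMARY KEY (key, ts, column1, col2)
);

-- Suggested layout: (key, ts) together form the partition key.
CREATE TABLE demo.readings_by_key_ts (
    key     text,
    ts      int,
    column1 text,
    col2    text,
    list1   list<text>,
    list2   list<text>,
    list3   list<text>,
    PRIMARY KEY ((key, ts), column1, col2)
);

-- Equality on key and ts works against either table.
SELECT * FROM demo.readings_by_key_ts
WHERE key = 'some string' AND ts = 1394809200;

-- A time-range slice within one key is only a single-partition
-- read in the original layout.
SELECT * FROM demo.readings_by_key
WHERE key = 'some string' AND ts >= 1394800000 AND ts < 1394809200;
```

With the composite partition key, each (key, ts) pair becomes its own small partition, which is why the equality query should behave more uniformly. The cost is that a range over ts for one key is no longer a single-partition read.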


But of course something else might break...!

ml



On Fri, Mar 14, 2014 at 8:50 AM, Batranut Bogdan <batranub@yahoo.com> wrote:

>Hello all,
>
>
>Here is the environment:
>
>
>I have a 6 node Cassandra cluster. On each node I have:
>- 32 GB RAM
>- 24 GB of that RAM for Cassandra
>- ~150 - 200 MB/s disk speed
>- Tomcat 6 with an Axis2 web service that uses the DataStax Java driver to make
>asynchronous reads / writes
>- replication factor for the keyspace is 3
>
>(I know that is a lot of heap, but I also have write-heavy tasks and I want them to get into memory fast.)
>
>All nodes are in the same data center.
>The clients that read / write are in the same data center, so the network is Gigabit.
>
>
>
>The table structure is like this: PK(key String, timestam int, column1 string, col2 string), list1, list2, list3.
>There are about 300 million individual keys.
>There are about 100 timestamps for each key now, so the rows will get wider as time passes.
>
>
>I am using the DataStax Java driver to query the cluster.
>
>
>I have ~450 queries that are like this: SELECT * FROM table WHERE key = 'some string' AND ts = some value; where "some value" is close to the present time.
>
>
>The problem:
>
>
>About 10 - 20 % of these queries take more than 5 seconds to execute; in fact, the majority of those take around 10 seconds.
>When investigating, I saw that if I get a slow response and redo the query, it finishes in 8 - 10 milliseconds, like the rest of my queries.
>Using JConsole, I could not see any spikes in CPU / memory when executing the queries. The rise in resource consumption is very small on all nodes in the cluster. I would expect such delays to come with a BIG increase in resource consumption.
>
>
>Any comments will be appreciated.
>
>
>Thank you.
>
>
> 
>
>
>
>