incubator-cassandra-user mailing list archives

From Aiman Parvaiz <ai...@grapheffect.com>
Subject Re: Cassandra performance decreases drastically with increase in data size.
Date Fri, 31 May 2013 06:47:25 GMT
I believe you should roll out more nodes as a temporary fix to your problem. 400GB on all nodes
means (as correctly mentioned in other mails in this thread) that you are spending more time on
GC. Check out the second comment at the link below, by Aaron Morton. He says that more than 300GB
can be problematic. That post is about an older version of Cassandra, but I believe the concept
still holds true:

http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Is-it-safe-to-stop-a-read-repair-and-any-suggestion-on-speeding-up-repairs-td6607367.html
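
As a rough illustration of the "more nodes" suggestion, here is a minimal sketch of the arithmetic. It assumes the 400GB figure is per node, that data spreads evenly across nodes, and that the replication factor stays the same; the function name and the use of 300GB as a hard cap are only illustrative, not something from Cassandra itself.

```python
import math

def nodes_needed(total_gb: float, max_per_node_gb: float) -> int:
    """Smallest node count that keeps the average per-node load at or below the cap."""
    return math.ceil(total_gb / max_per_node_gb)

# Figures from this thread (assumption: 400GB is the load on each of the 3 nodes).
current_nodes = 3
per_node_gb = 400
cap_gb = 300  # rule-of-thumb ceiling from the linked comment, treated here as a cap

total_gb = current_nodes * per_node_gb
target = nodes_needed(total_gb, cap_gb)
print(f"total={total_gb}GB, nodes needed={target}, avg per node={total_gb / target:.0f}GB")
```

Under these assumptions a 4th node only just reaches the 300GB mark, so going beyond that leaves some headroom for growth.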

Thanks

On May 29, 2013, at 9:32 PM, srmore <comomore@gmail.com> wrote:

> Hello,
> I am observing that my performance decreases drastically as my data size grows.
> I have a 3 node cluster with 64 GB of RAM and my data size is around 400GB on all the nodes.
> I also see that when I restart Cassandra the performance goes back to normal and then
> starts decreasing again after some time.
> 
> Some hunting led me to this page http://wiki.apache.org/cassandra/LargeDataSetConsiderations
> which talks about large data sets and explains that it might be because I am going through
> multiple layers of OS cache, but it does not tell me how to tune it.
> 
> So, my question is: are there any optimizations that I can do to handle these large datasets?
> 
> And why does my performance go back to normal when I restart Cassandra?
> 
> Thanks !

