incubator-cassandra-user mailing list archives

From Daniel Doubleday <>
Subject Re: Read Latency Degradation
Date Fri, 17 Dec 2010 10:07:18 GMT

On Dec 16, 2010, at 11:35 PM, Wayne wrote:

> I have read that read latency goes up with the total data size, but to what degree should
we expect a degradation in performance? What is the "normal" read latency range if there is
such a thing for a small slice of scol/cols? Can we really put 2TB of data on a node and get
good read latency querying data off of a handful of CFs? Any experience or explanations would
be greatly appreciated. 

If you really mean 2TB per node, I strongly advise you to perform thorough testing with real-world
column sizes and the read/write load you expect. Try to load test at least with a test
cluster / data set that represents one replication group, i.e. RF=3 -> 3 nodes, and test with
the consistency level you intend to use. Also test ring operations (repair, adding nodes, moving
nodes) while under the expected load.
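To make that kind of test comparable across runs, it helps to look at latency percentiles rather than averages. Below is a minimal sketch of a percentile-reporting read loop; `latency_percentiles` and the `read_fn` callback are hypothetical names, and the pycassa usage in the comment is just one assumed way to plug in a real client:

```python
import random
import time

def latency_percentiles(read_fn, keys, samples=1000):
    """Time `samples` random reads via read_fn and report p50/p95/p99 in ms.

    read_fn is any callable taking a row key, e.g. (assuming pycassa):
        pool = pycassa.ConnectionPool('MyKeyspace')
        cf = pycassa.ColumnFamily(pool, 'MyCF')
        read_fn = lambda k: cf.get(k)
    """
    timings = []
    for _ in range(samples):
        key = random.choice(keys)
        start = time.time()
        read_fn(key)
        timings.append((time.time() - start) * 1000.0)  # ms
    timings.sort()

    def pct(p):
        # nearest-rank percentile on the sorted timings
        return timings[min(len(timings) - 1, int(len(timings) * p))]

    return {"p50": pct(0.50), "p95": pct(0.95), "p99": pct(0.99)}
```

Run it once against warm data and once right after a restart (cold page cache); the gap between the two p99 numbers is a good indicator of how much you depend on the file system cache.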

Combined with 'a handful of CFs', I would assume you are expecting a considerable write
load. You will get massive compaction load, and with that data size the file system cache will
suffer badly. You'll need loads of RAM, and even then ...

I can only speak for 0.6, but ring management operations will become a nightmare and you
will have very long-running repairs.

The cluster's behavior changes massively with different access patterns (cold vs. warm data)
and data sizes, so you have to understand yours and test it. I think most generic load tests
are mainly marketing instruments, and I believe this is especially true for Cassandra.

I don't want to sound negative (I am a believer and don't regret our investment), but Cassandra
is no silver bullet. You really need to know what you are doing.
