cassandra-user mailing list archives

From Wei Zhu
Subject Re: High performance disk io
Date Wed, 22 May 2013 19:36:12 GMT
Without vnodes, repair -pr streams data for all the replicas of the range and repairs all
of them, so it impacts RF nodes.
With vnodes, the streaming and compaction should hit all the physical nodes. I have
heard that repair is even worse with vnodes... Test it and see how it goes.
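
To make the point above concrete, here is a minimal sketch of which nodes hold replicas of one node's primary ranges. It assumes SimpleStrategy-style placement (walk the ring clockwise and take the first RF distinct nodes) and random tokens; the node names, the token counts, and the helper functions are all hypothetical, and this is not Cassandra's own code.

```python
import random


def build_ring(nodes, tokens_per_node, seed=0):
    """Return a sorted list of (token, node) pairs with random tokens (assumption)."""
    rng = random.Random(seed)
    ring = [(rng.random(), node) for node in nodes for _ in range(tokens_per_node)]
    return sorted(ring)


def replicas_for_token(ring, start_index, rf):
    """Walk clockwise from ring[start_index] and collect the first rf distinct nodes."""
    owners = []
    for step in range(len(ring)):
        node = ring[(start_index + step) % len(ring)][1]
        if node not in owners:
            owners.append(node)
        if len(owners) == rf:
            break
    return owners


def nodes_touched_by_primary_repair(ring, node, rf):
    """Union of replica nodes over every range whose primary owner is `node`."""
    touched = set()
    for index, (_, owner) in enumerate(ring):
        if owner == node:
            touched.update(replicas_for_token(ring, index, rf))
    return touched


nodes = ["n1", "n2", "n3", "n4", "n5", "n6"]  # hypothetical 6-node ring
rf = 3

single_token_ring = build_ring(nodes, tokens_per_node=1)
vnode_ring = build_ring(nodes, tokens_per_node=256)

touched_single = nodes_touched_by_primary_repair(single_token_ring, "n1", rf)
touched_vnodes = nodes_touched_by_primary_repair(vnode_ring, "n1", rf)

# One token per node: exactly rf nodes hold replicas of n1's primary range.
print(sorted(touched_single))
# Many tokens per node: the replicas of n1's primary ranges spread much wider.
print(sorted(touched_vnodes))
```

With one token per node the set always has RF members; with many tokens per node it tends to cover the whole cluster, which is why repair load is worth testing under vnodes.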

----- Original Message -----

From: "Dean Hiller"
To: "Wei Zhu"
Sent: Wednesday, May 22, 2013 12:19:44 PM 
Subject: Re: High performance disk io 

If you are only running repair on one node, should it not skip that node? So there should
be no performance hit except when doing CL_ALL of course. We had to make a change to cassandra
or slow nodes did impact us previously. 


From: Wei Zhu
Date: Wednesday, May 22, 2013 1:16 PM

Subject: Re: High performance disk io 

For us, the biggest killer is repair and the compaction that follows repair. If you are
running vnodes, you need to test performance while repair is running.

From: "Igor"
Sent: Wednesday, May 22, 2013 7:48:34 AM 
Subject: Re: High performance disk io 

On 05/22/2013 05:41 PM, Christopher Wirt wrote: 
Hi Igor, 

Yeah, same here: 15 ms for the 99th percentile is our max. We are currently getting one or
two ms for most CFs. It goes up at peak times, which is what we want to avoid.

Our 99th percentile also goes up at peak times but stays at an acceptable level.

We’re using Cassandra 1.2.4 with vnodes and our own barebones driver on top of Thrift. It
needed to be .NET, so Hector and Astyanax were not options.

Astyanax is token-aware, so we avoid extra data hops between cassandra nodes. 

Do you use SSDs or multiple SSDs in any kind of configuration or RAID? 

No, single SSD per host 



From: Igor
Sent: 22 May 2013 15:07 
Subject: Re: High performance disk io 


What level of read performance do you expect? We have a limit of 15 ms for the 99th percentile,
with average read latency near 0.9 ms. For some CFs the 99th percentile is actually 2 ms, for
others 10 ms; this depends on the data volume you read in each query.
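
As an illustration of how an average and a 99th percentile relate to a latency limit like the one above, here is a minimal sketch using the nearest-rank method. The sample latencies are made up for the example and are not measurements from this thread; `percentile_nearest_rank` is a hypothetical helper.

```python
import math
import statistics


def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile (integer pct, 1..100) using the nearest-rank method."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank into the sorted list
    return ordered[max(rank, 1) - 1]


# Hypothetical read latencies in milliseconds: mostly fast, with a slow tail.
latencies_ms = [0.5] * 90 + [2.0] * 9 + [14.0]

mean_ms = statistics.mean(latencies_ms)
p99_ms = percentile_nearest_rank(latencies_ms, 99)
limit_ms = 15.0  # the 99th-percentile limit mentioned above

print(f"mean={mean_ms:.2f} ms, p99={p99_ms:.1f} ms, within limit: {p99_ms <= limit_ms}")
```

The average hides the tail, so a limit on the 99th percentile has to be checked separately from the mean.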

Tuning read performance involved cleaning up the data model, tuning cassandra.yaml, switching
from Hector to Astyanax, and tuning OS parameters.

On 05/22/2013 04:40 PM, Christopher Wirt wrote: 

We’re looking at deploying a new ring where we want the best possible read performance.

We’ve set up a cluster with 6 nodes, replication factor 3, 32 GB of memory, 8 GB heap, 800 MB
key cache, each holding 40-50 GB of data on a 200 GB SSD, with a 500 GB SATA disk for OS and commitlog.

Three column families 
ColFamily1 50% of the load and data 
ColFamily2 35% of the load and data 
ColFamily3 15% of the load and data 

At the moment we are still seeing around 20% disk utilisation, and occasionally as high as
40-50% on some nodes at peak time. We are conducting some semi-live testing.
CPU looks fine, memory is fine, and the key cache hit rate is about 80% (could be better, so maybe
we should be increasing the key cache size?)

Anyway, we’re looking into what we can do to improve this. 

One conversation we are having at the moment is around the SSD disk setup.

We are considering moving to 3 smaller SSD drives and spreading the data across those.

The possibilities are: 
-We have a RAID0 of the smaller SSDs and hope that improves performance.
Will this actually yield better throughput?

-We mount the SSDs to different directories and define multiple data directories in cassandra.yaml.
Will not having a layer of RAID controller improve the throughput?

-We mount the SSDs to different column family directories and have a single data directory
declared in cassandra.yaml.
We think this is quite an attractive idea.
What are the drawbacks? Will the system column families be on the main SATA disk?

-We don’t change anything and just keep upping our keycache. 
-Anything you guys can think of. 

Ideas and thoughts welcome. Thanks for your time and expertise. 

