incubator-cassandra-user mailing list archives

From Igor <i...@4friends.od.ua>
Subject Re: High performance disk io
Date Wed, 22 May 2013 14:48:34 GMT
On 05/22/2013 05:41 PM, Christopher Wirt wrote:
>
> Hi Igor,
>
> Yeah, same here: 15ms for the 99th percentile is our max. We're currently
> getting one or two ms for most CFs. It goes up at peak times, which is
> what we want to avoid.
>
Our 99th percentile also goes up at peak times but stays at an acceptable level.
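
To make the numbers comparable, here is a minimal Python sketch of how a 99th percentile and a mean can be derived from raw read latencies. The nearest-rank method, the `percentile` helper and the sample values are assumptions for illustration only; they are made-up data, not our measurements, and Cassandra's own histograms may compute percentiles differently.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical read latencies in milliseconds (illustrative, not measured).
latencies_ms = [0.6] * 90 + [1.5] * 8 + [9.0, 14.0]

mean_ms = statistics.mean(latencies_ms)
p99_ms = percentile(latencies_ms, 99)

# The 15 ms ceiling mentioned in this thread, applied to the sample data.
within_limit = p99_ms <= 15.0
```

A handful of slow reads barely moves the mean but dominates the 99th percentile, which is why the two figures are tracked separately.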

> We're using Cass 1.2.4 w/vnodes and our own barebones driver on top of
> Thrift. It needed to be .NET, so Hector and Astyanax were not options.
>
Astyanax is token-aware, so we avoid extra data hops between Cassandra
nodes.
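
As a rough illustration of what token-aware routing means, here is a Python sketch. It is not Astyanax's code: the `Ring` class, the MD5-based `token_for` hash, one token per node (no vnodes) and the simple clockwise replica placement are all simplifying assumptions. The idea is only that the client hashes the row key and sends the request straight to a node that holds a replica.

```python
import bisect
import hashlib

TOKEN_SPACE = 2 ** 64


def token_for(key):
    # Illustrative hash only; Cassandra's partitioners compute tokens differently.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class Ring:
    """Hypothetical ring: one token per node, replicas placed clockwise."""

    def __init__(self, node_tokens, replication_factor):
        self.entries = sorted((tok, node) for node, tok in node_tokens.items())
        self.tokens = [tok for tok, _ in self.entries]
        self.rf = min(replication_factor, len(self.entries))

    def replicas_for_token(self, token):
        # First node whose token is >= the key's token, wrapping around the ring.
        start = bisect.bisect_left(self.tokens, token) % len(self.entries)
        return [self.entries[(start + i) % len(self.entries)][1]
                for i in range(self.rf)]

    def replicas(self, key):
        return self.replicas_for_token(token_for(key))

    def coordinator(self, key):
        # A token-aware client contacts a replica directly, so the
        # coordinator does not have to forward the read to another node.
        return self.replicas(key)[0]


# Six nodes, three replicas, evenly spaced tokens (assumed layout).
nodes = {"node%d" % i: i * (TOKEN_SPACE // 6) for i in range(6)}
ring = Ring(nodes, 3)
target = ring.coordinator("user:42")
```

A client that is not token-aware picks any node as coordinator, and that node then has to fetch the data from a replica, which adds a network hop to the read.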

> Do you use SSDs or multiple SSDs in any kind of configuration or RAID?
>

No, a single SSD per host.

> Thanks
>
> Chris
>
> *From:*Igor [mailto:igor@4friends.od.ua]
> *Sent:* 22 May 2013 15:07
> *To:* user@cassandra.apache.org
> *Subject:* Re: High performance disk io
>
> Hello
>
> What level of read performance do you expect? We have a limit of 15 ms
> for the 99th percentile, with average read latency near 0.9ms. For some
> CFs the 99th percentile is actually 2ms, for others 10ms; this depends
> on the data volume you read in each query.
>
> Tuning read performance involved cleaning up the data model, tuning
> cassandra.yaml, switching from Hector to Astyanax, and tuning OS parameters.
>
> On 05/22/2013 04:40 PM, Christopher Wirt wrote:
>
>     Hello,
>
>     We're looking at deploying a new ring where we want the best
>     possible read performance.
>
>     We've set up a cluster with 6 nodes, replication factor 3, 32GB of
>     memory, 8GB heap, 800MB keycache, each holding 40-50GB of data on
>     a 200GB SSD, with a 500GB SATA disk for OS and commitlog.
>
>     Three column families
>
>     ColFamily1 50% of the load and data
>
>     ColFamily2 35% of the load and data
>
>     ColFamily3 15% of the load and data
>
>     At the moment we are still seeing around 20% disk utilisation, and
>     occasionally as high as 40-50% on some nodes at peak time. We are
>     conducting some semi-live testing.
>
>     CPU looks fine, memory is fine, keycache hit rate is about 80%
>     (could be better, so maybe we should be increasing the keycache size?)
>
>     Anyway, we're looking into what we can do to improve this.
>
>     One conversation we are having at the moment is around the SSD disk
>     setup.
>
>     We are considering moving to have 3 smaller SSD drives and
>     spreading the data across those.
>
>     The possibilities are:
>
>     -We have a RAID0 of the smaller SSDs and hope that improves
>     performance.
>
>     Will this actually yield better throughput?
>
>     -We mount the SSDs to different directories and define multiple
>     data directories in cassandra.yaml.
>
>     Will not having a layer of RAID controller improve the throughput?
>
>     -We mount the SSDs to different column family directories and
>     have a single data directory declared in cassandra.yaml.
>
>     I think this is quite an attractive idea.
>
>     What are the drawbacks? System column families will be on the main
>     SATA?
>
>     -We don't change anything and just keep upping our keycache.
>
>     -Anything you guys can think of.
>
>     Ideas and thoughts welcome. Thanks for your time and expertise.
>
>     Chris
>

