incubator-cassandra-user mailing list archives

From William Katsak
Subject Provisioning/Configuration Question
Date Sat, 01 Mar 2014 14:30:05 GMT

I am doing some academic work with Cassandra 1.1.6 (I am on an older
version because of a bunch of implemented modifications that have been
in the works for a while), and I am wondering if the list can help me
resolve some questions I have.

I am running a cluster of 27 nodes with the following configuration:

Intel Atom (2 core) @ 1.8 GHz
250 GB HDD
Gigabit Ethernet

With this cluster size, I have currently loaded 135 GB of data
(replicated 3x), giving me ~15 GB of data per node. I am using Leveled
Compaction with a 5 MB SSTable size. Commitlog is on HDD, data is on SSD.
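
As a quick sanity check on the per-node figure, here is a minimal sketch of the arithmetic, assuming data is spread evenly across all nodes (the function and variable names are illustrative only):

```python
# Per-node data estimate, assuming an even spread across the ring.
raw_data_gb = 135        # unique data loaded
replication_factor = 3   # each row is stored on 3 nodes
node_count = 27


def data_per_node_gb(raw_gb, rf, nodes):
    """Total replicated data divided evenly over all nodes."""
    return raw_gb * rf / nodes


per_node = data_per_node_gb(raw_data_gb, replication_factor, node_count)
print(f"{per_node:.1f} GB per node")
```

This ignores compaction overhead and any imbalance in token ranges, so actual on-disk usage per node may differ.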

My workload is YCSB/uniform distribution/75% read-25% write.

My questions are:

- Is this a reasonable data size for this hardware?

- What should compaction throughput be set to? I am targeting a 99th
percentile latency SLA, and it seems that compaction throughput greatly
affects the 99th percentile latency. The guideline seems to be 16-32x
the insertion rate, but this increases the 99th percentile latency
dramatically. In addition, there seems to be a feedback loop: if you
insert faster, you need more compaction, but if you have more
compaction, you can't insert as fast. What is the best practice here?

- What is a reasonable operation throughput to expect from this
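
To make the 16-32x guideline from the compaction question concrete, here is a rough sketch of the arithmetic. The 0.5 MB/s per-node write rate is a made-up placeholder, not a measurement from this cluster, and the helper name is hypothetical. The result is the range one might try for compaction_throughput_mb_per_sec in cassandra.yaml, assuming the guideline is applied per node:

```python
def compaction_throughput_range(insert_mb_per_sec, low=16, high=32):
    """Return the (min, max) compaction throughput in MB/s suggested by
    the 16-32x-of-insertion-rate rule of thumb."""
    return insert_mb_per_sec * low, insert_mb_per_sec * high


# Hypothetical per-node write rate; substitute a measured value.
insert_rate = 0.5  # MB/s
low_mb, high_mb = compaction_throughput_range(insert_rate)
print(f"Suggested compaction throughput: {low_mb:.0f}-{high_mb:.0f} MB/s")
```

This only restates the rule of thumb; it does not model the feedback between insertion rate and compaction load described above.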

Sorry for the info dump, but I have been fighting with this for a while
now. I've tried to read everything I can about tuning and provisioning,
but I keep running into the same issue: I can find a load rate that
meets my 99th percentile SLA on average, but I still see large latency
spikes that don't seem to follow a pattern.

Thanks in advance for any advice you can give, even if it is just "go
read this document".


Bill Katsak
Ph.D. Student
Rutgers University
