incubator-cassandra-user mailing list archives

From William Katsak <wkat...@cs.rutgers.edu>
Subject Re: Provisioning/Configuration Question
Date Sat, 01 Mar 2014 20:01:31 GMT
Hi Tim,

On 03/01/2014 02:02 PM, Tim Wintle wrote:
> 137 GB would fairly easily fit in core memory on a single node these
> days, so it seems a very low amount for a 27-node cluster.
>

Note that we only have 4 GB of RAM per node, so only 1 GB of Cassandra 
heap. Are you assuming large memory servers, or am I misunderstanding you?
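
A rough sketch of the arithmetic behind that question, using the figures from the original message quoted below. The variable names are only illustrative, and the calculation ignores compression and SSTable overhead:

```python
# Figures taken from the original message quoted further down.
raw_data_gb = 135         # data loaded, before replication
replication_factor = 3
nodes = 27
ram_per_node_gb = 4
heap_per_node_gb = 1

# Data each node stores once replicas are counted.
per_node_gb = raw_data_gb * replication_factor / nodes

# Total RAM across the whole cluster, for comparison with the data set.
cluster_ram_gb = ram_per_node_gb * nodes

print(f"data per node:      {per_node_gb} GB")
print(f"data per GB of heap: {per_node_gb / heap_per_node_gb}")
print(f"cluster-wide RAM:   {cluster_ram_gb} GB")
```

On these numbers each node holds several times more data than it has RAM, and even the unreplicated data set is larger than the memory of the whole cluster combined.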

> Off the top of my head: would 99th percentile latency be improved by
> using replication factor 5, assuming you are doing quorum operations?
>

We are currently analyzing the case with reads/writes at consistency 
level ONE, so I don't think increasing the replication factor will help 
us right now.
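
To make the distinction concrete, here is a minimal sketch of how many replica responses a coordinator waits for at each level. The helper name is hypothetical; the quorum formula is the standard one (half the replication factor, rounded down, plus one):

```python
def quorum(replication_factor: int) -> int:
    """Replica responses required at consistency level QUORUM."""
    return replication_factor // 2 + 1


# At consistency level ONE a single response is enough,
# whatever the replication factor is.
RESPONSES_AT_ONE = 1

for rf in (3, 5):
    print(f"RF={rf}: QUORUM waits for {quorum(rf)}, ONE waits for {RESPONSES_AT_ONE}")
```

Raising the replication factor changes the QUORUM count but leaves the count at ONE untouched.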

I found this doc last night:

http://software.intel.com/sites/default/files/Configuration_and_Deployment_Guide_for_Cassandra_on_IA.pdf

The numbers that they quote for data size seem to be quite low as well.

Thoughts?

-Bill

> Sent from my phone
>
> On 1 Mar 2014 14:33, "William Katsak" <wkatsak@cs.rutgers.edu> wrote:
>
>     Hello,
>
>     I am doing some academic work with Cassandra 1.1.6 (I am on an older
>     version because of a bunch of implemented modifications that have been
>     in the works for a while), and I am wondering if the list can help me
>     resolve some questions I have.
>
>     I am running a cluster of 27 nodes with the following configuration:
>
>     Intel Atom (2 core) @ 1.8 GHz
>     4 GB RAM
>     250 GB HDD
>     64 GB SSD
>     Gigabit Ethernet
>
>     With this cluster size, I currently have loaded 135 GB of data
>     (replicated * 3), giving me ~15 GB of data per node. I am using Leveled
>     Compaction with a 5 MB SSTable size. The commitlog is on the HDD; data is on the SSD.
>
>     My workload is YCSB/uniform distribution/75% read-25% write.
>
>     My questions are:
>
>     - Is this a reasonable data size for this hardware?
>
>     - What should compaction throughput be set to? I am targeting a 99th
>     percentile latency SLA, and it seems that compaction throughput greatly
>     affects the 99th percentile latency. The guideline seems to be
>     16-32x the insertion rate, but this slows down the 99th percentile time
>     dramatically. In addition, there seems to be a feedback loop where
>     if you insert faster, you need more compaction, but if you have more
>     compaction, you can't insert as fast. What is best practice on this?
>
>     - What is a reasonable operation throughput to expect from this
>     configuration?
>
>     Sorry for the info dump, but I have been fighting with this for a while
>     now. I've tried to read everything I can about tuning and provisioning,
>     but continue to have an issue where I can find a load rate that hits my
>     99th percentile SLA on average, but have large latency spikes that don't
>     seem to match a pattern.
>
>     Thanks in advance for any advice you can give, even if it is just "go
>     read this document".
>
>     Sincerely,
>
>     Bill Katsak
>     Ph.D. Student
>     Rutgers University
>
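
For reference, the 16-32x guideline mentioned in the quoted message can be sketched as below. The function name and the 0.5 MB/s insertion rate are hypothetical, chosen only to illustrate the arithmetic. The result would be a candidate range for the `compaction_throughput_mb_per_sec` setting in cassandra.yaml, not a recommendation:

```python
def compaction_throughput_range(insert_mb_per_sec, low=16, high=32):
    """Return the (min, max) compaction throughput in MB/s
    implied by the 16-32x insertion-rate guideline."""
    return insert_mb_per_sec * low, insert_mb_per_sec * high


# Hypothetical per-node insertion rate, for illustration only.
insert_rate = 0.5  # MB/s

lo_mb, hi_mb = compaction_throughput_range(insert_rate)
print(f"guideline range: {lo_mb} to {hi_mb} MB/s")
```

Because the range scales linearly with the insertion rate, any increase in write load raises the compaction budget by the same factor, which is the feedback loop described in the question.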
