incubator-cassandra-user mailing list archives

From aaron morton <>
Subject Re: Maximum load per node
Date Thu, 07 Jun 2012 21:37:38 GMT
It's not a hard rule; you can put more data on a node. The 300GB to 400GB idea is mostly concerned
with operations. You may also want to put less on a node because of higher throughput demands.

(We are talking about the amount of data on a node, regardless of the RF). 

On the operations side the considerations are:

* If you want to move the node to a new host, moving 400GB at 35MB/sec takes about 3 to 4
 hours (this is the speed I recently got for moving 500GB on AWS in the same AZ).

* Repair will need to process all of the data. Assuming the bottleneck is not the CPU, and
there are no other background processes running, it will take about 7 hours to read 400GB at
the default 16MB/sec (compaction_throughput_mb_per_sec).

* Some throughput considerations for compaction.

* Major compaction compacts all the sstables, and assumes that it needs that much space again
to write the new file. We normally don't want to do major compactions though.

* If you are in a situation where you have lost redundancy for all or part of the key ring,
you will want to get new nodes online ASAP. Taking several hours to bring new nodes on may
not be acceptable. 

* The more data on disk, the more memory is needed. The memory is taken up by bloom filters and index
sampling. These can be tuned to reduce the memory footprint, with potential reduction in read

* Using compression helps reduce the on-disk size, and makes some things run faster. My experience
is that repair and compaction will still take a while, as they deal with the uncompressed

* Startup time for index sampling is/was an issue (it's faster in 1.1). If the node has more
memory and more disk the time to get the page cache hot will increase.

* As the amount of data per node goes up, so potentially does the working set of hot data.
If the memory per node available for the page cache remains the same, latency may
increase. e.g. 3 nodes with 800GB each have less memory for the hot set than 6 nodes
with 400GB each.
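
As a rough illustration of the arithmetic behind the points above, here is a minimal Python
sketch. The function names are made up for this example, and the 16GB of page cache per node is
a purely hypothetical figure; the 400GB, 800GB, 35MB/sec and 16MB/sec values are the ones
mentioned above. It assumes a steady throughput and 1GB = 1024MB, so treat the results as
back-of-the-envelope estimates only.

```python
def hours_to_process(data_gb: float, throughput_mb_per_sec: float) -> float:
    """Hours needed to read or stream data_gb at a steady throughput (1GB = 1024MB)."""
    return data_gb * 1024 / throughput_mb_per_sec / 3600


def major_compaction_free_gb(load_gb: float) -> float:
    """Free space a major compaction assumes it needs: as much again as the current load."""
    return load_gb


def cache_to_data_ratio(page_cache_gb: float, data_gb: float) -> float:
    """Fraction of a node's on-disk data that could be held in the page cache."""
    return page_cache_gb / data_gb


# Moving 400GB to a new host at 35MB/sec (the AWS speed mentioned above).
move_hours = hours_to_process(400, 35)

# Repair reading 400GB at the default compaction_throughput_mb_per_sec of 16.
repair_hours = hours_to_process(400, 16)

print(f"move: {move_hours:.1f} hours, repair read: {repair_hours:.1f} hours")
print(f"free space assumed for a major compaction: {major_compaction_free_gb(400)}GB")

# Hypothetical: every node has 16GB of memory available for the page cache.
PAGE_CACHE_GB = 16
for data_gb in (800, 400):
    ratio = cache_to_data_ratio(PAGE_CACHE_GB, data_gb)
    print(f"{data_gb}GB per node: {ratio:.1%} of the data fits in the page cache")
```

With the same memory per node, the 800GB nodes can cache half as large a fraction of their data
as the 400GB nodes, which is the latency concern described in the last point.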

It's just a rule of thumb to avoid getting into trouble, where trouble is often "help, something
went wrong and it takes ages to fix" or "why does X take forever" or "why does it use Y
amount of memory". If you are aware of the issues, there is essentially no upper limit on
how much data you can put on a node. 

Hope that helps. 

Aaron Morton
Freelance Developer

On 8/06/2012, at 12:59 AM, Ben Kaehne wrote:

> Does this "max load" have correlation to replication factor?
> IE a 3 node cluster with rf of 3. Should i be worried at {max load} X 3 or what people
generally mention the max load is?
> On Thu, Jun 7, 2012 at 10:55 PM, Filippo Diotalevi <> wrote:
> Hi,
> one of latest Aaron's observation about the max load per Cassandra node caught my attention
>> At ~840GB I'm probably running close
>> to the max load I should have on a node,
> [AM] roughly 300GB to 400GB is the max load
> Since we currently have a Cassandra node with roughly 330GB of data, it looks like that's
a good time for us to really understand what's that limit in our case. Also, a (maybe old)
Stackoverflow question at
, seems to suggest a higher limit per node.
> Just considering the compaction issues, what are the factors we need to account to determine
the max load? 
> * disk space
> Datastax cassandra docs state (pg 97) that a major compaction "temporarily doubles disk
space usage". Is it a safe estimate to say that the Cassandra machine needs to have roughly
the same amount of free disk space as the current load of the Cassandra node, or are there
any other factor to consider?
> * RAM
> Is the amount of RAM in the machine (or dedicated to the Cassandra node) affecting in
any way the speed/efficiency of the compaction process? 
> * Performance degradation for overloaded nodes?
> What kind of performance degradation can we expect for a Cassandra node which is "overloaded"?
(f.i. with 500GB or more of data)
> Thanks for the clarifications,
> -- 
> Filippo Diotalevi
> -- 
> -Ben
