Does this "max load" correlate with the replication factor?

I.e., for a 3-node cluster with an RF of 3, should I be worried at {max load} x 3, or at the max load figure people generally mention?
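
To make the arithmetic behind this question concrete, here is a minimal sketch. It assumes replicas are spread evenly across nodes and that "load" means the on-disk data a single node holds, replicas included. The function name and parameters are hypothetical and are not part of any Cassandra tool.

```python
def per_node_load_gb(unique_data_gb: float, replication_factor: int, node_count: int) -> float:
    """Average on-disk load per node, assuming replicas are evenly distributed."""
    if node_count <= 0 or replication_factor <= 0:
        raise ValueError("node_count and replication_factor must be positive")
    if replication_factor > node_count:
        raise ValueError("replication_factor cannot exceed node_count")
    # Each unique byte is stored replication_factor times across the cluster.
    return unique_data_gb * replication_factor / node_count


# 3-node cluster with RF 3: every node holds a full copy of the data.
full_copy_load = per_node_load_gb(unique_data_gb=330, replication_factor=3, node_count=3)

# Same RF on 6 nodes: each node holds half of the unique data.
spread_load = per_node_load_gb(unique_data_gb=300, replication_factor=3, node_count=6)
```

Under these assumptions the replication factor is already included in each node's load, so the per-node figure is the one to compare against whatever limit applies.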

On Thu, Jun 7, 2012 at 10:55 PM, Filippo Diotalevi <filippo@ntoklo.com> wrote:
Hi,
one of Aaron's latest observations about the max load per Cassandra node caught my attention:
At ~840GB I'm probably running close
to the max load I should have on a node,
[AM] roughly 300GB to 400GB is the max load
Since we currently have a Cassandra node with roughly 330GB of data, this looks like a good time for us to really understand what that limit is in our case. Also, a (maybe old) Stack Overflow question at http://stackoverflow.com/questions/4775388/how-much-data-per-node-in-cassandra-cluster seems to suggest a higher limit per node.

Just considering the compaction issues, what are the factors we need to account for to determine the max load?

* disk space
The DataStax Cassandra docs state (pg 97) that a major compaction "temporarily doubles disk space usage". Is it safe to estimate that the Cassandra machine needs roughly as much free disk space as the node's current load, or are there other factors to consider?

* RAM
Does the amount of RAM in the machine (or dedicated to the Cassandra node) affect the speed/efficiency of the compaction process in any way?

* Performance degradation for overloaded nodes?
What kind of performance degradation can we expect for a Cassandra node that is "overloaded" (for instance, with 500GB or more of data)?
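
As a rough illustration of the disk space estimate above, here is a minimal sketch. It takes the "temporarily doubles disk space usage" statement at face value, so a major compaction needs free space equal to the node's current load. The function name, parameters, and the 1000GB/500GB disk sizes are hypothetical, and `compaction_overhead` is an assumed knob rather than a Cassandra setting.

```python
def has_compaction_headroom(node_load_gb: float,
                            disk_capacity_gb: float,
                            compaction_overhead: float = 1.0) -> bool:
    """Return True if free disk space covers a major compaction.

    compaction_overhead=1.0 encodes the assumption that compaction
    temporarily needs as much extra space as the current load.
    """
    free_gb = disk_capacity_gb - node_load_gb
    required_gb = node_load_gb * compaction_overhead
    return free_gb >= required_gb


# A 330GB node on a 1000GB disk has 670GB free, more than the 330GB needed.
roomy = has_compaction_headroom(node_load_gb=330, disk_capacity_gb=1000)

# The same node on a 500GB disk has only 170GB free, which is not enough.
tight = has_compaction_headroom(node_load_gb=330, disk_capacity_gb=500)
```

This only covers the disk space question. It says nothing about RAM or about how performance degrades on larger nodes.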


Thanks for the clarifications,
-- 
Filippo Diotalevi





--
-Ben