cassandra-user mailing list archives

From: Jonathan Ellis <jbel...@gmail.com>
Subject: Re: Determining Cassandra System/Memory Requirements
Date: Mon, 03 May 2010 22:56:44 GMT
Short answer: no, there is no formula into which you can plug numbers.

Longer answer: benchmark with a subset of your data and extrapolate.
The closer the test data is to your real data, the more accurate the
extrapolation will be.  Yes, compaction is O(N) wrt the amount of data
in the system, so don't do it more than necessary (increase memtable
flush thresholds; go easy on nodetool compact).
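
To make "extrapolate" concrete, here is a minimal sketch of the
arithmetic.  The benchmark numbers and the linear-scaling assumption
are placeholder values, not measurements from any real cluster:

# Hypothetical benchmark-and-extrapolate sketch (Python).  All numbers
# below are made-up placeholders; substitute measurements from your
# own test cluster.

test_rows = 10_000_000          # rows loaded in the benchmark
test_disk_gb = 25.0             # disk used once the load settles
test_compaction_secs = 600      # time for a major compaction at this size

target_rows = 1_000_000_000     # production target

scale = target_rows / test_rows

# Disk usage tends to scale roughly linearly with row count.
est_disk_gb = test_disk_gb * scale

# Compaction is O(N) in the data it touches, so a linear extrapolation
# gives a floor; real clusters see extra pressure from concurrent load.
est_compaction_secs = test_compaction_secs * scale

print(f"estimated disk:       {est_disk_gb:,.0f} GB")
print(f"estimated compaction: {est_compaction_secs / 3600:,.1f} hours")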

On Mon, May 3, 2010 at 4:34 PM, Jon Graham <sjcloud22@gmail.com> wrote:
> Hello Everyone,
>
> Is there a practical formula for determining Cassandra system requirements
> using OrderPreservingPartitioner?
>
> We have hundreds of millions of rows in a single column family with a
> potential target of maybe a billion rows.
>
> How can we estimate the Cassandra system requirements given factors such as:
>
> N=number of nodes
> M=memory allocated for Cassandra
> R=replication factor
> K=key size
> D=individual column data size
> CR=columns/row
> NR=number of rows (keys) in column family
>
> It seems like the compaction process gets more stressed as we add more
> data, but I have no idea how close we are to a breaking point.
>
> Thanks,
> Jon
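
For the disk portion of the variables quoted above, you can at least
compute a naive payload floor.  This deliberately ignores SSTable index
and bloom filter overhead, column timestamps, and the transient space a
major compaction needs, so treat it as a lower bound, not a sizing
formula; all input values are examples:

# Naive per-node disk floor (Python), using the variables from the
# question.  Real usage will be noticeably higher than this.

N = 10                    # number of nodes
R = 3                     # replication factor
K = 32                    # key size, bytes
D = 200                   # individual column data size, bytes
CR = 50                   # columns per row
NR = 1_000_000_000        # rows (keys) in the column family

row_bytes = K + CR * D             # one copy of one row, payload only
raw_bytes = NR * row_bytes * R     # all replicas across the cluster
per_node_gb = raw_bytes / N / 1e9

print(f"per-node payload floor: {per_node_gb:,.1f} GB")

# Rule of thumb: keep disks well under half full, so a major compaction
# has room to rewrite its largest column family.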



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
