~70 million keys (20 bytes each using Random Partitioner) = 1.4GB of key data, plus the structures to support it, which seems a good bit smaller than the 32GB of RAM available across the 4 machines.  How many machines should it take to serve 2,000-3,000 lookups/second?
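
For reference, the sizing arithmetic can be written out as a rough sketch. It uses only figures quoted in this thread (70 million keys, 20-byte keys, a 500-byte column per key, 4 machines with 8GB each) and ignores replication, per-column overhead, index and Bloom filter structures, and JVM heap, all of which would change the real numbers.

```python
# Back-of-envelope sizes from the figures quoted in this thread.
# Ignores replication, storage overhead, and JVM heap (assumption: raw bytes only).
NUM_KEYS = 70_000_000
KEY_BYTES = 20          # key size with Random Partitioner, as stated above
COLUMN_BYTES = 500      # one 500-byte column per key
NODES = 4
RAM_PER_NODE_GB = 8

GB = 10**9

key_data_gb = NUM_KEYS * KEY_BYTES / GB        # raw key bytes
column_data_gb = NUM_KEYS * COLUMN_BYTES / GB  # raw column bytes
total_ram_gb = NODES * RAM_PER_NODE_GB

print(f"key data:    {key_data_gb:.1f} GB")
print(f"column data: {column_data_gb:.1f} GB")
print(f"cluster RAM: {total_ram_gb} GB")
```

The keys alone fit in memory easily, but the raw column data (35 GB) is already larger than the cluster's total RAM (32 GB) before any overhead is counted.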

 

From: Brandon Williams [mailto:driftx@gmail.com]
Sent: Wednesday, May 05, 2010 7:04 PM
To: user@cassandra.apache.org
Subject: Re: performance tuning - where does the slowness come from?

 

On Wed, May 5, 2010 at 6:59 PM, Mark Jones <MJones@imagehawk.com> wrote:

My data is a single row/key mapped to a 500-byte column, and I’m reading ALL random keys (worst-case read scenario).  The cache has minimal effectiveness, so the Bloom filters and indexes are getting a real workout.  I’m on 8GB Ubuntu 9.10 boxes (64-bit).  Yeah, I was griping about the performance earlier; the disk is heavily used by Cassandra, so short of going to some high-end SAS stuff, I’m not sure what to do.

 

How many keys?  If your data size is exceeding your OS's cache capacity (8GB - JVM size) then a completely random read pattern is mostly going to test how fast your disk can seek.  You can try to use faster disks, but the better solution is to add more nodes.
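
A minimal sketch of that reasoning, assuming reads that miss the OS cache are bounded purely by disk seeks. The function name and every input value here are hypothetical placeholders, not measurements from this cluster: the seeks per second a node's disks can sustain, the seeks each uncached read costs, and the cache hit ratio all have to be measured. Replication factor and consistency level are ignored.

```python
import math

def nodes_needed(target_reads_per_sec, seeks_per_sec_per_node,
                 seeks_per_read, cache_hit_ratio):
    """Estimate node count when uncached random reads are seek-bound.

    All inputs are assumptions to be replaced with measured values.
    """
    # Only reads that miss the cache reach the disk.
    disk_reads = target_reads_per_sec * (1 - cache_hit_ratio)
    # Each uncached read may cost more than one seek (index plus data).
    required_seeks = disk_reads * seeks_per_read
    return math.ceil(required_seeks / seeks_per_sec_per_node)

# Placeholder inputs: 3000 reads/s target, 100 seeks/s per node,
# 2 seeks per uncached read, and no cache hits at all.
print(nodes_needed(3000, 100, 2, 0.0))  # 60
# Same inputs, but half of the reads served from cache.
print(nodes_needed(3000, 100, 2, 0.5))  # 30
```

The estimate scales linearly with the miss rate, which is why adding nodes helps twice: each node contributes both more disk seeks and more cache.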

 

-Brandon