On Wed, Feb 3, 2010 at 4:19 AM, envio user <enviouser@gmail.com> wrote:
> After this I tried with 1 million keys:
>
> /home/sun>python stress.py -n 1000000 -t 100 -c 25 -r -o read -i 10
> WARNING: multiprocessing not present, threading will be used.
>        Benchmark may not be accurate!
> total,interval_op_rate,avg_latency,elapsed_time
> .......................
> .......................
> 87916,76,1.30730833113,1240
> 88665,74,1.33158908508,1250
> 89405,74,1.35333179654,1260
> 90086,68,1.45503252228,1270
> 90745,65,1.51978417774,1280
> 91476,73,1.38719448671,1290
> 92226,75,1.3288515962,1300
> 92976,75,1.33220300897,1310
> 93770,79,1.26187492288,1320
> 94557,78,1.26394684554,1330

1M rows means you've stored 600M columns, which is around 32G of data after compaction.  With 8G of RAM, and at least 1G of that going to Cassandra's JVM, the OS has roughly 7G left for disk cache, which covers only about a fifth of the 32G dataset, so nearly 80% of your random reads will have to do a disk seek.  You can improve this with either more memory per node or more nodes, but it's worth noting that ~1875 columns/sec (roughly 75 reads/sec at 25 columns per read) isn't bad for this situation, and you could probably read all 600 columns per row at nearly the same speed.
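
If you want to sanity-check that arithmetic, here's a quick Python sketch.  The ~7G of page cache, the uniform-random read pattern, and the ~75 reads/sec figure are assumptions pulled from the numbers above, not measurements:

data_gb = 32.0           # dataset size after compaction
cache_gb = 8.0 - 1.0     # RAM minus (at least) 1G for Cassandra's JVM
# Assumption: keys are read uniformly at random, so the chance a read
# misses the page cache is roughly the uncached fraction of the data.
miss = 1.0 - cache_gb / data_gb
print("expected seek fraction: ~%d%%" % round(miss * 100))   # ~78%

reads_per_sec = 75       # approximate interval_op_rate from the output above
cols_per_read = 25       # the -c 25 flag
print("throughput: ~%d columns/sec" % (reads_per_sec * cols_per_read))  # 1875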

-Brandon