On Tue, Feb 16, 2010 at 12:16 PM, Weijun Li <weijunli@gmail.com> wrote:
Thanks for the DataFileDirectory trick; I'll give it a try.

Just noticed the impact of the number of data files: node A has 13 data files with a read latency of 20ms and node B has 27 files with a read latency of 60ms. After I ran "nodeprobe compact" on node B, its read latency went up to 150ms. The read latency of node A became as low as 10ms. Is this normal behavior? I'm using the random partitioner, and the hardware/JVM settings are exactly the same for these two nodes.

It sounds like the latency jumped to 150ms because the newly written file from the compaction was not yet in the OS cache, so reads had to go to disk.

Another problem is that Java heap usage is always 900MB out of 6GB. Is there any way to utilize all of the heap space to decrease the read latency?

By default, Cassandra will use a 1GB heap, as set in bin/cassandra.in.sh, so the 900MB you see is close to that limit. You can adjust the JVM heap there via the -Xmx option, but generally you want to balance the JVM against the OS cache. With 6GB, I would probably give 2GB to the JVM. That said, if you aren't having issues now, increasing the JVM's memory probably won't provide any performance gains. It's worth noting that this may change with the row cache in 0.6.
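
For reference, the heap change would look roughly like the sketch below. It assumes your bin/cassandra.in.sh builds its options in a JVM_OPTS variable and shows only the two heap flags; the -Xms value is an illustrative assumption, and whatever other options your copy of the file sets should be left in place.

```shell
# Sketch only: heap-related flags for JVM_OPTS in bin/cassandra.in.sh.
# -Xms sets the initial heap size (128M here is an assumed value).
# -Xmx sets the maximum heap size (raised to 2G, as suggested for a 6GB box).
# Keep the other options already present in your file; they are omitted here.
JVM_OPTS="-Xms128M -Xmx2G"
```

The node has to be restarted for the new heap limit to take effect, and the remaining memory stays available to the OS cache.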