cassandra-user mailing list archives

From Giorgos Margaritis <>
Subject get_range_slices() latency
Date Sat, 09 Jun 2012 15:00:28 GMT
Hi all,

I'm using ycsb to test Cassandra's performance on key range gets. I have
ycsb on one node and the latest Cassandra server on another node. Using one
thread, I insert 10GB of uniformly random keys into Cassandra using ycsb,
while also performing range gets (get_range_slices): every 1000 puts, I
perform 1 range get. Keys are 100 bytes, values 1KB. Each range get
retrieves a random number of entries between 1 and 100. The Cassandra node
has 3GB RAM and one 7500RPM SATA disk. I use the default configuration for
Cassandra.
I have also repeated the experiment above with 10 threads instead of one.
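
For a sense of scale, the workload above can be estimated with a rough sketch like the one below. It assumes "10GB" means 10 GiB and "1KB" means 1024 bytes, and it ignores Cassandra's per-row and per-column storage overhead, so the numbers are approximations only.

```python
# Back-of-the-envelope size of the workload described above.
KEY_BYTES = 100
VALUE_BYTES = 1024            # assumption: "1KB" read as 1024 bytes
TOTAL_BYTES = 10 * 2**30      # assumption: "10GB" read as 10 GiB
PUTS_PER_RANGE_GET = 1000     # one range get every 1000 puts

# Storage overhead is ignored, so this slightly overestimates the record count.
records = TOTAL_BYTES // (KEY_BYTES + VALUE_BYTES)
range_gets = records // PUTS_PER_RANGE_GET
mean_entries_per_get = (1 + 100) / 2   # uniform between 1 and 100 entries

print("records inserted:   ", records)
print("range gets issued:  ", range_gets)
print("mean entries per get:", mean_entries_per_get)
```

So the run issues on the order of ten million puts and roughly ten thousand range gets, each returning about 50 entries on average.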

I measure the time needed for each get_range_slices() both in Cassandra
and in ycsb. I was surprised by the extremely low latencies I got, and I'm not
sure I understand why (or if they are correct). E.g. I see 5ms latencies,
even lower than disk seek latencies, when I know that since there are
multiple files on disk, Cassandra should check all of them to satisfy a
get_range_slices() call (bloom filters are of no use for range queries).
Since the node has 3GB RAM and I insert 10GB of data, the data cannot all
be cached in memory with the calls satisfied from there.
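
To make the expectation concrete, here is a rough sketch of the latency a range get would have if it really needed one uncached random disk access per file. The 7500RPM figure is the one stated above; the 8.5ms average seek time is an assumed typical value for a desktop SATA drive, not a measurement, and the function names are made up for illustration.

```python
def rotational_latency_ms(rpm):
    """Average rotational latency in ms: the time for half a revolution."""
    return 0.5 * 60_000.0 / rpm


def cold_read_ms(files_touched, rpm, avg_seek_ms):
    """Rough cost of a read needing one uncached random access per file."""
    return files_touched * (avg_seek_ms + rotational_latency_ms(rpm))


RPM = 7500          # as stated in this message
AVG_SEEK_MS = 8.5   # assumption: typical average seek time, not measured

for files in (1, 2, 4):
    estimate = cold_read_ms(files, RPM, AVG_SEEK_MS)
    print(f"{files} file(s): about {estimate:.1f} ms")
```

Under these assumptions even a single uncached access costs more than 5ms, which is why the observed latencies look too low for reads that actually go to disk.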

So, since there are multiple files on disk (I don't know whether
leveldb-style compaction is the default or not, but in either case more than
one disk file should be checked for each range get), and since, at least
after inserting 3-4GB, each get_range_slices() must touch disk, and must
touch more than one file, is it possible for a get_range_slices() to be
satisfied in
Am I missing something?

