cassandra-user mailing list archives

From Wei Zhu <>
Subject read request distribution
Date Fri, 09 Nov 2012 05:37:22 GMT
Hi All,
I am benchmarking Cassandra on a three-node cluster with RF=3. I generated
6M rows keyed by sequence number from 1 to 6M, so the rows should be evenly distributed among
the three nodes, disregarding replicas.

The benchmark is read-only: I generate read requests for randomly generated
keys from 1 to 6M. Oddly, nodetool cfstats reports that one node has only half as many requests
as the busiest node, and the third node sits in the middle, so the ratio is roughly 2:3:4. The node
with the most read requests actually has the lowest latency, and the one with the fewest read
requests reports the highest latency. The difference is large: the fastest node is almost
twice as fast as the slowest.
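
To make the expectation concrete, here is a minimal sketch of why I would expect an even split. It is an approximation under stated assumptions, not Cassandra's actual code: it assumes a hash-based partitioner similar to RandomPartitioner (MD5 of the key onto a 2^127 ring) and three nodes with equally sized token ranges. The names `token_for`, `primary_node` and `simulate_reads` are hypothetical helpers for illustration only.

```python
import hashlib
import random
from collections import Counter

RING_SIZE = 2 ** 127   # assumed token space (RandomPartitioner-style)
NODE_COUNT = 3         # three nodes with equally sized token ranges (assumption)


def token_for(key: str) -> int:
    """Map a row key onto the ring with MD5 (an approximation of hash partitioning)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE


def primary_node(key: str) -> int:
    """Return the index (0..NODE_COUNT-1) of the node whose range contains the key."""
    return token_for(key) * NODE_COUNT // RING_SIZE


def simulate_reads(num_reads: int, max_key: int = 6_000_000, seed: int = 42) -> Counter:
    """Count how many uniformly random key reads land on each node's primary range."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(num_reads):
        key = str(rng.randint(1, max_key))
        counts[primary_node(key)] += 1
    return counts


if __name__ == "__main__":
    counts = simulate_reads(300_000)
    for node in range(NODE_COUNT):
        print(f"node {node}: {counts[node]} reads")
```

Under these assumptions each node's share comes out close to one third, which is why the 2:3:4 ratio surprises me. The sketch only models which node owns a key; it does not model replica selection with RF=3 or how the client picks a coordinator.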

All three nodes have exactly the same hardware, and the data size on each node is the
same, since RF is three and every node holds the complete data set. I am using Hector as the client,
and the random read requests number in the millions. I can't think of a reasonable explanation.
Can someone please shed some light?

