incubator-cassandra-user mailing list archives

From Kirk True <k...@mustardgrain.com>
Subject Re: read request distribution
Date Tue, 13 Nov 2012 00:24:02 GMT
Somewhat recently the Ownership column was changed to Effective
Ownership.



Previously the formula was essentially 100/<nodes>. Now it's
100*<replication factor>/<nodes>. So in previous releases of Cassandra
it would be 100/12 = 8.33%; now it would be closer to 25% (8.33 * 3,
assuming a replication factor of three).
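
As a rough illustration of the two formulas, here is a minimal Python
sketch. The function names are made up for this example and are not
part of Cassandra or nodetool. It assumes evenly balanced tokens, and
the cap at 100% is my own assumption (a node cannot hold more than all
of the data).

```python
def old_ownership(nodes: int) -> float:
    """Old 'Ownership' figure: share of the token ring per node, in percent."""
    return 100.0 / nodes


def effective_ownership(nodes: int, replication_factor: int) -> float:
    """New 'Effective Ownership' figure: share of the data each node holds,
    replicas included, in percent.

    Assumption: tokens are evenly balanced. The min() cap is an assumption
    of this sketch, reflecting that a node cannot hold more than 100%.
    """
    return min(100.0, 100.0 * replication_factor / nodes)


if __name__ == "__main__":
    # The two clusters discussed in this thread, both with RF=3.
    for nodes, rf in [(12, 3), (3, 3)]:
        print(
            f"{nodes} nodes, RF={rf}: "
            f"old={old_ownership(nodes):.2f}% "
            f"effective={effective_ownership(nodes, rf):.2f}%"
        )
```

With these assumptions, the 12 node cluster goes from 8.33% to 25%, and
a 3 node cluster with RF=3 shows 100% on every node.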



Kirk



On Mon, Nov 12, 2012, at 03:52 PM, Ananth Gundabattula wrote:

Hi all,

On an unrelated observation about the readings below, it looks like all
3 nodes own 100% of the data. This confuses me a bit. We have a 12
node cluster with RF=3, but the effective ownership is shown as 8.33%.

So here is my question: how is the ownership calculated? Is the
replication factor considered in the ownership calculation? (If yes,
then 8.33% ownership of a cluster seems wrong to me. If not, 100%
ownership for a 3 node cluster seems wrong to me.) Am I missing
something in the calculation?

Regards,
Ananth

On Fri, Nov 9, 2012 at 4:37 PM, Wei Zhu <[1]wz1975@yahoo.com> wrote:

Hi All,
I am doing a benchmark on Cassandra. I have a three node cluster with
RF=3. I generated 6M rows with sequence numbers from 1 to 6M, so the
rows should be evenly distributed among the three nodes, disregarding
the replicas.
I am benchmarking with read-only requests: I generate read requests
for randomly generated keys from 1 to 6M. Oddly, nodetool cfstats
reports that one node has only half as many requests as another, and
the third node sits in the middle, so the ratio is roughly 2:3:4. The
node with the most read requests actually has the lowest latency, and
the one with the fewest read requests reports the highest latency. The
difference is pretty big: the slowest latency is almost double the
fastest.
All three nodes have exactly the same hardware, and the data size on
each node is the same, since the RF is three and all of them have the
complete data. I am using Hector as the client, and the random read
requests number in the millions. I can't think of a reasonable
explanation. Can someone please shed some light?

Thanks.
-Wei

References

1. mailto:wz1975@yahoo.com
