cassandra-user mailing list archives

From Piavlo <>
Subject not even number of keys per CFs in fully balanced cluster with random partitioner
Date Tue, 29 Oct 2013 23:09:30 GMT

We have a 12-node cluster, still stuck on 1.0.8.
All nodes in the cluster ring are balanced.
It uses the random partitioner.
All CFs use compression.
Data size on the nodes varies from 40G to 75G.
This variance is not due to the bigger nodes having more uncompacted 
sstables than the others.
Most of the biggest CFs have exactly the same row keys and just store 
different data, so data for the same key should end up on the same node 
for these CFs.
The key estimate for each of these biggest CFs on the nodes with the 
larger data size is almost twice the key estimate on the nodes with the 
smallest data size, and is thus proportional to the data size on the node. 
These CFs have about 50-100 million rows per node.

I can't understand how it's statistically possible that, with the random 
partitioner, some nodes have twice as many keys as others when there are 
50-100 million keys per node.
Any ideas how this is possible?
Is there anything else I can check?
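
As a rough sanity check on the statistics, here is a minimal Python sketch of what a uniform hash should give. It is only an illustration under stated assumptions: keys are hashed with MD5 and mapped onto 12 equal token ranges, which is a simplified stand-in for the random partitioner and ignores replication. The key names and helper functions are hypothetical.

```python
import hashlib
import math

NODES = 12        # cluster size from the post
RING = 2 ** 127   # assumed token space for this simplified model

def node_for_key(key: bytes, nodes: int = NODES) -> int:
    """Map a key to one of `nodes` equal token ranges via MD5 (simplified)."""
    token = int.from_bytes(hashlib.md5(key).digest(), "big") % RING
    return token * nodes // RING

def simulate(total_keys: int, nodes: int = NODES) -> list:
    """Count how many synthetic keys land in each equal token range."""
    counts = [0] * nodes
    for i in range(total_keys):
        counts[node_for_key(b"key-%d" % i, nodes)] += 1
    return counts

def expected_relative_spread(keys_per_node: float, nodes: int = NODES) -> float:
    """Relative standard deviation of one node's key count (binomial model)."""
    total = keys_per_node * nodes
    p = 1.0 / nodes
    return math.sqrt(total * p * (1 - p)) / (total * p)

counts = simulate(120_000)
print("min/max keys per node:", min(counts), max(counts))
print("relative spread at 50M keys/node:", expected_relative_spread(50_000_000))
```

Under this model the relative spread at 50 million keys per node is far below one percent, so a factor of two between nodes cannot be explained by hash randomness alone.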

