cassandra-user mailing list archives

From Brian Fleming
Subject Node repair : excessive data
Date Mon, 12 Dec 2011 21:47:53 GMT

We simulated a node 'failure' on one of our nodes by deleting the entire
Cassandra installation directory & reconfiguring a fresh instance with the
same token.  When we issued a 'repair' it started streaming data back onto
the node as expected.

However, after the repair completed, the node held over 2.5 times its
original load.  Issuing a 'cleanup' reduced this to about 1.5 times the
original load.  'cfstats' showed an increase in the number of keys, which
presumably accounts for the increased load.

Would anybody know why the repair pulled in more keys than the node
originally held with the same token?  How can we avoid this recurring?

If we didn't have sufficient headroom on the disk to handle, say, 3 times the
load, we could be in a difficult situation should we experience a genuine
failure.

(we're using Cassandra 1.0.5, 12 nodes split across 2 data centres, total
cluster load during testing was about 150GB)
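
To put rough numbers on the headroom concern, here is a minimal sketch in
Python.  It assumes the 150GB is spread evenly across the 12 nodes, which is
only an approximation since real per-node load depends on token balance and
replication.  The function names and the 30GB/40GB disk sizes are hypothetical,
chosen purely for illustration.

```python
def projected_load_gb(original_gb: float, multiplier: float) -> float:
    """Return the on-disk load after it grows by the given multiplier."""
    return original_gb * multiplier


def has_headroom(disk_capacity_gb: float, original_gb: float, multiplier: float) -> bool:
    """True if the disk can hold the load once it has grown by the multiplier."""
    return projected_load_gb(original_gb, multiplier) <= disk_capacity_gb


CLUSTER_LOAD_GB = 150.0  # total cluster load reported during testing
NODE_COUNT = 12          # nodes across both data centres

# Assumption: load is evenly distributed across the nodes.
per_node_gb = CLUSTER_LOAD_GB / NODE_COUNT

# Multipliers observed in the test, plus the 3x worst case mentioned above.
scenarios = [
    ("after repair", 2.5),
    ("after cleanup", 1.5),
    ("worst case", 3.0),
]

for label, multiplier in scenarios:
    load = projected_load_gb(per_node_gb, multiplier)
    print(f"{label}: {load:.2f} GB per node")

# Hypothetical disk sizes, to show where the 3x case stops fitting.
for capacity_gb in (30.0, 40.0):
    fits = has_headroom(capacity_gb, per_node_gb, 3.0)
    print(f"{capacity_gb:.0f} GB disk holds 3x load: {fits}")
```

Under that even-distribution assumption, each node starts at 12.5GB, so the
3x case needs 37.5GB free for data alone, before any compaction overhead.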

Many thanks,

