incubator-cassandra-user mailing list archives

From aaron morton <>
Subject Re: nodetool repair uses insane amount of disk space
Date Thu, 16 Aug 2012 22:57:57 GMT
What version are you using? There were issues with repair using lots-o-space in 0.8.X; it's fixed
in 1.X.


Aaron Morton
Freelance Developer

On 17/08/2012, at 2:56 AM, Michael Morris <> wrote:

> Occasionally as I'm doing my regular anti-entropy repair I end up with a node that uses an exceptional amount of disk space (node should have about 5-6 GB of data on it, but ends up with 25+GB, and consumes the limited amount of disk space I have available).
> How come a node would consume 5x its normal data size during the repair process?
> My setup is kind of strange in that it's only about 80-100GB of data on a 35 node cluster, with 2 data centers and 3 racks, however the rack assignments are unbalanced.  One data center has 8 nodes, and the other data center is split into 2 racks with one rack of 9 nodes, and the other with 18 nodes.  However, within each rack, the tokens are distributed equally. It's a long sad story about how we ended up this way, but it basically boils down to having to utilize existing resources to resolve a production issue.
> Additionally, the repair process takes (what I feel is) an extremely long time to complete (36+ hours), and it always seems that nodes are streaming data to each other, even on back-to-back executions of the repair.
> Any help on these issues is appreciated.
> - Mike
