cassandra-user mailing list archives

From Lanny Ripple <>
Subject Re: lots of extra bytes on disk
Date Thu, 28 Mar 2013 15:53:26 GMT
We occasionally see this (twice now on a 40-node cluster over the last 6-8 months). My best
guess is that Cassandra can somehow fail to mark an SSTable for cleanup. Forced GCs or reboots
don't clear them out. Our workaround: disable thrift and gossip; drain; snapshot; shut down;
clear data/Keyspace/Table/*.db; and restore from the just-created snapshot (hard-linking the
files back into place to avoid data transfer).
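
The clear-and-restore step can be sketched in Python as below. This is only an illustration under assumptions: the function name is hypothetical, the snapshot is assumed to live in a `snapshots/<tag>` directory under the table's data directory, and the node is assumed to be fully stopped first. Because the snapshot files are themselves hard links, removing the live *.db files does not free the data that the snapshot still references.

```python
import os
from pathlib import Path


def relink_from_snapshot(table_dir, snapshot_tag):
    """Replace the live *.db files in table_dir with hard links to a snapshot.

    Hypothetical helper. Assumes the snapshot is at
    table_dir/snapshots/<snapshot_tag> and that Cassandra is not running.
    """
    table_dir = Path(table_dir)
    snap_dir = table_dir / "snapshots" / snapshot_tag

    # Check the snapshot before deleting anything: iterdir() raises if the
    # directory is missing, and an empty snapshot is refused outright.
    snap_files = sorted(p for p in snap_dir.iterdir() if p.is_file())
    if not snap_files:
        raise RuntimeError(f"snapshot {snap_dir} is empty, refusing to clear data")

    # Clear only the top-level *.db files; this glob is not recursive,
    # so the snapshots directory itself is left untouched.
    removed = []
    for live in table_dir.glob("*.db"):
        live.unlink()
        removed.append(live.name)

    # Hard-link every snapshot file back into place (no data is copied).
    linked = []
    for src in snap_files:
        os.link(src, table_dir / src.name)
        linked.append(src.name)

    return sorted(removed), sorted(linked)
```

Any file that was in the data directory but not in the snapshot is gone afterwards, which is the point of the exercise, so it is worth checking the snapshot contents before running anything like this.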

On Mar 28, 2013, at 10:12 AM, Ben Chobot <> wrote:

> Some of my Cassandra nodes in my 1.1.5 cluster show a large discrepancy between what
> Cassandra says the SSTables should sum up to and what df and du claim exist. During repairs,
> this is almost always pretty bad, but post-repair compactions tend to bring those numbers
> to within a few percent of each other... usually. Sometimes they remain much further apart
> after compactions have finished. For instance, I'm looking at one node now that claims to
> have 205GB of SSTables, but actually has 450GB of files living in that CF's data directory.
> No pending compactions, and the most recent compaction for this CF finished just a few hours
> ago.
>
> nodetool cleanup has no effect.
>
> What could be causing these extra bytes, and how do I get them to go away? I'm ok with
> a few extra GB of unexplained data, but an extra 245GB (more than all the data this node is
> supposed to have!) is a little extreme.
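
To see where the bytes in a CF's data directory actually sit, a breakdown like the following may help. It is a rough sketch, not a Cassandra tool: the function name and the "live" bucket label are made up for illustration, and it sums apparent file sizes rather than allocated blocks, so its totals can differ slightly from du. Each inode is counted once, so a hard-linked copy in a subdirectory (for example a snapshots directory, if one exists) is not double-counted.

```python
import os
from pathlib import Path


def disk_usage_breakdown(cf_dir):
    """Return bytes per bucket for a column family data directory.

    Hypothetical helper. Files directly in cf_dir go into the "live" bucket;
    files below a subdirectory go into a bucket named after that top-level
    subdirectory. Each inode is counted once, in the first bucket that
    reaches it (the top level is walked first).
    """
    cf_dir = Path(cf_dir)
    seen = set()          # (device, inode) pairs already counted
    usage = {"live": 0}

    for root, _dirs, files in os.walk(cf_dir):
        parts = Path(root).relative_to(cf_dir).parts
        bucket = parts[0] if parts else "live"
        usage.setdefault(bucket, 0)
        for name in files:
            st = os.lstat(os.path.join(root, name))
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue  # hard link to a file that was already counted
            seen.add(key)
            usage[bucket] += st.st_size

    return usage
```

Comparing the "live" figure with what Cassandra reports for the CF would show whether the extra space is in the data directory itself or in files held only by a subdirectory.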
