From Ben Chobot
Subject lots of extra bytes on disk
Date Thu, 28 Mar 2013 15:12:28 GMT
Some of the Cassandra nodes in my 1.1.5 cluster show a large discrepancy between what Cassandra
says the SSTables should sum to and what df and du report on disk. During repairs the gap is
almost always large, but post-repair compactions tend to bring the numbers to within
a few percent of each other... usually. Sometimes they remain much further apart after compactions
have finished. For instance, I'm looking at one node now that claims to have 205GB of SSTables
but actually has 450GB of files in that CF's data directory. There are no pending compactions,
and the most recent compaction for this CF finished just a few hours ago.

nodetool cleanup has no effect.

What could be causing these extra bytes, and how can I get rid of them? I'm OK with a few
extra GB of unexplained data, but an extra 245GB (more than all the data this node is supposed
to have!) is a little extreme.
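
One way to narrow down where the extra bytes live is to total the files under the CF's
data directory by subdirectory. What follows is only a minimal sketch, assuming Python is
available on the node. The function name usage_by_subdir and the "(top level)" label are
made up for illustration. It sums apparent file sizes rather than allocated blocks, so it
will not match du exactly.

```python
import os
import stat
from collections import defaultdict


def usage_by_subdir(root):
    """Return {bucket: bytes} for the regular files under root.

    Files directly in root land in the "(top level)" bucket; all other
    files are bucketed by their first path component below root.
    A file with several hard links is counted once, under the first
    bucket in which it is encountered.
    """
    totals = defaultdict(int)
    seen = set()  # (device, inode) pairs already counted
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        bucket = "(top level)" if rel == "." else rel.split(os.sep)[0]
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                # The file vanished mid-walk (e.g. removed by a compaction).
                continue
            if not stat.S_ISREG(st.st_mode):
                continue
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue
            seen.add(key)
            totals[bucket] += st.st_size
    return dict(totals)
```

Calling usage_by_subdir() on the CF's data directory and printing the result sorted by size
would show whether the 245GB sits next to the live SSTables or under some subdirectory.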
