incubator-cassandra-user mailing list archives

From "Hiller, Dean" <>
Subject Re: lots of extra bytes on disk
Date Thu, 28 Mar 2013 18:10:46 GMT
I am confused.  I thought you said you don't have a snapshot.  Df/du
reports space used by existing data AND the snapshot.  Cassandra only
reports on space used by actual data........if you move the snapshots, does
df/du match what cassandra says?
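One rough way to check this without moving anything is to total the bytes inside and outside the snapshot directories separately. The sketch below is only an illustration, not anything Cassandra ships: the function name `usage` is made up, and it assumes snapshots live in a subdirectory named `snapshots` under the column family's data directory. Files are counted once per inode, so a snapshot hard link to a still-live SSTable is not counted twice. It sums apparent file sizes, which can differ slightly from the block counts du reports.

```python
import os

SNAPSHOT_DIR = "snapshots"  # assumption: name of the snapshot subdirectory


def _files(root, inside_snapshots):
    """Yield file paths under root that are (or are not) inside a snapshots dir."""
    for dirpath, _dirnames, filenames in os.walk(root):
        parts = os.path.relpath(dirpath, root).split(os.sep)
        if (SNAPSHOT_DIR in parts) == inside_snapshots:
            for name in filenames:
                yield os.path.join(dirpath, name)


def usage(root):
    """Return (live_bytes, snapshot_only_bytes) for the directory tree at root.

    snapshot_only_bytes covers files that exist only under snapshots/,
    i.e. whose inode is not also linked from the live data directory.
    """
    seen = set()
    totals = []
    for inside in (False, True):  # live files first, then snapshot files
        total = 0
        for path in _files(root, inside):
            st = os.stat(path)
            key = (st.st_dev, st.st_ino)
            if key not in seen:  # count each inode once
                seen.add(key)
                total += st.st_size
        totals.append(total)
    return totals[0], totals[1]

# Example call (the path is a placeholder, not a real layout):
#   live, snap_only = usage("/path/to/data/Keyspace/Table")
```

If the first number is already far above what cassandra reports as live SSTable size, snapshots are not the explanation.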


On 3/28/13 12:05 PM, "Ben Chobot" <> wrote:

>.....though interestingly, the snapshot of these CFs has the "right"
>amount of data in them (i.e. it agrees with the live SSTable size
>reported by cassandra). Is it total insanity to remove the files from the
>data directory not included in the snapshot, so long as they were created
>before the snapshot?
>On Mar 28, 2013, at 10:54 AM, Hiller, Dean wrote:
>> Have you cleaned up your snapshots... those take extra space and don't just
>> go away unless you delete them.
>> Dean
>> On 3/28/13 11:46 AM, "Ben Chobot" <> wrote:
>>> Are you also running 1.1.5? I'm wondering (ok hoping) that this might
>>> be fixed if I upgrade.
>>> On Mar 28, 2013, at 8:53 AM, Lanny Ripple wrote:
>>>> We occasionally (twice now on a 40 node cluster over the last 6-8
>>>> months) see this.  My best guess is that Cassandra can fail to mark an
>>>> SSTable for cleanup somehow.  Forced GCs or reboots don't clear them
>>>> out.  We disable thrift and gossip; drain; snapshot; shutdown; clear
>>>> data/Keyspace/Table/*.db and restore (hard-linking back into place to
>>>> avoid data transfer) from the just created snapshot; restart.
>>>> On Mar 28, 2013, at 10:12 AM, Ben Chobot <> wrote:
>>>>> Some of my cassandra nodes in my 1.1.5 cluster show a large
>>>>> discrepancy between what cassandra says the SSTables should sum up
>>>>> and what df and du claim exist. During repairs, this is almost always
>>>>> pretty bad, but post-repair compactions tend to bring those numbers
>>>>> within a few percent of each other... usually. Sometimes they remain
>>>>> much further apart after compactions have finished - for instance,
>>>>> looking at one node now that claims to have 205GB of SSTables, but
>>>>> actually has 450GB of files living in that CF's data directory. No
>>>>> pending compactions, and the most recent compaction for this CF
>>>>> finished just a few hours ago.
>>>>> nodetool cleanup has no effect.
>>>>> What could be causing these extra bytes, and how do I get them to go
>>>>> away? I'm ok with a few extra GB of unexplained data, but an extra
>>>>> 245GB (more than all the data this node is supposed to have!) is a
>>>>> little extreme.
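To see which files account for the gap discussed in this thread, one option is to list what sits in the live data directory but has no same-named file in a fresh snapshot. The sketch below is a hypothetical helper (`extra_files` is not a Cassandra tool) and it only reports; it deletes nothing. It assumes snapshot files keep the same file names as the live SSTable components they link to. The modification time is returned so each file can be compared against when the snapshot was taken.

```python
import os


def extra_files(data_dir, snapshot_dir):
    """List regular files in data_dir that have no same-named file in snapshot_dir.

    Returns a list of (name, size_in_bytes, mtime) tuples, sorted by name.
    Subdirectories of data_dir (such as the snapshots directory) are skipped.
    """
    snap_names = set(os.listdir(snapshot_dir))
    extras = []
    for name in sorted(os.listdir(data_dir)):
        path = os.path.join(data_dir, name)
        if os.path.isfile(path) and name not in snap_names:
            st = os.stat(path)
            extras.append((name, st.st_size, st.st_mtime))
    return extras

# Example call (both paths are placeholders):
#   for name, size, mtime in extra_files("/path/to/data/Keyspace/Table",
#                                        "/path/to/data/Keyspace/Table/snapshots/tag"):
#       print(name, size, mtime)
```

Summing the sizes it returns gives a quick check on whether these files add up to the unexplained bytes.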
