incubator-cassandra-user mailing list archives

From "Hiller, Dean" <Dean.Hil...@nrel.gov>
Subject Re: lots of extra bytes on disk
Date Thu, 28 Mar 2013 18:10:46 GMT
I am confused.  I thought you said you don't have a snapshot.  df/du
reports the space used by existing data AND the snapshots; Cassandra only
reports the space used by actual live data.  If you move the snapshots
aside, does df/du match what cassandra says?

Dean
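
Dean's point can be seen with a small filesystem experiment (the paths below are invented for illustration, not a real Cassandra layout): a snapshot is a directory of hard links, so deleting a live SSTable does not free its bytes while a snapshot still references the inode, and df/du keep charging for them.

```shell
#!/bin/sh
# Sketch: snapshots are hard links, so removing the live file does not
# release the space while the snapshot copy still points at the inode.
# All paths here are made up; adjust to your own data directory.
set -e
tmp=$(mktemp -d)
mkdir -p "$tmp/cf/snapshots/s1"
dd if=/dev/zero of="$tmp/cf/live.db" bs=1024 count=1024 2>/dev/null
ln "$tmp/cf/live.db" "$tmp/cf/snapshots/s1/live.db"   # snapshot = hard link
before=$(du -sk "$tmp/cf" | awk '{print $1}')         # inode counted once
rm "$tmp/cf/live.db"                                  # "compact away" live file
after=$(du -sk "$tmp/cf" | awk '{print $1}')          # bytes still pinned
echo "before=${before}K after=${after}K"
```

On most filesystems both numbers come out near 1024K: the bytes stay charged until the snapshot itself is deleted, which is why moving the snapshots aside is the quickest way to see whether they explain the df/du gap.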

On 3/28/13 12:05 PM, "Ben Chobot" <bench@instructure.com> wrote:

>.....though interestingly, the snapshots of these CFs have the "right"
>amount of data in them (i.e. they agree with the live SSTable size
>reported by cassandra). Is it total insanity to remove the files in the
>data directory that aren't included in the snapshot, so long as they were
>created before the snapshot?
>
>On Mar 28, 2013, at 10:54 AM, Hiller, Dean wrote:
>
>> Have you cleaned up your snapshots... those take extra space and don't
>> just go away unless you delete them.
>> 
>> Dean
>> 
>> On 3/28/13 11:46 AM, "Ben Chobot" <bench@instructure.com> wrote:
>> 
>>> Are you also running 1.1.5? I'm wondering (ok, hoping) that this might
>>> be fixed if I upgrade.
>>> 
>>> On Mar 28, 2013, at 8:53 AM, Lanny Ripple wrote:
>>> 
>>>> We occasionally (twice now on a 40 node cluster over the last 6-8
>>>> months) see this.  My best guess is that Cassandra can fail to mark an
>>>> SSTable for cleanup somehow.  Forced GC's or reboots don't clear them
>>>> out.  We disable thrift and gossip; drain; snapshot; shut down; clear
>>>> data/Keyspace/Table/*.db; restore from the just-created snapshot
>>>> (hard-linking back into place to avoid data transfer); restart.
>>>> 
>>>> 
>>>> On Mar 28, 2013, at 10:12 AM, Ben Chobot <bench@instructure.com> wrote:
>>>> 
>>>>> Some of my cassandra nodes in my 1.1.5 cluster show a large
>>>>> discrepancy between what cassandra says the SSTables should sum up to,
>>>>> and what df and du claim exist. During repairs, this is almost always
>>>>> pretty bad, but post-repair compactions tend to bring those numbers to
>>>>> within a few percent of each other... usually. Sometimes they remain
>>>>> much further apart after compactions have finished - for instance, I'm
>>>>> looking at one node now that claims to have 205GB of SSTables, but
>>>>> actually has 450GB of files living in that CF's data directory. No
>>>>> pending compactions, and the most recent compaction for this CF
>>>>> finished just a few hours ago.
>>>>> 
>>>>> nodetool cleanup has no effect.
>>>>> 
>>>>> What could be causing these extra bytes, and how to get them to go
>>>>> away? I'm ok with a few extra GB of unexplained data, but an extra
>>>>> 245GB (more than all the data this node is supposed to have!) is a
>>>>> little extreme.
>>>> 
>>> 
>> 
>
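
For reference, the rebuild cycle Lanny describes above can be written out as a dry-run shell sketch. Everything concrete here is an assumption: the keyspace/table names, the data path, and the service commands are invented, and `run` only prints each step instead of executing it, so the sequence can be reviewed before anyone tries it against a live node.

```shell
#!/bin/sh
# Dry-run sketch of the snapshot / clear / hard-link-restore cycle.
# Names and paths are hypothetical; "run" echoes instead of executing.
KS=MyKeyspace; CF=MyCF
DATA=/var/lib/cassandra/data/$KS/$CF
SNAP=pre_rebuild
run() { echo "+ $*"; }

run nodetool disablethrift             # stop client traffic
run nodetool disablegossip             # drop out of the ring
run nodetool drain                     # flush memtables, stop writes
run nodetool snapshot -t "$SNAP" "$KS" # hard-link snapshot of live sstables
run service cassandra stop
run rm -f "$DATA"/*.db                 # clear the (possibly orphaned) sstables
run ln "$DATA/snapshots/$SNAP/"*.db "$DATA/"  # hard-link back: no data copied
run service cassandra start
```

Because the snapshot is itself a set of hard links, the restore step moves no bytes; only SSTables the snapshot did not reference end up freed.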

