incubator-cassandra-user mailing list archives

From "Hiller, Dean" <Dean.Hil...@nrel.gov>
Subject Re: lots of extra bytes on disk
Date Thu, 28 Mar 2013 18:35:14 GMT
We had a runaway STCS like this due to our own mistakes and were not sure
how to clean it up.  We switched from STCS to LCS, and that brought the
size way back down, since STCS had left duplicate data across SSTables,
which LCS mostly avoids.  I can't help much more than that, though.

Dean
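
[For reference, on a 1.1-era cluster the strategy switch Dean describes
was a schema change; a rough sketch using cassandra-cli, with invented
keyspace and column family names (the exact syntax may differ by minor
version):

```
use MyKeyspace;
update column family MyCF
  with compaction_strategy = 'LeveledCompactionStrategy';
```
]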

On 3/28/13 12:31 PM, "Ben Chobot" <bench@instructure.com> wrote:

>Sorry to make it confusing. I didn't have snapshots on some nodes; I just
>made a snapshot on a node with this problem.
>
>So to be clear, on this one example node....
> Cassandra reports ~250GB of space used
> In a CF data directory (before snapshots existed), du -sh showed ~550GB
> After the snapshot, du in the same directory still showed ~550GB
>(they're hard links, so that's correct)
> du in the snapshot directory for that CF shows ~250GB, and ls shows
> ~50 fewer files.
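
[Ben's numbers make sense once you remember that `nodetool snapshot`
creates hard links, and that `du` counts each inode only once per run. A
quick throwaway demonstration, using temporary paths only (nothing here
is Cassandra-specific):

```shell
#!/bin/sh
# Demonstrates why du over the data directory did not grow after the
# snapshot: a hard link shares the original file's inode, and du counts
# each inode once per invocation.  All paths are throwaway examples.
set -e
dir=$(mktemp -d)
mkdir "$dir/snapshots"
# a fake 1 MiB "SSTable" (random bytes so filesystem compression
# cannot shrink it)
dd if=/dev/urandom of="$dir/cf-1-Data.db" bs=1024 count=1024 2>/dev/null
# "snapshot" it the way Cassandra does: a hard link, not a copy
ln "$dir/cf-1-Data.db" "$dir/snapshots/cf-1-Data.db"
# both names resolve to the same inode
ls -i "$dir/cf-1-Data.db" "$dir/snapshots/cf-1-Data.db"
# du over the whole tree still reports ~1 MiB, not 2 MiB
du -sk "$dir"
rm -rf "$dir"
```

So du over the whole CF directory staying at ~550GB after the snapshot,
while the snapshot subdirectory alone shows ~250GB, is exactly what hard
links predict.]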
>
>
>
>On Mar 28, 2013, at 11:10 AM, Hiller, Dean wrote:
>
>> I am confused.  I thought you said you don't have a snapshot.  Df/du
>> reports space used by existing data AND the snapshot.  Cassandra only
>> reports on space used by actual data... if you move the snapshots, does
>> df/du match what cassandra says?
>> 
>> Dean
>> 
>> On 3/28/13 12:05 PM, "Ben Chobot" <bench@instructure.com> wrote:
>> 
>>> ...though interestingly, the snapshots of these CFs have the "right"
>>> amount of data in them (i.e. they agree with the live SSTable size
>>> reported by cassandra). Is it total insanity to remove the files in
>>> the data directory not included in the snapshot, so long as they were
>>> created before the snapshot?
>>> 
>>> On Mar 28, 2013, at 10:54 AM, Hiller, Dean wrote:
>>> 
>>>> Have you cleaned up your snapshots... those take extra space and
>>>> don't just go away unless you delete them.
>>>> 
>>>> Dean
>>>> 
>>>> On 3/28/13 11:46 AM, "Ben Chobot" <bench@instructure.com> wrote:
>>>> 
>>>>> Are you also running 1.1.5? I'm wondering (ok, hoping) that this
>>>>> might be fixed if I upgrade.
>>>>> 
>>>>> On Mar 28, 2013, at 8:53 AM, Lanny Ripple wrote:
>>>>> 
>>>>>> We occasionally (twice now on a 40 node cluster over the last 6-8
>>>>>> months) see this.  My best guess is that Cassandra can fail to mark
>>>>>> an SSTable for cleanup somehow.  Forced GC's or reboots don't clear
>>>>>> them out.  We disable thrift and gossip; drain; snapshot; shutdown;
>>>>>> clear data/Keyspace/Table/*.db and restore (hard-linking back into
>>>>>> place to avoid data transfer) from the just-created snapshot;
>>>>>> restart.
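
[The clear-and-relink step Lanny describes can be sketched with plain
files. This uses throwaway temporary paths, not a real data directory;
on a live node you would disable thrift and gossip, drain, snapshot, and
stop Cassandra first, exactly as he says:

```shell
#!/bin/sh
# Sketch of the clear-and-restore step, on throwaway files.
# Assumption: the snapshot was taken first, so every live file worth
# keeping already has a hard link under snapshots/.
set -e
data=$(mktemp -d)            # stands in for data/Keyspace/Table
mkdir "$data/snapshots"
printf 'good'   > "$data/cf-1-Data.db"
printf 'orphan' > "$data/cf-2-Data.db"   # the file Cassandra "forgot"
# the snapshot hard-links only the files Cassandra still tracks
ln "$data/cf-1-Data.db" "$data/snapshots/cf-1-Data.db"
# clear data/Keyspace/Table/*.db (does not touch snapshots/) ...
rm -f "$data"/*.db
# ... and restore by hard-linking back, so no data is copied
for f in "$data"/snapshots/*.db; do
  ln "$f" "$data/$(basename "$f")"
done
ls "$data"/*.db   # only the snapshotted file remains
rm -rf "$data"
```
]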
>>>>>> 
>>>>>> 
>>>>>> On Mar 28, 2013, at 10:12 AM, Ben Chobot <bench@instructure.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> Some of my cassandra nodes in my 1.1.5 cluster show a large
>>>>>>> discrepancy between what cassandra says the SSTables should sum
>>>>>>> up to, and what df and du claim exist. During repairs, this is
>>>>>>> almost always pretty bad, but post-repair compactions tend to
>>>>>>> bring those numbers to within a few percent of each other...
>>>>>>> usually. Sometimes they remain much further apart after
>>>>>>> compactions have finished - for instance, I'm looking at one node
>>>>>>> now that claims to have 205GB of SSTables, but actually has 450GB
>>>>>>> of files living in that CF's data directory. No pending
>>>>>>> compactions, and the most recent compaction for this CF finished
>>>>>>> just a few hours ago.
>>>>>>> 
>>>>>>> nodetool cleanup has no effect.
>>>>>>> 
>>>>>>> What could be causing these extra bytes, and how to get them to
>>>>>>> go away? I'm ok with a few extra GB of unexplained data, but an
>>>>>>> extra 245GB (more than all the data this node is supposed to
>>>>>>> have!) is a little extreme.
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
>

