incubator-cassandra-user mailing list archives

From "Hiller, Dean" <>
Subject Re: lots of extra bytes on disk
Date Thu, 28 Mar 2013 18:40:21 GMT
Oh, and since our LCS was 10MB per file, it was easy to tell which files had
not converted yet.  Also, we ended up blowing away a CF on node 5 (of 6) and
running a full repair on that CF, and after that it was at a normal size
again as well.
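
A minimal sketch of that size check, in Python. The function name, the
*-Data.db file naming pattern, and the 1.5x slack factor are assumptions for
illustration; only the 10MB target comes from the setup described above. It
only lists files and changes nothing on disk.

```python
from pathlib import Path


def oversized_sstables(cf_dir, target_mb=10, slack=1.5):
    """Return (name, size_bytes) for *-Data.db files well above the LCS target.

    Assumption: data files sit directly in cf_dir and end in "-Data.db".
    Files much larger than target_mb are probably leftovers from STCS.
    """
    limit = target_mb * 1024 * 1024 * slack
    hits = []
    # Path.glob with this pattern is not recursive, so snapshots/ is skipped.
    for path in Path(cf_dir).glob("*-Data.db"):
        size = path.stat().st_size
        if size > limit:
            hits.append((path.name, size))
    # Largest files first.
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

Called as oversized_sstables("/path/to/Keyspace/CF") (a hypothetical path), it
returns the data files that have likely not been rewritten by LCS yet.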


On 3/28/13 12:35 PM, "Hiller, Dean" <> wrote:

>We had a runaway STCS like this due to our own mistakes but were not sure
>how to clean it up.  We went to LCS instead of STCS and that seemed to
>bring it way back down since the STCS had repeats and such between
>SSTables which LCS avoids mostly.  I can't help much more than that info.
>On 3/28/13 12:31 PM, "Ben Chobot" <> wrote:
>>Sorry to make it confusing. I didn't have snapshots on some nodes; I just
>>made a snapshot on a node with this problem.
>>So to be clear, on this one example node....
>> Cassandra reports ~250GB of space used
>> In a CF data directory (before snapshots existed), du -sh showed ~550GB
>> After the snapshot, du in the same directory still showed ~550GB
>>(they're hard links, so that's correct)
>> du in the snapshot directory for that CF shows ~250GB, and ls shows ~50
>>fewer files.
>>On Mar 28, 2013, at 11:10 AM, Hiller, Dean wrote:
>>> I am confused.  I thought you said you don't have a snapshot.  Df/du
>>> report space used by existing data AND the snapshot.  Cassandra only
>>> reports on space used by actual data... if you move the snapshots,
>>> do df/du match what cassandra says?
>>> Dean
>>> On 3/28/13 12:05 PM, "Ben Chobot" <> wrote:
>>>> .....though interestingly, the snapshots of these CFs have the "right"
>>>> amount of data in them (i.e. it agrees with the live SSTable size
>>>> reported by cassandra). Is it total insanity to remove the files from
>>>> the data directory that are not included in the snapshot, so long as
>>>> they were created before the snapshot?
>>>> On Mar 28, 2013, at 10:54 AM, Hiller, Dean wrote:
>>>>> Have you cleaned up your snapshots... those take extra space and don't
>>>>> go away unless you delete them.
>>>>> Dean
>>>>> On 3/28/13 11:46 AM, "Ben Chobot" <> wrote:
>>>>>> Are you also running 1.1.5? I'm wondering (ok hoping) that this will
>>>>>> be fixed if I upgrade.
>>>>>> On Mar 28, 2013, at 8:53 AM, Lanny Ripple wrote:
>>>>>>> We occasionally (twice now on a 40 node cluster over the last
>>>>>>> months) see this.  My best guess is that Cassandra can fail to
>>>>>>> mark an SSTable for cleanup somehow.  Forced GCs or reboots don't
>>>>>>> clear them out.  We disable thrift and gossip; drain; snapshot;
>>>>>>> shutdown; delete data/Keyspace/Table/*.db and restore (hard-linking
>>>>>>> back into place to avoid data transfer) from the just created
>>>>>>> snapshot; restart.
>>>>>>> On Mar 28, 2013, at 10:12 AM, Ben Chobot <>
>>>>>>> wrote:
>>>>>>>> Some of my cassandra nodes in my 1.1.5 cluster show a large
>>>>>>>> discrepancy between what cassandra says the SSTables should sum up
>>>>>>>> to, and what df and du claim exist. During repairs, this is almost
>>>>>>>> always pretty bad, but post-repair compactions tend to bring those
>>>>>>>> numbers to within a few percent of each other... usually. Sometimes
>>>>>>>> they remain much further apart after compactions have finished - for
>>>>>>>> example, I'm looking at one node now that claims to have 205GB of
>>>>>>>> SSTables, but actually has 450GB of files living in that CF's data
>>>>>>>> directory. No pending compactions, and the most recent compaction
>>>>>>>> for this CF finished just a few hours ago.
>>>>>>>> nodetool cleanup has no effect.
>>>>>>>> What could be causing these extra bytes, and how to get them to go
>>>>>>>> away? I'm ok with a few extra GB of unexplained data, but an extra
>>>>>>>> 245GB (more than all the data this node is supposed to have!) is a
>>>>>>>> little extreme.
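
The comparison discussed in the thread (files in the live CF directory that
are absent from a fresh snapshot) can be sketched in Python as below. This is
only a sketch under assumptions: the function name is made up, the snapshot is
assumed to hold hard links with the same file names as the live files, and the
two directory paths are supplied by the caller. It is read-only and deletes
nothing; the thread does not confirm that removing such files is safe.

```python
from pathlib import Path


def files_missing_from_snapshot(live_dir, snapshot_dir):
    """List regular files in live_dir whose names are absent from snapshot_dir.

    Returns (sorted_names, total_bytes). Subdirectories such as snapshots/
    are ignored because only regular files are considered.
    """
    snap_names = {p.name for p in Path(snapshot_dir).iterdir() if p.is_file()}
    extra = [
        p for p in Path(live_dir).iterdir()
        if p.is_file() and p.name not in snap_names
    ]
    total = sum(p.stat().st_size for p in extra)
    return sorted(p.name for p in extra), total
```

On a node like the one described above, the returned byte total should roughly
account for the gap between what du shows for the CF directory and what du
shows for the snapshot directory. Files written after the snapshot was taken
will also appear in the list, so the result still needs checking by hand.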
