incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Sudden increase in diskspace usage
Date Tue, 14 May 2013 04:50:37 GMT
> Let's say we're seeing some bug in C*, and SSTables don't get deleted during compaction
> (which I guess is the only reason for this consumption of disk space).

Just out of interest, can you check the number of SSTables reported by nodetool cfstats for
a CF against the number of *-Data.db files in the appropriate directory on disk?
Another test is to take a snapshot and see if there are files in the live directory that are
not in the snapshot directory.

Either of these techniques may identify SSTables on disk that the server is not tracking.
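
Something like the following would do it. This is just a sketch: 'MyKeyspace' and 'mycf' are
placeholders, and the paths assume the default 1.2-style data directory layout, so adjust for
your setup.

  # SSTable count the server is tracking (look under the right
  # Keyspace / Column Family block in the output)
  nodetool cfstats | grep -E 'Column Family:|SSTable count:'

  # *-Data.db files actually present on disk for that CF
  ls /var/lib/cassandra/data/MyKeyspace/mycf/*-Data.db | wc -l

  # take a snapshot, then look for live data files that are not in it
  nodetool snapshot -t spacecheck MyKeyspace
  diff <(cd /var/lib/cassandra/data/MyKeyspace/mycf && ls *-Data.db) \
       <(cd /var/lib/cassandra/data/MyKeyspace/mycf/snapshots/spacecheck && ls *-Data.db)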


Cheers
 
-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 10/05/2013, at 8:33 PM, Nicolai Gylling <ng@issuu.com> wrote:

>> On Wed, May 8, 2013 at 10:43 PM, Nicolai Gylling <ng@issuu.com> wrote:
>>> At the time of normal operation there was 800 GB of free space on each node.
>>> After the crash, C* started using a lot more, resulting in an
>>> out-of-disk-space situation on 2 nodes, e.g. C* used up the 800 GB in just 2
>>> days, giving us very little time to do anything about it, since
>>> repairs/joins take a considerable amount of time.
>> 
>> Did someone do a repair? Repair very frequently results in (usually
>> temporary) >2x disk consumption.
>> 
> Repairs run regularly once a week, and normally don't take up much space, as
> we're using Leveled Compaction Strategy.
> 
> 
>>> What can make C* suddenly use this amount of disk-space? We did see a lot of
>>> pending compactions on one node (7k).
>> 
>> Mostly repair.
>> 
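As an aside, the per-node compaction backlog is easy to keep an eye on:

  # pending tasks plus any compactions currently running on the node
  nodetool -h <host> compactionstats
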
>>> Any tips on recovering from an out-of-disk-space situation on multiple
>>> nodes? I've tried moving some SSTables away, but C* seems to use
>>> whatever space I free up in no time. I'm not sure if any of the nodes is
>>> fully updated, as 'nodetool status' reports 3 different loads.
>> 
>> A relevant note here is that moving SSTables out of the full partition
>> while Cassandra is running will not result in any space recovery,
>> because Cassandra still has an open filehandle to that SSTable. In
>> order to deal with an out-of-disk-space condition you need to stop
>> Cassandra. Unfortunately the JVM stops responding to clean shutdown
>> requests when the disk is full, so you will have to kill -KILL the
>> process.
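
For that last step, a minimal sketch (assuming Cassandra runs under the 'cassandra' user
with the standard CassandraDaemon main class):

  # a clean stop ("nodetool drain", service stop) tends to hang once the
  # disk is full, so find the JVM and force-kill it
  pgrep -u cassandra -f CassandraDaemon
  kill -KILL <pid>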
>> 
>> If you have a lot of overwrites/fragmentation, you could attempt to
>> clear enough space to do a major compaction of remaining data, do that
>> major compaction, split your One Huge sstable with the (experimental)
>> sstable_split tool and then copy temporarily moved sstables back onto
>> the node. You could also attempt to use user defined compaction (via
>> JMX endpoint) to strategically compact such data. If you grep for
>> compaction in your logs, do you see compactions resulting in smaller
>> output file sizes? (compacted to X% of original messages)
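
One way to answer that last question from the logs (the path and message format here assume
a stock 1.2 install, so adjust as needed):

  # compaction completion lines report "... X bytes to Y (~Z% of original) ...";
  # small percentages mean compaction is actually reclaiming space
  grep 'of original' /var/log/cassandra/system.log | tail -20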
>> 
>> I agree with Alexis Rodriguez that Cassandra 1.2.0 is not a version
>> anyone should run; it contains significant bugs.
>> 
>> =Rob
> 
> We're storing time-series data, so we don't have any overwrites and hardly any reduction
> in size during compaction. I'll try to upgrade and see if that can help get some disk
> space back.
> 
> Let's say we're seeing some bug in C*, and SSTables don't get deleted during compaction
> (which I guess is the only reason for this consumption of disk space). Will C* 1.2.4 be
> able to fix this? Or would it be a better solution to replace one node at a time, so
> we're sure to only have the data that C* knows about?
> 
> 

