incubator-cassandra-user mailing list archives

From aaron morton
Subject Re: Compaction doubles disk space
Date Wed, 30 Mar 2011 10:39:46 GMT
I checked the code again and had it a bit wrong. When getting a path to flush a memtable (or to
write an incoming stream) via cfs.getFlushPath(), the code does not invoke GC if there is not
enough space.

One reason for not doing this could be that when we do it during compaction we wait for 20
seconds before checking disk space again. However the write happens on a separate flusher
[...] created to ask if we can/should reclaim
space during flush.
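
For illustration only, here is a minimal Java sketch of the "invoke GC, wait, then check disk space again" idea discussed above. It is not Cassandra's code: FlushSpaceSketch and hasSpaceAfterGc are hypothetical names, the wait is a parameter (the 20 seconds mentioned above is used only in the example main), and it assumes that obsolete files are removed as a side effect of garbage collection, as described in this thread. Note that System.gc() is only a request to the JVM.

```java
import java.io.File;

/** Hypothetical helper for illustration; not part of Cassandra. */
final class FlushSpaceSketch {

    /**
     * Returns true if dir has at least requiredBytes usable. If it does not,
     * requests a GC, waits waitMillis, and checks once more.
     */
    static boolean hasSpaceAfterGc(File dir, long requiredBytes, long waitMillis) {
        if (dir.getUsableSpace() >= requiredBytes) {
            return true;
        }
        // Assumption: files that are no longer referenced get deleted when
        // the JVM collects garbage. System.gc() is a hint, not a guarantee.
        System.gc();
        try {
            Thread.sleep(waitMillis);
        } catch (InterruptedException e) {
            // Preserve the interrupt status and fall through to the re-check.
            Thread.currentThread().interrupt();
        }
        return dir.getUsableSpace() >= requiredBytes;
    }

    public static void main(String[] args) {
        // The directory and the 64 MB requirement are placeholder values.
        File dataDir = new File(args.length > 0 ? args[0] : ".");
        long required = 64L * 1024 * 1024;
        System.out.println(hasSpaceAfterGc(dataDir, required, 20_000L));
    }
}
```

Blocking for the full wait is the cost of this approach, which is why the thread the check runs on matters.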

Karl, what version are you using, and have you altered the compaction thresholds?


On 30 Mar 2011, at 19:46, Karl Hiramoto wrote:

> On 30/03/2011 09:08, aaron morton wrote:
>> Also as far as I understand we cannot immediately delete files because other operations
>> (including repair) may be using them. The data in the pre-compacted files is just as correct
>> as the data in the compacted file, it's just more compact. So the easiest thing to do is let
>> the JVM sort out if anything else is using them.
>> Perhaps it could be improved by actively tracking which files are in use so they
>> may be deleted quicker. But right now, so long as unused space is freed when needed, it's working
>> as designed AFAIK.
> I've run out of space on multiple occasions, and we have Nagios alarms going off frequently
> when disk usage is over 90%. I check Cassandra and the data/ directory is 2X to 4X bigger
> than it needs to be, and no compaction or repair is currently running. When I restart the Cassandra
> process or force a GC, it deletes a lot of old SSTables and the data/ directory goes down
> to 1/2 to 1/4 of the size it was a few minutes ago.
> Under lots of disk pressure here.
> --
> Karl
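
As a hedged illustration of "force a GC" without restarting the process, the sketch below requests a collection through the standard JVM Memory MXBean over JMX. ForceGcSketch and forceGc are hypothetical names, the host and port are supplied by the caller (no default is assumed here), and it presumes the target JVM exposes an unauthenticated JMX endpoint. Whether old SSTables are then removed depends on the behaviour described in this thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

/** Hypothetical helper for illustration; not part of Cassandra. */
final class ForceGcSketch {

    /** Requests a GC on the JVM behind conn and returns its used heap bytes afterwards. */
    static long forceGc(MBeanServerConnection conn) {
        try {
            MemoryMXBean memory = ManagementFactory.newPlatformMXBeanProxy(
                    conn, ManagementFactory.MEMORY_MXBEAN_NAME, MemoryMXBean.class);
            memory.gc(); // same effect as System.gc() inside the target JVM
            return memory.getHeapMemoryUsage().getUsed();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: ForceGcSketch <host> <jmx-port>");
            return;
        }
        // Standard RMI-based JMX service URL built from the given host and port.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + args[0] + ":" + args[1] + "/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            long used = forceGc(connector.getMBeanServerConnection());
            System.out.println("Heap used after GC request: " + used + " bytes");
        }
    }
}
```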
