From Jonathan Ellis <>
Subject Re: cleaning house
Date Tue, 20 Apr 2010 17:51:00 GMT
SSTables that are obsoleted by a compaction are deleted asynchronously
when the JVM performs a GC.  You can force a GC from jconsole, but this
should not be necessary; Cassandra will force one itself if it detects
that it is low on disk space.  A compaction marker is also
added to obsolete sstables so they can be deleted on startup if the
server does not perform a GC before being restarted.
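
To make the startup cleanup concrete, here is a minimal sketch in Java. It is
not Cassandra's actual code. It assumes the naming seen in the listing quoted
below (a zero-length "<prefix>-Compacted" file next to "<prefix>-Data.db",
"<prefix>-Index.db", and so on); the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Cassandra: removes sstable files that
// carry a "-Compacted" marker, as a startup scan might do.
public class CompactedCleanup {
    static final String MARKER = "-Compacted";

    /** Deletes all files belonging to marked sstables; returns the deleted names, sorted. */
    public static List<String> deleteObsolete(Path dataDir) throws IOException {
        // Snapshot the directory first so we do not delete while iterating.
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dataDir)) {
            for (Path p : stream) {
                files.add(p);
            }
        }
        List<String> deleted = new ArrayList<>();
        for (Path marker : files) {
            String markerName = marker.getFileName().toString();
            if (!markerName.endsWith(MARKER)) {
                continue;
            }
            // "bucket-88-Compacted" -> "bucket-88-"; the trailing dash keeps
            // "bucket-88-" from matching "bucket-880-Data.db".
            String prefix = markerName.substring(0, markerName.length() - MARKER.length()) + "-";
            for (Path p : files) {
                String name = p.getFileName().toString();
                if (name.startsWith(prefix) && !name.equals(markerName)
                        && Files.deleteIfExists(p)) {
                    deleted.add(name);
                }
            }
            // Remove the marker last, so an interrupted run can be retried.
            if (Files.deleteIfExists(marker)) {
                deleted.add(markerName);
            }
        }
        Collections.sort(deleted);
        return deleted;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("sstables");
        for (String n : new String[] {"bucket-88-Compacted", "bucket-88-Data.db",
                                      "bucket-88-Index.db", "bucket-89-Data.db"}) {
            Files.createFile(dir.resolve(n));
        }
        System.out.println("deleted: " + deleteObsolete(dir));
    }
}
```

The marker is deleted after the data files on purpose: if the process dies
partway through, the marker is still there and the next startup can finish
the job.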

CFStoreMBean exposes sstable space used as getLiveDiskSpaceUsed (only
includes size of non-obsolete files) and getTotalDiskSpaceUsed
(includes everything).
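
The difference between the two values is the space still held by obsolete
files. A rough Java sketch of reading both over JMX follows. The attribute
names are derived from the getter names; the ObjectName and the stand-in
bean are hypothetical, used only so the example runs without a Cassandra
node. Against a real node you would connect remotely and use the object
name that jconsole shows for your column family.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.StandardMBean;

public class DiskSpaceReport {
    /** Stand-in exposing the two getters; NOT the real Cassandra interface. */
    public interface SpaceUsedMBean {
        long getLiveDiskSpaceUsed();
        long getTotalDiskSpaceUsed();
    }

    /** Bytes still held by obsolete sstables: total minus live. */
    public static long obsoleteBytes(MBeanServerConnection conn, ObjectName name)
            throws Exception {
        long live = ((Number) conn.getAttribute(name, "LiveDiskSpaceUsed")).longValue();
        long total = ((Number) conn.getAttribute(name, "TotalDiskSpaceUsed")).longValue();
        return total - live;
    }

    /** Registers a stand-in bean with fixed values under a made-up name. */
    public static ObjectName registerStandIn(MBeanServer server,
                                             final long live, final long total)
            throws Exception {
        // Hypothetical object name, chosen for this example only.
        ObjectName name = new ObjectName("example:type=ColumnFamilyStores,name=bucket");
        SpaceUsedMBean bean = new SpaceUsedMBean() {
            public long getLiveDiskSpaceUsed() { return live; }
            public long getTotalDiskSpaceUsed() { return total; }
        };
        if (server.isRegistered(name)) {
            server.unregisterMBean(name);
        }
        server.registerMBean(new StandardMBean(bean, SpaceUsedMBean.class), name);
        return name;
    }

    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = registerStandIn(server, 1000L, 2500L);
        System.out.println("obsolete bytes: " + obsoleteBytes(server, name));
    }
}
```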

On Tue, Apr 20, 2010 at 12:33 PM, B. Todd Burruss <> wrote:
> i'm trying to draw some correlation between the size of my data and the
> space used on disk.  i have set <GCGraceSeconds>1</GCGraceSeconds> so there
> isn't any reason to keep data around.
> my approach is this:
> after only doing "puts" to cassandra for a while i stop my client and want
> to perform the proper "cleanup" and/or "compact" operations that will reduce
> the disk space used to a minimum.  however i can't seem to figure it out.
>  i've done "major compaction", "cleanup", etc. but doesn't seem to get the
> job done
> so two questions
> - what procedure is suggested to get rid of all unnecessary data?
> - and what does the following "Compacted" file mean?  seems like it is
> marking "88" as compacted, but there are no more compactions happening
> according to compaction mgr
> -rw-rw-r-- 1 bburruss bburruss          0 Apr 20 08:32 bucket-88-Compacted
> -rw-rw-r-- 1 bburruss bburruss 1445218042 Apr 19 21:39 bucket-88-Data.db
> -rw-rw-r-- 1 bburruss bburruss   12255925 Apr 19 21:39 bucket-88-Filter.db
> -rw-rw-r-- 1 bburruss bburruss  451806386 Apr 19 21:39 bucket-88-Index.db

