Hi Jeffrey,

I think I described the problem wrong :) I don't want to trigger Java's memory GC. I want Cassandra's GC: that is, I want to "really" remove deleted rows from a column family and get my disk space back.

2012/8/31 Jeffrey Kesselman <jeffpk@gmail.com>
Cassandra, at least, used to do disk cleanup as a side effect of garbage collection, through finalizers. (This is a mistake, for the reason outlined below.)

It is important to understand that you can *never* "force" a GC in Java. Even calling System.gc() is merely a hint to the VM. You are telling the VM that you are *willing* to give up some processor time right now for GC; how much it chooses to actually collect, if anything, is entirely up to the VM.

The *only* garbage collection guarantee in Java is that the VM will make a "best effort" to collect what it can to avoid an out-of-memory error at the time that it runs out of memory. You are not guaranteed when, *if ever*, a given object will actually be collected. Since finalizers run when an object is collected, not when it becomes a candidate for collection, the same is true of the finalizer: you are not guaranteed when, if ever, it will run.
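
A minimal Java sketch of the point above; it is an illustration, not Cassandra code. The class GcHintDemo and the resource TempFile are hypothetical names made up for this example. The first method only *asks* for a collection and reports whether a weakly referenced object happened to be cleared, which may or may not be the case on a given run. The second uses an explicit close() via try-with-resources (Java 7), a different technique from finalizers, to show cleanup that does not depend on the collector at all.

```java
import java.lang.ref.WeakReference;

public class GcHintDemo {

    /** Hypothetical resource that is released in close(), not in a finalizer. */
    static final class TempFile implements AutoCloseable {
        private boolean released = false;

        @Override
        public void close() {
            // Runs when the try-with-resources block exits, independent of GC.
            released = true;
        }

        boolean isReleased() {
            return released;
        }
    }

    /** Requests a collection and reports whether the referent was actually cleared. */
    static boolean collectedAfterHint() {
        WeakReference<Object> ref = new WeakReference<>(new Object());
        System.gc(); // a hint only; the VM is free to do nothing
        return ref.get() == null; // may be true or false, no guarantee either way
    }

    /** Deterministic cleanup: close() has already run when this method returns. */
    static boolean releasedAfterTry() {
        TempFile file = new TempFile();
        try (TempFile f = file) {
            // work with the resource here
        }
        return file.isReleased();
    }

    public static void main(String[] args) {
        System.out.println("collected after System.gc(): " + collectedAfterHint());
        System.out.println("released after try block: " + releasedAfterTry());
    }
}
```

The second line printed is always true; the first depends on the VM and should not be relied on, which is why cleanup tied to collection is fragile.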

On Fri, Aug 31, 2012 at 9:03 AM, Alexander Shutyaev <shutyaev@gmail.com> wrote:
Hi All!

I have a problem with Cassandra. Our application does a lot of overwrites and deletes. If I understand correctly, Cassandra does not actually delete these objects until gc_grace seconds have passed. I tried to "force" GC by setting gc_grace to 0 on an existing column family and running a major compaction afterwards. However, I did not get the disk space back, although I'm pretty sure that my column family should occupy many times less space. We also have a PostgreSQL db, and we duplicate each data operation in both dbs. The PostgreSQL table is much smaller than the corresponding Cassandra column family. Does anyone have any suggestions on how I can analyze my problem? Or maybe I'm doing something wrong and there is another way to force GC on an existing column family.

Thanks in advance,

It's always darkest just before you are eaten by a grue.