Mailing-List: contact user-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@cassandra.apache.org
Received-SPF: pass (nike.apache.org: domain of shutyaev@gmail.com designates
 209.85.223.172 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CACU4Y-ihnbHeoPjit-qiHyuXE0YjgOuQ7nX4NyCrQS_5Rjdd0Q@mail.gmail.com>
References: 
 <CAGBp8g_qF2V_MWLpFb7UTsYG=GMWApHoKSxUE27+s2W0Obdi-Q@mail.gmail.com>
	<CACU4Y-ihnbHeoPjit-qiHyuXE0YjgOuQ7nX4NyCrQS_5Rjdd0Q@mail.gmail.com>
Date: Mon, 3 Sep 2012 09:02:57 +0400
Message-ID: 
 <CAGBp8g_pPV5YRDXttGzOvRhXe-dpzgArSeiU6kSWNSwhhJOx_Q@mail.gmail.com>
Subject: Re: force gc?
From: Alexander Shutyaev <shutyaev@gmail.com>
To: user@cassandra.apache.org
Content-Type: multipart/alternative; boundary=bcaec555510014c19e04c8c50e3f

--bcaec555510014c19e04c8c50e3f
Content-Type: text/plain; charset=UTF-8

Hi Jeffrey,

I think I described the problem badly :) I don't mean Java's memory GC. I
mean Cassandra's own GC: I want to "really" remove deleted rows from a
column family and get my disk space back.
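
To make clear what I mean by Cassandra's GC, here is a minimal sketch of the
time check as I understand it: a delete leaves a marker (tombstone) behind, and
compaction may drop it only once gc_grace seconds have passed since the delete.
The class and method names below are made up for illustration and are not
Cassandra's actual API. The sketch covers only the time check and assumes
nothing else keeps the data on disk.

```java
import java.util.concurrent.TimeUnit;

/**
 * Sketch of the time-based part of tombstone purging.
 * All names here are hypothetical, not taken from Cassandra's code.
 */
final class TombstonePurgeSketch {

    /**
     * Returns true once more than gcGraceSec seconds have elapsed
     * between the delete (deletedAtSec) and the compaction (nowSec).
     */
    static boolean isPastGcGrace(long deletedAtSec, int gcGraceSec, long nowSec) {
        long gcBefore = nowSec - gcGraceSec; // deletes older than this may be purged
        return deletedAtSec < gcBefore;
    }

    public static void main(String[] args) {
        long now = TimeUnit.MILLISECONDS.toSeconds(System.currentTimeMillis());
        long deletedAt = now - 3600; // a delete issued one hour ago

        // 864000 seconds (10 days) is the default gc_grace
        System.out.println("gc_grace=864000: " + isPastGcGrace(deletedAt, 864000, now));
        // with gc_grace set to 0 the same delete is already eligible
        System.out.println("gc_grace=0:      " + isPastGcGrace(deletedAt, 0, now));
    }
}
```

This is why I expected gc_grace = 0 followed by a major compaction to shrink
the column family: by this check, every delete older than the compaction start
should be eligible.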

2012/8/31 Jeffrey Kesselman <jeffpk@gmail.com>

> Cassandra at least used to do disc cleanup as a side effect of
> garbage collection through finalizers.  (This is a mistake for the
> reason outlined below.)
>
> It is important to understand that you can *never* "force" a gc in Java.
> Even calling System.gc() is merely a hint to the VM. What you are doing is
> telling the VM that you are *willing* to give up some processor time right
> now to gc; how much it chooses to actually collect, if anything, is totally
> up to the VM.
>
> The *only* garbage collection guarantee in Java is that it will make a
> "best effort" to collect what it can to avoid an out of memory exception at
> the time that it runs out of memory.  You are not guaranteed when, *if
> ever*, a given object will actually be collected.  Since finalizers run
> when an object is collected, and not when it becomes a candidate for
> collection, the same is true of the finalizer.  You are
> not guaranteed when, if ever, it will run.
>
>
> On Fri, Aug 31, 2012 at 9:03 AM, Alexander Shutyaev <shutyaev@gmail.com> wrote:
>
>> Hi All!
>>
>> I have a problem with using Cassandra. Our application does a lot of
>> overwrites and deletes. If I understand correctly, Cassandra does not
>> actually delete these objects until gc_grace seconds have passed. I tried
>> to "force" gc by setting gc_grace to 0 on an existing column family and
>> running a major compaction afterwards. However, I did not get disk space
>> back, although I'm pretty sure that my column family should occupy many
>> times less space. We also have a PostgreSQL db, and we duplicate each
>> data operation in both dbs. The PostgreSQL table is much
>> smaller than the corresponding Cassandra column family. Does anyone have
>> any suggestions on how I can analyze my problem? Or maybe I'm doing
>> something wrong and there is another way to force gc on an existing column
>> family.
>>
>> Thanks in advance,
>> Alexander
>>
>
>
>
> --
> It's always darkest just before you are eaten by a grue.
>
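
On the finalizer point quoted above: the usual alternative is to release
resources explicitly at a known point instead of waiting for the collector. A
minimal sketch, assuming a hypothetical ScratchFile class that owns a temporary
file (it is not taken from Cassandra's code):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical owner of an on-disk file; the name is made up for illustration. */
final class ScratchFile implements AutoCloseable {
    private final Path path;

    ScratchFile() throws IOException {
        this.path = Files.createTempFile("scratch", ".tmp");
    }

    Path path() {
        return path;
    }

    /** Runs when the try block exits, not whenever the collector gets to it. */
    @Override
    public void close() throws IOException {
        Files.deleteIfExists(path);
    }

    /** Creates a file, lets try-with-resources close it, and reports whether it is still on disk. */
    static boolean existsAfterClose() {
        Path p;
        try (ScratchFile f = new ScratchFile()) {
            p = f.path();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Files.exists(p);
    }

    public static void main(String[] args) {
        System.out.println("file still exists after close: " + existsAfterClose());
    }
}
```

Here the file is deleted as soon as the try block ends, with no System.gc()
call and no dependence on when, or whether, the object is collected.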

--bcaec555510014c19e04c8c50e3f--