Date: Mon, 3 Sep 2012 09:02:57 +0400
Subject: Re: force gc?
From: Alexander Shutyaev
To: user@cassandra.apache.org

Hi Jeffrey,

I think I described the problem wrong :) I don't want to do Java's memory
GC. I want to do Cassandra's GC, that is, I want to "really" remove deleted
rows from a column family and get my disk space back.

2012/8/31 Jeffrey Kesselman <jeffpk@gmail.com>

> Cassandra at least used to do disc cleanup as a side effect of
> garbage collection through finalizers. (This is a mistake for the
> reason outlined below.)
>
> It is important to understand that you can *never* "force" a GC in Java.
> Even calling System.gc() is merely a hint to the VM. What you are doing is
> telling the VM that you are *willing* to give up some processor time right
> now to GC; how much it chooses to actually collect or not collect is
> totally up to the VM.
>
> The *only* garbage collection guarantee in Java is that it will make a
> "best effort" to collect what it can to avoid an out-of-memory exception at
> the time that it runs out of memory. You are not guaranteed when, *if
> ever*, a given object will actually be collected. Since finalizers run
> when an object is collected, and not when it becomes a candidate for
> collection, the same is true of the finalizer. You are
> not guaranteed when, if ever, it will run.
>
>
> On Fri, Aug 31, 2012 at 9:03 AM, Alexander Shutyaev <shutyaev@gmail.com> wrote:
>
>> Hi All!
>>
>> I have a problem with using Cassandra. Our application does a lot of
>> overwrites and deletes. If I understand correctly, Cassandra does not
>> actually delete these objects until gc_grace seconds have passed. I tried
>> to "force" GC by setting gc_grace to 0 on an existing column family and
>> running a major compaction afterwards. However, I did not get disk space
>> back, although I'm pretty sure that my column family should occupy many
>> times less space. We also have a PostgreSQL db, and we duplicate each
>> data operation in both dbs. The PostgreSQL table is much smaller than
>> the corresponding Cassandra column family. Does anyone have any
>> suggestions on how I can analyze my problem? Or maybe I'm doing
>> something wrong and there is another way to force GC on an existing
>> column family.
>>
>> Thanks in advance,
>> Alexander
>>
>
>
>
> --
> It's always darkest just before you are eaten by a grue.
>
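
For reference, the gc_grace rule mentioned in the original question can be sketched roughly as follows. This is a simplified model, not Cassandra's actual code: the class and method names are hypothetical, and it assumes a tombstone becomes eligible for removal at compaction once its deletion time is older than "now minus gc_grace". Real compactions may apply further conditions (for example, whether SSTables outside the compaction still hold older data for the same row), which this sketch ignores.

```java
import java.util.concurrent.TimeUnit;

/** Simplified, hypothetical model of the gc_grace purge rule; not Cassandra's code. */
final class TombstoneSketch {

    private TombstoneSketch() {}

    /** Cutoff in seconds since the epoch: tombstones older than this may be dropped. */
    static long gcBefore(long nowSeconds, int gcGraceSeconds) {
        return nowSeconds - gcGraceSeconds;
    }

    /** True if a tombstone written at deletedAtSeconds is past its grace period. */
    static boolean isPurgeable(long deletedAtSeconds, long nowSeconds, int gcGraceSeconds) {
        return deletedAtSeconds < gcBefore(nowSeconds, gcGraceSeconds);
    }

    public static void main(String[] args) {
        long now = TimeUnit.MILLISECONDS.toSeconds(System.currentTimeMillis());
        long deletedAnHourAgo = now - 3600;
        int tenDays = 864000; // ten days, expressed in seconds

        // With a ten-day grace period the hour-old tombstone must be kept.
        System.out.println(isPurgeable(deletedAnHourAgo, now, tenDays));
        // With gc_grace set to 0 the same tombstone is eligible at the next compaction.
        System.out.println(isPurgeable(deletedAnHourAgo, now, 0));
    }
}
```

Under this model, setting gc_grace to 0 only makes tombstones eligible for removal; the space is reclaimed when a compaction actually rewrites the SSTables that contain them.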