References: <2121177157.226992.1306450684707.JavaMail.root@mail-1.01.com>
Date: Thu, 26 May 2011 22:27:36 -0400
Subject: Re: Forcing Cassandra to free up some space
From: Jeffrey Kesselman
To: user@cassandra.apache.org

I'm also not sure that will guarantee all space is cleaned up. It
really depends on what you are doing inside Cassandra.

If you have your own garbage collection that is just in some way tied
to the GC run, then it will run when it runs.

If, on the other hand, you are associating records in your storage with
specific objects in memory and using one of the post-mortem hooks
(finalize or PhantomReference) to tell you to clean up that particular
record, then it's quite possible they won't all get cleaned up. In
general, HotSpot does not find and clean every candidate object on
every GC run. It starts with the easiest/fastest to find and then sees
what more it thinks it needs to do to free up enough memory for
anticipated near-future needs.

On Thu, May 26, 2011 at 10:16 PM, Jonathan Ellis wrote:
> In summary, System.gc works fine unless you've deliberately done
> something like setting the -XX:+DisableExplicitGC flag.
>
> On Thu, May 26, 2011 at 5:58 PM, Konstantin Naryshkin wrote:
>> So, in summary, there is no way to predictably and efficiently tell Cassandra to get rid of all of the extra space it is using on disk?
>>
>> ----- Original Message -----
>> From: "Jeffrey Kesselman"
>> To: user@cassandra.apache.org
>> Sent: Thursday, May 26, 2011 8:57:49 PM
>> Subject: Re: Forcing Cassandra to free up some space
>>
>> Which JVM? Which collector? There have been and continue to be many.
>>
>> HotSpot itself supports a number of different collectors with
>> different behaviors. Many of them do not collect every candidate on
>> every GC, but merely the easiest ones to find. This is why depending
>> on finalizers is a *bad* idea in Java code. They may well never get
>> run. (Finalizer is one of a few features the Sun Java team always
>> regretted putting in Java to start with. It has caused quite a few
>> application problems over the years.)
>>
>> The really important thing is that NONE of these behaviors of the
>> collectors are guaranteed by specification not to change from version
>> to version. Basing your code on non-specified behaviors is a good way
>> to hit mysterious failures on updates.
>>
>> For instance, in the mid 90s, IBM had a mode of their VM called
>> "infinite heap." It *never* garbage collected, even if you called
>> System.gc. Instead it just threw away address space and counted on
>> the total memory needs for the life of the program being less than the
>> total addressable space of the processor.
>>
>> It was *very* fast for certain kinds of applications.
>>
>> Far from being pedantic, not depending on undocumented behavior is
>> simply good engineering.
>>
>>
>> On Thu, May 26, 2011 at 4:51 PM, Jonathan Ellis wrote:
>>> I've read the relevant source. While you're pedantically correct re
>>> the spec, you're wrong as to what the JVM actually does.
>>>
>>> On Thu, May 26, 2011 at 3:14 PM, Jeffrey Kesselman wrote:
>>>> Some references...
>>>>
>>>> "An object enters an unreachable state when no more strong references
>>>> to it exist. When an object is unreachable, it is a candidate for
>>>> collection.
>>>> Note the wording: Just because an object is a candidate
>>>> for collection doesn't mean it will be immediately collected. The JVM
>>>> is free to delay collection until there is an immediate need for the
>>>> memory being consumed by the object."
>>>>
>>>> http://java.sun.com/docs/books/performance/1st_edition/html/JPAppGC.fm.html#998394
>>>>
>>>> and "Calling the gc method suggests that the Java Virtual Machine
>>>> expend effort toward recycling unused objects"
>>>>
>>>> http://download.oracle.com/javase/6/docs/api/java/lang/System.html#gc()
>>>>
>>>> It goes on to say that the VM will make a "best effort", but "best
>>>> effort" is *deliberately* left up to the definition of the GC
>>>> implementor.
>>>>
>>>> I guess you missed the many lectures I have given on this subject over
>>>> the years at JavaOne conferences....
>>>>
>>>> On Thu, May 26, 2011 at 3:53 PM, Jonathan Ellis wrote:
>>>>> It's a common misunderstanding that System.gc is only a suggestion; on
>>>>> any VM you're likely to run Cassandra on, System.gc will actually
>>>>> invoke a full collection.
>>>>>
>>>>> On Thu, May 26, 2011 at 2:18 PM, Jeffrey Kesselman wrote:
>>>>>> Actually this is no guarantee. It's a common misunderstanding that
>>>>>> System.gc "forces" GC. It does not. It is a suggestion only. The VM
>>>>>> always has the option as to when and how much it GCs.
>>>>>>
>>>>>> On May 26, 2011 2:51 PM, "Jonathan Ellis" wrote:
>>>>>>
>>>>>
>>>>> --
>>>>> Jonathan Ellis
>>>>> Project Chair, Apache Cassandra
>>>>> co-founder of DataStax, the source for professional Cassandra support
>>>>> http://www.datastax.com
>>>>>
>>>>
>>>> --
>>>> It's always darkest just before you are eaten by a grue.
>>>>
>

--
It's always darkest just before you are eaten by a grue.
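
To make the point about post-mortem hooks concrete, here is a minimal sketch of the alternative: release each record explicitly instead of waiting for finalize or a PhantomReference to fire after some GC run. It assumes Java 8 or later, and every name in it (ExplicitCleanupSketch, RecordHandle, liveRecords, openAndClose) is hypothetical and made up for illustration; none of it is Cassandra code or a Cassandra API.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ExplicitCleanupSketch {
    // Hypothetical stand-in for storage records tied to in-memory objects.
    static final Set<String> liveRecords = ConcurrentHashMap.newKeySet();

    // Hypothetical handle: registers one record and releases it in close().
    static final class RecordHandle implements AutoCloseable {
        private final String key;
        private boolean closed;

        RecordHandle(String key) {
            this.key = key;
            liveRecords.add(key);
        }

        String key() {
            return key;
        }

        @Override
        public void close() {
            // Idempotent: a second close() is a no-op.
            if (!closed) {
                closed = true;
                liveRecords.remove(key);
            }
        }
    }

    // Opens and closes `count` handles; returns how many records remain registered.
    static int openAndClose(int count) {
        for (int i = 0; i < count; i++) {
            // try-with-resources calls close() when the block exits,
            // whether or not the collector ever runs.
            try (RecordHandle h = new RecordHandle("record-" + i)) {
                if (!liveRecords.contains(h.key())) {
                    throw new IllegalStateException("record was not registered");
                }
            }
        }
        return liveRecords.size();
    }

    public static void main(String[] args) {
        System.out.println("records left: " + openAndClose(100));
    }
}
```

Because the cleanup happens in close(), the result does not depend on which collector is in use or on whether System.gc() does anything at all. A finalizer or PhantomReference could still be layered on top as a safety net, but under this design it would not be the primary mechanism.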