From: Peter Schuller <scode@scode.org>
To: user@cassandra.apache.org
Cc: ruslan usifov
Date: Wed, 9 Mar 2011 20:48:47 +0100
Subject: Re: cassandra and G1 gc

> Does anybody use G1 gc in production? What your impressions?
I don't know if anyone does, but I've tested it very briefly and at least seen it run well for a while ;) (JDK 1.7 trunk builds) I do have some comments about expected behavior though.

But first, for those not familiar with G1, the main motivations for using it with Cassandra probably include:

(1) By design, all collections are compacting, meaning that fragmentation will never become an issue the way it does with the CMS old generation.

(2) It's a *lot* easier to tune than CMS, making deployment easier and configuration less of a hassle for users. Usually you'd specify a pause time goal, and that's it. In the case of Cassandra one would probably still want to force an aggressive trigger for concurrent marking, but I suspect that's about it.

(3) As a result of (1) and other properties of G1, it has the potential to completely eliminate even the occasional stop-the-world full GC, even after extended runtime. (Keyword being *potential*.)

Now, first of all, G1 is still immature compared to CMS. But even if you are in a position where you are willing to trust G1 in some particular JVM version for production use, and even if G1 actually does work well with Cassandra's workload, there is at least one reason why I would urge caution w.r.t. G1 and Cassandra: the fact that Cassandra uses GC as a means of controlling external resources - in this case, sstables.

With CMS it's "kind of" okay, because unreachable objects will be collected on each run of CMS. So by triggering a full GC when it discovers an out-of-disk-space condition, Cassandra can mostly avoid the pitfalls this would otherwise entail (though confusion/impracticality for the user remains, in that sstables linger for longer than they need to).

G1, on the other hand, doesn't do a concurrent mark+sweep like CMS. Instead it divides the heap into regions that are collected individually. While there is a concurrent marking process, it is only used to feed data to the policy that decides which regions to collect.
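As an aside on point (2): a hedged sketch of what such G1 tuning might look like, in the style of cassandra-env.sh. These are standard HotSpot flag names, but I'm assuming their availability and defaults here - check your particular JDK build before relying on them:

```shell
# Sketch only: G1 with a pause-time goal and an earlier concurrent-marking
# trigger. Flag availability/defaults vary by JDK build; verify with
# java -XX:+PrintFlagsFinal -version.
JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"
JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=200"               # soft pause goal
JVM_OPTS="$JVM_OPTS -XX:InitiatingHeapOccupancyPercent=35"  # start marking earlier than default
```

Note this is the whole point: with CMS you'd be juggling a dozen flags to get similar behavior.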
There is no guarantee, or even expectation, that one "cycle" of concurrent marking will cause all regions to be collected. Individual regions may remain uncollected for extended periods of time, or even perpetually.

So while it's iffy of Cassandra to use the GC for managing external resources in the first place (I believe the motivation is the reduced synchronization complexity and/or overhead of determining when an sstable can be deleted), G1 brings the problem much more into the light than CMS does, because one no longer has even the *soft* guarantee that a CMS cycle will allow the sstables to be freed.

Now... in addition, I said G1 has the *potential* to eliminate full GC pauses. I say potential because it's still very possible to have workloads that cause it to effectively fail. In particular, whenever I try to stress it I run into problems where the tracking of inter-region references doesn't scale with lots of inter-region writes. The remembered set scanning costs for those regions go *WAY* up, to the point where the regions are never collected. Eventually, as you rack up more such regions, you end up taking a full GC anyway. Todd Lipcon seemed to hit the very same problem when trying to mitigate GC issues with HBase. For more details, there's the "G1GC Full GCs" thread on hotspot-gc-dev/hotspot-gc-use. Unfortunately I can't provide a link because I haven't found an ML archive that properly reconstructs threads for that list...

I don't know whether this particular problem would in fact be an issue for Cassandra. Extended long-term testing under different kinds of real workloads would probably be required to determine whether G1 is suitable in its current condition.

-- 
/ Peter Schuller
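To make the "GC as resource manager" concern concrete, here is a minimal sketch of the general JVM pattern: tie on-disk cleanup to GC reachability via a phantom reference. This is an illustration of the technique only, not Cassandra's actual code - the SSTableReader class and file handling here are hypothetical:

```java
import java.io.File;
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

// Hypothetical stand-in for an object holding an open sstable.
class SSTableReader {
    final File dataFile;
    SSTableReader(File f) { dataFile = f; }
}

public class GcCleanupSketch {
    // Returns true if the GC noticed the reader was unreachable and we
    // could therefore delete the backing file.
    static boolean collectAndDelete() throws Exception {
        ReferenceQueue<SSTableReader> queue = new ReferenceQueue<SSTableReader>();
        File data = File.createTempFile("sstable", ".db");
        SSTableReader reader = new SSTableReader(data);
        // A phantom reference is enqueued only after the referent becomes
        // unreachable -- i.e. no live code can still be using the file.
        // (The local 'ref' must stay reachable, or the enqueueing is lost.)
        PhantomReference<SSTableReader> ref =
                new PhantomReference<SSTableReader>(reader, queue);
        reader = null; // drop the last strong reference
        Reference<? extends SSTableReader> r = null;
        for (int i = 0; i < 100 && r == null; i++) {
            System.gc();          // *request* a collection; no guarantee given
            r = queue.remove(50); // wait up to 50 ms for the GC to enqueue it
        }
        if (r == null) {
            return false; // GC never got around to collecting this object
        }
        return data.delete(); // safe: no live reader can still use the file
    }

    public static void main(String[] args) throws Exception {
        System.out.println("deleted=" + collectAndDelete());
    }
}
```

The point is the `r == null` branch: with CMS you can expect a full cycle to enqueue the reference; under G1 the region holding the dead reader may simply never be chosen for collection, so the file lingers indefinitely - which is exactly the caution above.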