From: Dan Hendry
Date: Mon, 17 Jan 2011 12:03:12 -0500
To: user@cassandra.apache.org
Subject: Cassandra GC Settings

I am having some reliability problems in my Cassandra cluster which I am almost certain are due to GC. I was about to start delving into the guts of the problem by turning on GC logging, but I have never done any serious Java GC tuning before (time to learn, I guess). As a first step, however, I was hoping to gain some insight into the GC settings shipped with Cassandra 0.7. I realize it's a pretty complicated problem, but I was specifically interested in knowing about:

-XX:SurvivorRatio=8
-XX:MaxTenuringThreshold=1
-XX:CMSInitiatingOccupancyFraction=75

Why are these set the way they are? What specifically was used to determine these settings? Was it purely experimental, or was there a specific, undesirable behavior that adding these settings corrected? From my various web wanderings, I read the survivor ratio and tenuring threshold settings as "Cassandra creates mostly long-lived objects, with objects being promoted very quickly from the young generation to the old generation". Furthermore, the CMSInitiatingOccupancyFraction of 75 (up from the JVM default of 68) means "start GC in the old generation later", presumably to allow Cassandra to use more of the old-generation heap without needlessly trying to free up used space (?).
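For anyone wanting to experiment, these flags are set in conf/cassandra-env.sh in the 0.7 packaging. A sketch of the relevant section (exact surrounding lines may differ by release) looks roughly like this, and GC logging can be switched on in the same place:

```shell
# Sketch of the GC-related JVM options in conf/cassandra-env.sh (Cassandra 0.7).
# Exact contents may vary between releases.
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_OPTS="$JVM_OPTS -XX:+CMSParallelRemarkEnabled"
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=8"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=1"
JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"

# Add something like this to enable GC logging for diagnosis:
JVM_OPTS="$JVM_OPTS -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps"
```

Note that -XX:+UseCMSInitiatingOccupancyOnly makes the JVM honor the 75% threshold strictly, rather than treating it as a starting hint.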
Please correct me if I am misinterpreting these settings.

One of the issues I have been having is extreme node instability when running a major compaction. After 20-30 seconds of operation, the node spends 30+ seconds in (what I believe to be) GC. I have tried halving all memtable thresholds to reduce overall heap usage, but that has not seemed to help with the instability. After one of these blips, I often see log entries like the following:

 INFO [ScheduledTasks:1] 2011-01-17 10:41:21,961 GCInspector.java (line 133) GC for ParNew: 215 ms, 45084168 reclaimed leaving 11068700368 used; max is 12783583232
 INFO [ScheduledTasks:1] 2011-01-17 10:41:28,033 GCInspector.java (line 133) GC for ParNew: 234 ms, 40401120 reclaimed leaving 12144504848 used; max is 12783583232
 INFO [ScheduledTasks:1] 2011-01-17 10:42:15,911 GCInspector.java (line 133) GC for ConcurrentMarkSweep: 45828 ms, 3350764696 reclaimed leaving 9224048472 used; max is 12783583232

Given that the 3 GB of garbage collected via ConcurrentMarkSweep was generated in under 30 seconds, one of the first things I was going to try was increasing the survivor ratio (to 16) and the MaxTenuringThreshold (to 5), to try to keep more objects in the young generation where they can be cleaned up faster. As a more general approach to solving my problem, I was also going to reduce the CMSInitiatingOccupancyFraction to 65. Does this seem reasonable? Obviously, the best answer is to just try it, but I hesitate to start playing with settings when I have only the vaguest notion of what they do and little concept of why they are there in the first place.

Thanks for any help
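P.S. As a back-of-the-envelope check on the GCInspector lines above, a short script (purely illustrative, not part of Cassandra) can turn the "used; max is" figures into heap utilisation percentages:

```python
import re

# The GCInspector messages quoted above (timestamps and log prefix trimmed).
log_lines = [
    "GC for ParNew: 215 ms, 45084168 reclaimed leaving 11068700368 used; max is 12783583232",
    "GC for ParNew: 234 ms, 40401120 reclaimed leaving 12144504848 used; max is 12783583232",
    "GC for ConcurrentMarkSweep: 45828 ms, 3350764696 reclaimed leaving 9224048472 used; max is 12783583232",
]

pattern = re.compile(
    r"GC for (\w+): (\d+) ms, (\d+) reclaimed leaving (\d+) used; max is (\d+)"
)

for line in log_lines:
    collector, ms, reclaimed, used, heap_max = pattern.match(line).groups()
    pct = 100.0 * int(used) / int(heap_max)
    print(f"{collector}: pause {ms} ms, heap {pct:.1f}% full after collection")
```

Running this shows the heap at roughly 86.6% and 95.0% full after the two ParNew collections, i.e. already well past the 75% CMSInitiatingOccupancyFraction, and still 72.2% full even after the 46-second ConcurrentMarkSweep.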