From: Takenori Sato <tsato@cloudian.com>
To: user@cassandra.apache.org
Date: Sun, 16 Jun 2013 09:03:31 +0900
Subject: Re: Reduce Cassandra GC

Uncomment the following lines in "cassandra-env.sh":

JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps"
JVM_OPTS="$JVM_OPTS -XX:+PrintPromotionFailure"
JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc-`date +%s`.log"

> Also can you take a heap dump at 2 diff points so that we can compare it?

No, I'm afraid not. I ordinarily use profiling tools, but I am not aware of
any that could respond during this event.

On Sun, Jun 16, 2013 at 4:44 AM, Mohit Anchlia <mohitanchlia@gmail.com> wrote:

> Can you paste your GC config? Also, can you take a heap dump at 2 different
> points so that we can compare them?
>
> A quick thing to do would be to do a histo live at 2 points and compare.
>
> Sent from my iPhone
>
> On Jun 15, 2013, at 6:57 AM, Takenori Sato <tsato@cloudian.com> wrote:
>
> > INFO [ScheduledTasks:1] 2013-04-15 14:00:02,749 GCInspector.java (line 122) GC for ParNew: 338798 ms for 1 collections, 592212416 used; max is 1046937600
>
> This says that a GC of the New Generation took a very long time, which is
> unusual.
>
> The only situation I am aware of is when a fairly large object is created
> and cannot be promoted to the Old Generation, because it requires a
> *contiguous* memory space so large that none is available at that point in
> time. This is called promotion failure. The promotion has to wait until the
> concurrent collector frees a large enough space, so you experience what
> looks like a stop-the-world pause. But I think it is not a full stop the
> world, only a stop of the new world.
>
> For example, in the case of Cassandra, a large value of
> in_memory_compaction_limit_in_mb can cause this. This is the size limit up
> to which a compaction merges the rows of a key into the latest version in
> memory, so it can create a byte array as large as that limit.
>
> You can confirm this by enabling promotion failure GC logging from now on,
> and by checking which compactions were running at that point in time.
>
> On Sat, Jun 15, 2013 at 10:01 AM, Robert Coli <rcoli@eventbrite.com> wrote:
>
>> On Fri, Jun 7, 2013 at 12:42 PM, Igor <igor@4friends.od.ua> wrote:
>> > If you are talking about 1.2.x then I also have memory problems on an
>> > idle cluster: Java memory constantly and slowly grows up to the limit,
>> > then spends a long time in GC. I have never seen such behaviour for
>> > 1.0.x and 1.1.x, where on an idle cluster Java memory stays at the same
>> > value.
>>
>> If you are not aware of a pre-existing JIRA, I strongly encourage you to:
>>
>> 1) Document your experience of this.
>> 2) Search issues.apache.org for anything that sounds similar.
>> 3) If you are unable to find a JIRA, file one.
>>
>> Thanks!
>>
>> =Rob
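
To line up long pauses with the compactions running at the same time, it may help to pull the timestamps of slow GCInspector entries out of the Cassandra log first. The following is only a rough sketch: the function name long_gc_pauses is made up for illustration, and the field layout is assumed to match the single GCInspector line quoted in this thread (date and time in the third and fourth whitespace-separated fields, then "GC for <collector>: <n> ms"). Other Cassandra versions may log a different format.

```shell
# Hypothetical helper: print "date time collector pause_ms" for every
# GCInspector entry whose pause is at least the given threshold.
# Assumes the log layout of the line quoted in this thread.
long_gc_pauses() {
  threshold_ms="$1"
  log_file="$2"
  awk -v limit="$threshold_ms" '
    /GCInspector/ {
      # Locate the "GC for <collector>: <n> ms" token sequence.
      for (i = 1; i + 4 <= NF; i++) {
        if ($i == "GC" && $(i + 1) == "for" && $(i + 4) == "ms") {
          collector = $(i + 2)
          sub(/:$/, "", collector)      # drop the trailing colon
          pause_ms = $(i + 3) + 0       # force numeric comparison
          if (pause_ms >= limit) {
            print $3, $4, collector, pause_ms
          }
          break
        }
      }
    }
  ' "$log_file"
}
```

For example, "long_gc_pauses 1000 /var/log/cassandra/system.log" would list pauses of one second or more, assuming the log lives at that path (it may not on your installation). The timestamps it prints can then be compared against the compaction entries in the same log.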