From: "Serediuk, Adam"
To: user@cassandra.apache.org
Date: Sat, 7 May 2011 18:54:44 -0400
Subject: Re: Memory Usage During Read
Message-ID: <2CDF4665-8640-4050-87BD-0815E11FD23E@serialssolutions.com>
References: <564509F2-34C9-428C-901F-1A798803745A@serialssolutions.com>

How much memory should a single hot CF with a 128 MB memtable take with row
and key caching disabled during read?

Because I'm seeing heap go from 3.5 GB skyrocketing straight to max
(regardless of the size; 8 GB and 24 GB both do the same), at which time the
JVM will do nothing but full GC and is unable to reclaim any meaningful
amount of memory. Cassandra then becomes unusable.

I see the same behavior with smaller memtables, e.g. 64 MB.

This happens well into the read operation and only on a small number of
nodes in the cluster (1-4 out of a total of 60 nodes).

Sent from my iPhone

On May 6, 2011, at 22:45, "Jonathan Ellis" wrote:

> You don't GC storm without legitimately having a too-full heap. It's
> normal to see occasional full GCs from fragmentation, but that will
> actually compact the heap and everything goes back to normal IF you
> had space actually freed up.
>
> You say you've played w/ memtable size but that would still be my bet.
> Most people severely underestimate how much space this takes (10x in
> memory over serialized size), which will bite you when you have lots
> of CFs defined.
>
> Otherwise, force a heap dump after a full GC and take a look to see
> what's referencing all the memory.
>
> On Fri, May 6, 2011 at 12:25 PM, Serediuk, Adam wrote:
>> We're troubleshooting a memory usage problem during batch reads. We've
>> spent the last few days profiling and trying different GC settings. The
>> symptoms are that after a certain amount of time during reads, one or
>> more nodes in the cluster will exhibit extreme memory pressure followed
>> by a GC storm. We've tried every possible JVM setting and different GC
>> methods, and the issue persists. This points towards something
>> instantiating a lot of objects and keeping references so that they
>> can't be cleaned up.
>>
>> Typically nothing is ever logged other than the GC failures; however,
>> just now one of the nodes emitted logs we've never seen before:
>>
>> INFO [ScheduledTasks:1] 2011-05-06 15:04:55,085 StorageService.java
>> (line 2218) Unable to reduce heap usage since there are no dirty column
>> families
>>
>> We have tried increasing the heap on these nodes to large values, e.g.
>> 24 GB, and still run into the same issue. We're running 8 GB of heap
>> normally, and only one or two nodes will ever exhibit this issue,
>> randomly. We don't use key/row caching, and our memtable sizing is
>> 64mb/0.3. Larger or smaller memtables make no difference in avoiding
>> the issue. We're on 0.7.5, mmap, JNA, and JDK 1.6.0_24.
>>
>> We've somewhat hit the wall in troubleshooting, and any advice is
>> greatly appreciated.
>>
>> --
>> Adam
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
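For anyone reading this thread in the archives: Jonathan's "10x in memory over serialized size" figure can be turned into a rough back-of-the-envelope check against your heap. The sketch below is only an illustration of that rule of thumb, not anything from the Cassandra codebase; the 64 MB threshold is the one Adam mentions, while the CF counts are hypothetical example numbers.

```python
# Back-of-the-envelope heap-pressure estimate using the "10x in memory
# over serialized size" rule of thumb from Jonathan's reply. Illustrative
# only; the multiplier is a heuristic, not a measured constant.

LIVE_OBJECT_MULTIPLIER = 10  # heuristic: in-memory size vs. serialized size


def worst_case_memtable_heap_mb(num_column_families, memtable_threshold_mb):
    """Worst case: every CF's memtable is full just before it flushes."""
    return num_column_families * memtable_threshold_mb * LIVE_OBJECT_MULTIPLIER


# One hot CF at a 64 MB threshold can pin roughly 640 MB of live objects:
print(worst_case_memtable_heap_mb(1, 64))   # 640

# Ten CFs at the same threshold approach 6.4 GB -- most of an 8 GB heap
# before caches, compaction, and read buffers are even counted:
print(worst_case_memtable_heap_mb(10, 64))  # 6400
```

To follow the heap-dump suggestion on a JDK 6 install like Adam's, `jmap -dump:live,format=b,file=cassandra.hprof <pid>` forces a full GC before dumping only the live objects, and the resulting `.hprof` file can be opened in a heap analyzer to see what is holding the references.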