From: Jonathan Ellis
Date: Wed, 16 Dec 2009 17:29:51 -0600
Subject: Re: OOM Exception
To: cassandra-user@incubator.apache.org
In-Reply-To: <766B5A29D28DA442AB229AAEE2AFC44507D7B914F8@SEAMBX.corp.real.com>
References: <766B5A29D28DA442AB229AAEE2AFC44507D7B914F6@SEAMBX.corp.real.com> <766B5A29D28DA442AB229AAEE2AFC44507D7B914F8@SEAMBX.corp.real.com>

How large are the log files being replayed? Can you attach the log
from a replay attempt?

On Wed, Dec 16, 2009 at 5:21 PM, Brian Burruss wrote:
> sorry, thought i included everything ;)
>
> however, i am using beta2
>
> ________________________________________
> From: Jonathan Ellis [jbellis@gmail.com]
> Sent: Wednesday, December 16, 2009 3:18 PM
> To: cassandra-user@incubator.apache.org
> Subject: Re: OOM Exception
>
> What version are you using? 0.5 beta2 fixes the
> using-more-memory-on-startup problem.
>
> On Wed, Dec 16, 2009 at 5:16 PM, Brian Burruss wrote:
>> i'll put my question first:
>>
>> - how can i determine how much RAM is required by cassandra? (for
>>   normal operation and restarting server)
>>
>> *** i've attached my storage-conf.xml
>>
>> i've gotten several more OOM exceptions since i mentioned it a week or
>> so ago. i started from a fresh database a couple days ago and have been
>> adding 2k blocks of data keyed off a random integer at the rate of
>> about 400/sec. i have a 2 node cluster, RF=2, Consistency for
>> read/write is ONE.
>> there are ~70,420,082 2k blocks of data in the database.
>>
>> i used the default memory setup of Xmx1G when i started a couple days
>> ago. as the database grew to ~180G (reported by unix du command) both
>> servers OOM'ed at about the same time, within 10 minutes of each other.
>> well needless to say, my cluster is dead. so i upped the memory to 3G
>> and the servers tried to come back up, but one died again with OOM.
>>
>> Before cleaning the disk and starting over a couple days ago, i played
>> the game of "jack up the RAM", but eventually i didn't want to up it
>> anymore when i got to 5G. the parameter, SSTable.INDEX_INTERVAL, was
>> discussed a few days ago that would change the number of "keys" cached
>> in memory, so i could modify that at the cost of read performance, but
>> doing the math, 3G should be plenty of room.
>>
>> it seems like startup requires more RAM than just normal running.
>>
>> so this of course concerns me.
>>
>> i have the hprof files from when the server initially crashed and when
>> it crashed trying to restart if anyone wants them
>>
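
For reference, the "doing the math" estimate in the quoted message can be
sketched roughly as below. This is only a back-of-the-envelope check, not
how Cassandra accounts for memory. The key count and block size come from
the message; the sampling interval of 128 and the 64 bytes per sampled
entry are assumptions made for illustration (the real per-entry cost
depends on key length and JVM object overhead). With RF=2 on a 2 node
cluster, each node holds every key, so the full key count applies per node.

```python
# Back-of-the-envelope estimate of index-sample memory on one node.
# ASSUMPTIONS (not from the thread): index_interval=128 and
# bytes_per_entry=64 are illustrative values only.

def estimate_index_sample_memory(total_keys, index_interval=128, bytes_per_entry=64):
    """Return (sampled_keys, estimated_bytes) for the in-memory key samples."""
    sampled_keys = total_keys // index_interval  # one key kept per interval
    return sampled_keys, sampled_keys * bytes_per_entry

total_keys = 70_420_082   # "~70,420,082 2k blocks" from the message
block_size = 2 * 1024     # 2k per block, in bytes

raw_bytes = total_keys * block_size  # payload only, before on-disk overhead
sampled_keys, index_bytes = estimate_index_sample_memory(total_keys)

print(f"raw payload per node: {raw_bytes / 1e9:.1f} GB")
print(f"sampled index keys:   {sampled_keys:,}")
print(f"index sample memory:  {index_bytes / 2**20:.1f} MiB")
```

Under these assumptions the key samples need tens of megabytes rather than
gigabytes, which is consistent with the poster's expectation that 3G
should be plenty and suggests the memory is going elsewhere (for example,
log replay on startup, as asked about at the top of this message).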