From: Jonathan Ellis <jbellis@gmail.com>
Date: Wed, 16 Dec 2009 17:29:51 -0600
Message-ID: <e06563880912161529u563a6c5cgc9d874aa7c8e44b@mail.gmail.com>
Subject: Re: OOM Exception
To: cassandra-user@incubator.apache.org

How large are the log files being replayed?

Can you attach the log from a replay attempt?

On Wed, Dec 16, 2009 at 5:21 PM, Brian Burruss <bburruss@real.com> wrote:
> sorry, thought i included everything ;)
>
> however, i am using beta2
>
> ________________________________________
> From: Jonathan Ellis [jbellis@gmail.com]
> Sent: Wednesday, December 16, 2009 3:18 PM
> To: cassandra-user@incubator.apache.org
> Subject: Re: OOM Exception
>
> What version are you using? 0.5 beta2 fixes the
> using-more-memory-on-startup problem.
>
> On Wed, Dec 16, 2009 at 5:16 PM, Brian Burruss <bburruss@real.com> wrote:
>> i'll put my question first:
>>
>> - how can i determine how much RAM is required by cassandra? (for normal operation and restarting server)
>>
>> *** i've attached my storage-conf.xml
>>
>> i've gotten several more OOM exceptions since i mentioned it a week or so ago. i started from a fresh database a couple days ago and have been adding 2k blocks of data keyed off a random integer at the rate of about 400/sec. i have a 2 node cluster, RF=2, Consistency for read/write is ONE. there are ~70,420,082 2k blocks of data in the database.
>>
>> i used the default memory setup of Xmx1G when i started a couple days ago. as the database grew to ~180G (reported by unix du command) both servers OOM'ed at about the same time, within 10 minutes of each other. well needless to say, my cluster is dead. so i upped the memory to 3G and the servers tried to come back up, but one died again with OOM.
>>
>> Before cleaning the disk and starting over a couple days ago, i played the game of "jack up the RAM", but eventually i didn't want to up it anymore when i got to 5G. the parameter, SSTable.INDEX_INTERVAL, was discussed a few days ago that would change the number of "keys" cached in memory, so i could modify that at the cost of read performance, but doing the math, 3G should be plenty of room.
>>
>> it seems like startup requires more RAM than just normal running.
>>
>> so this of course concerns me.
>>
>> i have the hprof files from when the server initially crashed and when it crashed trying to restart if anyone wants them
>>
>
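
For reference, the "doing the math" estimate in the quoted message can be sketched roughly as below. This is only a back-of-the-envelope illustration, not how Cassandra accounts for memory. The key count comes from the message. The interval of 128 and the 100 bytes per sampled entry are assumptions chosen for illustration, and the function name is hypothetical. It also assumes each node holds every key (2 nodes, RF=2) and ignores that keys are sampled separately per SSTable.

```python
# Rough estimate of heap used by sampled index keys.
# Illustrative only: the interval and per-entry size are assumptions,
# not values confirmed in the thread.

def estimate_index_sample_bytes(total_keys, index_interval, bytes_per_entry):
    """Assume one key out of every `index_interval` is kept in memory."""
    sampled_keys = -(-total_keys // index_interval)  # ceiling division
    return sampled_keys * bytes_per_entry

TOTAL_KEYS = 70_420_082   # "~70,420,082 2k blocks" from the message
INDEX_INTERVAL = 128      # assumed value of SSTable.INDEX_INTERVAL
BYTES_PER_ENTRY = 100     # assumed key + offset + object overhead

estimate = estimate_index_sample_bytes(TOTAL_KEYS, INDEX_INTERVAL, BYTES_PER_ENTRY)
print(f"sampled index keys: about {estimate / 2**20:.1f} MiB")
```

Under these assumptions the sampled keys account for tens of megabytes, far below a 3G heap. That is consistent with the suspicion that something other than the steady-state index is using memory at startup.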