To: user@cassandra.apache.org
From: Aaron Morton
Subject: Re: JVM OOM on node startup
Date: Tue, 30 Nov 2010 20:41:10 GMT

Looks like it's trying to load your row cache and running out of memory, probably because you reduced the memory. The cassandra-env.sh script would have been giving it 2GB. A 1GB heap is probably going to be too small.
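
A quick check of those numbers, as a sketch only: it assumes the script sizes the heap at half of physical RAM, which is inferred from the 2GB figure on a 4GB machine and not confirmed from the script itself. The function name is hypothetical.

```python
def assumed_default_heap_mb(system_memory_mb: int) -> int:
    """Heap size under the ASSUMED 'half of physical RAM' rule (not taken from cassandra-env.sh)."""
    return system_memory_mb // 2


vm_memory_mb = 4 * 1024   # the 4GB VMs described in the quoted message
forced_heap_mb = 1024     # the heap size forced in the startup file

default_mb = assumed_default_heap_mb(vm_memory_mb)
print(f"assumed default heap: {default_mb} MB")
print(f"forced heap is {default_mb - forced_heap_mb} MB smaller")
```

Under that assumption the forced setting halves the heap available when the node starts up.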

Was this the same error you were getting before you reduced the memory?

Try deleting the caches; the path is specified by the saved_caches_directory setting in cassandra.yaml.
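
A minimal sketch of that cleanup, assuming saved_caches_directory appears in cassandra.yaml as a plain top-level `key: value` pair on one line. Both function names are hypothetical helpers, not part of Cassandra, and the parsing is a deliberate simplification and not a full YAML parser.

```python
import re
from pathlib import Path
from typing import List, Optional


def find_saved_caches_dir(yaml_text: str) -> Optional[str]:
    """Return the saved_caches_directory value from cassandra.yaml text, or None.

    Simplification: only matches an uncommented top-level `key: value` line.
    """
    match = re.search(
        r"^saved_caches_directory:[ \t]*(\S.*?)[ \t]*$", yaml_text, re.MULTILINE
    )
    if match is None:
        return None
    # Drop optional surrounding quotes around the path.
    return match.group(1).strip("'\"")


def clear_saved_caches(cache_dir: str, dry_run: bool = True) -> List[str]:
    """List the regular files directly inside cache_dir and, unless dry_run, delete them.

    Returns the affected file names in sorted order.
    """
    affected = []
    for path in sorted(Path(cache_dir).iterdir()):
        if path.is_file():
            if not dry_run:
                path.unlink()
            affected.append(path.name)
    return affected
```

Read your cassandra.yaml, pass its text to find_saved_caches_dir, then call clear_saved_caches on the result. It defaults to a dry run, so you can check the file list before passing dry_run=False. Do this while the node is stopped.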

Also, what version are you using? The error "Caused by: javax.management.AttributeNotFoundException: No such attribute: ActiveCount" reminds me of a problem in beta 1.

Hope that helps. 
Aaron

On 01 Dec, 2010, at 09:28 AM, Brayton Thompson <thompsbp@grnoc.iu.edu> wrote:

Hello again.
We have 3 nodes and were testing what happens when a node goes down. There is roughly 10GB of data on each node. The node we "simulated" dying was working just fine under the load. Then we killed it. The ring performed admirably, but upon restarting the node it dies every time with JVM OOM errors. I have forced a JVM heap size of 1024MB in the startup file. (I did this because adaptive heap sizing was causing OOM errors with normal usage.) The machines are 2-core, 4GB RAM VMs.

I've read the Riptano troubleshooting guide (http://www.riptano.com/docs/0.6/troubleshooting/index#nodes-are-dying-with-oom-errors), but I'm not sure whether it applies in this case, since the node is only dying on startup.

Here is a link to the startup logs as it dies.
http://pastebin.com/BEXeVvCX

Thank you for any help you can provide.