To: Cassandra User <user@cassandra.apache.org>
From: Aaron Morton <aaron@thelastpickle.com>
Subject: Fwd: Re: Re: Cassandra 0.7 beta 3 outOfMemory (OOM)
Date: Mon, 06 Dec 2010 20:16:55 GMT

Accidentally sent to me.

Begin forwarded message:

From: Max <cassandra@ajowa.de>
Date: 07 December 2010 6:00:36 AM
To: Aaron Morton <aaron@thelastpickle.com>
Subject: Re: Re: Re: Cassandra 0.7 beta 3 outOfMemory (OOM)

Thank you both for your answers!
After several tests with different parameters we came to the
conclusion that it must be a bug.
It looks very similar to: https://issues.apache.org/jira/browse/CASSANDRA-1014

For both CFs we reduced the thresholds:
- memtable_flush_after_mins = 60 (both CFs are in constant use,
so the other thresholds should trigger first)
- memtable_throughput_in_mb = 40
- memtable_operations_in_millions = 0.3
- keys_cached = 0
- rows_cached = 0

- in_memory_compaction_limit_in_mb = 64

First we disabled caching, later we disabled compaction, and after that we set:
commitlog_sync: batch
commitlog_sync_batch_window_in_ms: 1
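Pulled together, the reduced settings above read roughly like this (an illustrative 0.7-era sketch; the first two lines belong in cassandra.yaml, while the rest are per-CF attributes, shown together here for compactness):

```yaml
# cassandra.yaml (excerpt)
commitlog_sync: batch
commitlog_sync_batch_window_in_ms: 1

# per-CF settings, applied to both Lucandra CFs
memtable_flush_after_mins: 60
memtable_throughput_in_mb: 40
memtable_operations_in_millions: 0.3
keys_cached: 0
rows_cached: 0
in_memory_compaction_limit_in_mb: 64
```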

But our problem still occurs:
while inserting files with Lucandra, memory usage grows slowly
until an OOM crash after about 50 minutes.
@Peter: In our latest test we stopped writing abruptly, but Cassandra
didn't relax; even minutes later it remained at ~90% heap usage.
http://oi54.tinypic.com/2dueeix.jpg

With our heap calculation we should need:
64 MB * 2 * 3 + 1 GB = 1.4 GB
All recent tests were run with 3 GB. I think that should be OK for a
test machine.
Also, the consistency level is ONE.
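The heap arithmetic above can be checked quickly (a minimal sketch, assuming the 2 is the number of CFs and the 3 is the safety factor from the MemtableThresholds wiki formula):

```python
# JVM heap estimate per the MemtableThresholds wiki formula:
# memtable_mb * 3 (copies potentially in flight) * number_of_CFs + ~1 GB base
memtable_mb = 64   # flush size used in the tests above
num_cfs = 2        # the two Lucandra CFs
heap_gb = (memtable_mb * 3 * num_cfs) / 1024.0 + 1
print(heap_gb)  # 1.375, i.e. the ~1.4 GB quoted above
```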

But Aaron is right: Lucandra generates far more than 200 inserts/s.
My 200 documents per second amount to about 200 operations (WriteCount) on
the first CF and about 3000 on the second CF.
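A quick check of the numbers just given, to make the write amplification concrete:

```python
docs_per_sec = 200    # documents indexed per second
cf1_ops = 200         # WriteCount rate reported on the first CF
cf2_ops = 3000        # WriteCount rate reported on the second CF
total_ops = cf1_ops + cf2_ops
print(total_ops)                 # 3200 Cassandra writes/s
print(total_ops / docs_per_sec)  # 16.0 writes per indexed document
```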

But Cassandra crashes even at about 120 documents/s.

Disk I/O, monitored with the Windows performance tools, is moderate on
both disks (the commitlog is on a separate hard disk).

Any ideas?
If it really is a bug, then in my opinion it's very critical.

Aaron Morton <aaron@thelastpickle.com> wrote:

> I remember you have 2 CFs, but what are the settings for:
>
> - memtable_flush_after_mins
> - memtable_throughput_in_mb
> - memtable_operations_in_millions
> - keys_cached
> - rows_cached
>
> - in_memory_compaction_limit_in_mb
>
> Can you do the JVM heap calculation here and see what it says:
> http://wiki.apache.org/cassandra/MemtableThresholds
>
> What consistency level are you writing at? (Checking it's not ZERO.)
>
> When you talk about 200 inserts per second, is that storing 200
> documents through Lucandra or 200 requests to Cassandra? If it's the
> first option, I would assume that generates a lot more actual
> requests into Cassandra. Open up jconsole and take a look at the
> WriteCount settings for the
> CFs: http://wiki.apache.org/cassandra/MemtableThresholds
>
> You could also try setting the compaction thresholds to 0 to disable
> compaction while you are pushing this data in. Then use nodetool to
> compact, and turn the settings back to normal. See cassandra.yaml for
> more info.
>
> I would have thought you could get the writes through with the setup
> you've described so far (even though a single 32-bit node is unusual).
> The best advice is to turn all the settings down (e.g. caches off,
> memtable flush at 64 MB, compaction disabled) and if it still fails, try:
>
> - checking your I/O stats; not sure on Windows, but jconsole has some I/O
> stats. If your I/O cannot keep up, then your server is not fast enough
> for your client load.
> - reducing the client load
>
> Hope that helps.
> Aaron
>
>
=0A>= On 04 Dec, 2010,at 05:23 AM, Max <cassandra@ajowa.de> wrote:
=0A= >
=0A> Hi,
=0A>
=0A> we increased heap space to 3 GB = (with JRocket VM under 32-bit Win with
=0A> 4 GB RAM)
=0A> but= under "heavy" inserts Cassandra is still crashing with OutOfMemory
=0A= > error after a GC storm.
>
> It sounds very similar to
> https://issues.apache.org/jira/browse/CASSANDRA-1177
>
> In our insert tests the average heap usage grows slowly up to the
> 3 GB limit (jconsole monitor over 50 min:
> http://oi51.tinypic.com/k12gzd.jpg), and the CompactionManager queue is
> also growing constantly, up to about 50 jobs pending.
>
> We tried to decrease the CF memtable thresholds, but after about half a
> million inserts it's over.
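For what it's worth, half a million inserts at the ~200/s reported in this thread works out close to the time-to-crash figures described here:

```python
inserts_before_oom = 500_000   # "about half a million inserts"
rate = 200                     # inserts per second
minutes = inserts_before_oom / rate / 60
print(round(minutes, 1))  # 41.7 minutes, close to the ~45 min to first crash
```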
>
> - Cassandra 0.7.0 beta 3
> - single node
> - about 200 inserts/s, ~500 bytes to 1 KB each
>
>
> Is there any other possibility besides slowing down the inserts/s?
>
> What could be an indicator to see whether a node works stably with this
> amount of inserts?
>
> Thank you for your answer,
> Max
>
>
> Aaron Morton <aaron@thelastpickle.com>:
>
>> Sounds like you need to increase the heap size and/or reduce
>> memtable_throughput_in_mb and/or turn off the internal caches.
>> Normally the binary memtable thresholds only apply to bulk load
>> operations, and it's the per-CF memtable_* settings you want to
>> change. I'm not familiar with Lucandra though.
>>
>> See the section on JVM Heap Size here:
>> http://wiki.apache.org/cassandra/MemtableThresholds
>>
>> Bottom line is you will need more JVM heap memory.
>>
>> Hope that helps.
>> Aaron
>>
>> On 29 Nov 2010, at 10:28 PM, cassandra@ajowa.de wrote:
>>
>> Hi community,
>>
>> during my tests I had several OOM crashes.
>> Some hints for tracking down the problem would be nice.
>>
>> First Cassandra crashed after about 45 minutes of the insert test script.
>> During the following tests, the time to OOM got shorter, until it started
>> to crash even in "idle" mode.
>>
>> Here are the facts:
>> - Cassandra 0.7 beta 3
>> - using Lucandra to index about 3 million files of ~1 KB data
>> - inserting with one client into one Cassandra node at about 200 files/s
>> - the Cassandra data files for this keyspace grow to about 20 GB
>> - the keyspace contains only the two Lucandra-specific CFs
>>
>> Cluster:
>> - single Cassandra node on Windows 32-bit, Xeon 2.5 GHz, 4 GB RAM
>> - Java JRE 1.6.0_22
>> - heap space first 1 GB, later increased to 1.3 GB
>>
>> Cassandra.yaml:
>> default + reduced "binary_memtable_throughput_in_mb" to 128
>>
>> CFs:
>> default + reduced
>> min_compaction_threshold: 4
>> max_compaction_threshold: 8
>>
>>
>> I think the problem always appears during compaction,
>> and perhaps it is a result of large rows (some about 170 MB).
>>
>> Are there more options we could use to get by with little memory?
>>
>> Is it a problem of compaction?
>> And how can it be avoided?
>> Slower inserts? More memory?
>> Even lower memtable_throughput or in_memory_compaction_limit?
>> Continuous manual major compaction?
>>
>> I've read
>> http://www.riptano.com/docs/0.6/troubleshooting/index#nodes-are-dying-with-oom-errors
>> - row_size should be fixed since 0.7, and 200 MB is still far from 2 GB
>> - only the key cache is used, and only a little (3600/20000)
>> - after a lot of writes, Cassandra crashes even in idle mode
>> - the memtable size was reduced and there are only 2 CFs
>>
>> Several heap dumps in MAT show 60-99% heap usage by the compaction thread.