cassandra-user mailing list archives

From Max <cassan...@ajowa.de>
Subject Re: Re: Re: Cassandra 0.7 beta 3 outOfMemory (OOM)
Date Tue, 07 Dec 2010 13:11:21 GMT
As far as I can see, Lucandra already uses batch mutations:
https://github.com/tjake/Lucandra/blob/master/src/lucandra/IndexWriter.java#L263
https://github.com/tjake/Lucandra/blob/master/src/lucandra/CassandraUtils.java#L371
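
For reference, here is roughly what a batched insert looks like against
the 0.7 Thrift API (a minimal sketch; the keyspace, CF and row names are
placeholders, and Lucandra's own insert path may differ in detail):

  import java.nio.ByteBuffer;
  import java.util.*;
  import org.apache.cassandra.thrift.*;
  import org.apache.thrift.protocol.TBinaryProtocol;
  import org.apache.thrift.transport.TFramedTransport;
  import org.apache.thrift.transport.TSocket;

  public class BatchMutateSketch {
      public static void main(String[] args) throws Exception {
          // Cassandra 0.7 speaks framed Thrift on port 9160 by default
          TFramedTransport transport =
              new TFramedTransport(new TSocket("localhost", 9160));
          Cassandra.Client client =
              new Cassandra.Client(new TBinaryProtocol(transport));
          transport.open();
          client.set_keyspace("Lucandra"); // placeholder keyspace name

          long ts = System.currentTimeMillis() * 1000; // microsecond timestamps
          Column col = new Column(bytes("field"), bytes("value"), ts);
          ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
          cosc.setColumn(col);
          Mutation m = new Mutation();
          m.setColumn_or_supercolumn(cosc);

          // row key -> CF name -> list of mutations; many rows and columns
          // can be packed into a single batch_mutate round trip
          Map<ByteBuffer, Map<String, List<Mutation>>> mutationMap =
              new HashMap<ByteBuffer, Map<String, List<Mutation>>>();
          mutationMap.put(bytes("docId42"),
              Collections.singletonMap("TermInfo",
                  Collections.singletonList(m)));

          client.batch_mutate(mutationMap, ConsistencyLevel.ONE);
          transport.close();
      }

      static ByteBuffer bytes(String s) {
          return ByteBuffer.wrap(s.getBytes());
      }
  }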

IndexWriter.addDocument() merges all fields into one mutation map.
In addition, instead of "autoCommit" (committing after each document), I
commit only every 10 documents. Where can I monitor incoming requests to
Cassandra? WriteCount and MutationCount (monitored via jconsole) didn't
change noticeably.
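
The batching on my side looks roughly like this (a sketch of what I
described above; setAutoCommit() is an assumption modelled on the Lucene
IndexWriter API, since only addDocument() appears in the linked source):

  import org.apache.lucene.analysis.Analyzer;
  import org.apache.lucene.document.Document;
  import lucandra.IndexWriter;

  public class BatchedIndexer {
      // commit every N documents instead of per-document autoCommit
      static final int BATCH_SIZE = 10;

      static void indexBatched(IndexWriter writer, Analyzer analyzer,
                               Iterable<Document> docs) throws Exception {
          writer.setAutoCommit(false);  // assumed API: don't flush per document
          int pending = 0;
          for (Document doc : docs) {
              // all fields of the doc are merged into one mutation map
              writer.addDocument(doc, analyzer);
              if (++pending % BATCH_SIZE == 0) {
                  writer.commit();      // send the accumulated mutations
              }
          }
          writer.commit();              // flush the remainder
      }
  }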

I had problems opening the JRockit heap dump with MAT, but found
"JRockit Mission Control" instead. Unfortunately I'm not confident
using it.

Here are my observations:
While a HeapByteBuffer was growing (~200 MB) and being flushed during
client inserts, the byte[] usage was growing permanently.
http://oi51.tinypic.com/2uhbdp3.jpg

I used the TypeGraph view to analyze the byte[], but I'm not sure how to interpret it:
http://oi53.tinypic.com/y2d1i.jpg

Thank you!
Max

Aaron Morton <aaron@thelastpickle.com> wrote:
> Jake or anyone else got experience bulk loading into Lucandra ? 
>
> Or does anyone have experience with JRockit?
>
> Max, are you sending one document at a time into Lucene? Can you
> send them in batches (like Solr)? If so, does it reduce the
> number of requests going to Cassandra?
>
> Also, cassandra.bat is configured
> with -XX:+HeapDumpOnOutOfMemoryError, so you should be able to take a
> look at where all the memory is going. The Riptano blog points
> to http://www.eclipse.org/mat/; also
> see http://www.oracle.com/technetwork/java/javase/memleaks-137499.html#gdyrr
>
> Hope that helps. 
>
> Aaron
>
> On 07 Dec 2010, at 09:17 AM, Aaron Morton <aaron@thelastpickle.com> wrote:
>
> Accidentally sent to me.
>
> Begin forwarded message:
> From: Max <cassandra@ajowa.de>
> Date: 07 December 2010 6:00:36 AM
> To: Aaron Morton <aaron@thelastpickle.com>
> Subject: Re: Re: Re: Cassandra 0.7 beta 3 outOfMemory (OOM)
>
> Thank you both for your answer!
> After several tests with different parameters we came to the conclusion
> that it must be a bug.
> It looks very similar to:   
> https://issues.apache.org/jira/browse/CASSANDRA-1014
>
> For both CFs we reduced thresholds:
> - memtable_flush_after_mins = 60 (both CFs are used permanently,
> therefore other thresholds should trigger first)
> - memtable_throughput_in_mb = 40
> - memtable_operations_in_millions = 0.3
> - keys_cached = 0
> - rows_cached = 0
>
> - in_memory_compaction_limit_in_mb = 64
>
> First we disabled caching, later we disabled compaction, and after that we set
> commitlog_sync: batch
> commitlog_sync_batch_window_in_ms: 1
>
> But our problem still appears:
> while inserting files with Lucandra, memory usage grows slowly
> until an OOM crash after about 50 min.
> @Peter: In our latest test we stopped writing abruptly, but Cassandra
> didn't relax and remained at ~90% heap usage even after minutes.
> http://oi54.tinypic.com/2dueeix.jpg
>
> With the heap calculation (memtable_throughput_in_mb x 3 x number of
> hot CFs + 1 GB) we should need:
> 64 MB * 2 * 3 + 1 GB = 1.4 GB
> We ran all recent tests with 3 GB. I think that should be OK for a test
> machine.
> Also, the consistency level is ONE.
>
> But Aaron is right, Lucandra produces even more than 200 inserts/s.
> My 200 documents per second are about 200 operations (WriteCount) on
> the first CF and about 3000 on the second CF (~15 writes per document).
>
> But even with about 120 documents/s cassandra crashes.
>
>
> Disk I/O, monitored with the Windows performance admin tools, is
> moderate on both disks (the commitlog is on a separate hard disk).
>
>
> Any ideas?
> If it's really a bug, in my opinion it's very critical.
>
>
>
> Aaron Morton <aaron@thelastpickle.com> wrote:
>
>> I remember you have 2 CFs, but what are the settings for:
>>
>> - memtable_flush_after_mins
>> - memtable_throughput_in_mb
>> - memtable_operations_in_millions
>> - keys_cached
>> - rows_cached
>>
>> - in_memory_compaction_limit_in_mb
>>
>> Can you do the JVM Heap Calculation here and see what it says
>> http://wiki.apache.org/cassandra/MemtableThresholds
>>
>> What Consistency Level are you writing at? (Checking  it's not Zero) 
>>
>> When you talk about 200 inserts per second, is that storing 200
>> documents through Lucandra or 200 requests to Cassandra? If it's the
>> first option, I would assume that would generate a lot more actual
>> requests into Cassandra. Open up jconsole and take a look at the
>> WriteCount settings for the
>> CFs: http://wiki.apache.org/cassandra/MemtableThresholds
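>>
>> For example, something like this reads WriteCount over JMX instead of
>> clicking through jconsole (a rough sketch; the ObjectName pattern and
>> JMX port match a default 0.7 node as far as I know, and the
>> keyspace/columnfamily values are placeholders):
>>
>>   import javax.management.MBeanServerConnection;
>>   import javax.management.ObjectName;
>>   import javax.management.remote.JMXConnector;
>>   import javax.management.remote.JMXConnectorFactory;
>>   import javax.management.remote.JMXServiceURL;
>>
>>   public class WriteCountProbe {
>>       public static void main(String[] args) throws Exception {
>>           // Cassandra 0.7 exposes JMX on port 8080 by default
>>           JMXServiceURL url = new JMXServiceURL(
>>               "service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi");
>>           JMXConnector jmxc = JMXConnectorFactory.connect(url);
>>           try {
>>               MBeanServerConnection mbs = jmxc.getMBeanServerConnection();
>>               ObjectName cf = new ObjectName(
>>                   "org.apache.cassandra.db:type=ColumnFamilies," +
>>                   "keyspace=Lucandra,columnfamily=TermInfo");
>>               System.out.println("WriteCount = "
>>                   + mbs.getAttribute(cf, "WriteCount"));
>>           } finally {
>>               jmxc.close();
>>           }
>>       }
>>   }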
>>
>> You could also try setting the compaction thresholds to 0 to disable
>> compaction while you are pushing this data in. Then use nodetool to
>> compact and turn the settings back to normal. See cassandra.yaml for
>> more info.
>>
>> I would have thought you could get the writes through with the setup
>> you've described so far (even though a single 32-bit node is unusual).
>> The best advice is to turn all the settings down (e.g. caches off,
>> memtable flush at 64 MB, compaction disabled) and if it still fails try:
>>
>> - checking your IO stats; not sure on Windows, but JConsole has some
>> IO stats. If your IO cannot keep up, then your server is not fast
>> enough for your client load.
>> - reducing the client load
>>
>> Hope that helps. 
>> Aaron
>>
>>
>> On 04 Dec 2010, at 05:23 AM, Max <cassandra@ajowa.de> wrote:
>>
>> Hi,
>>
>> we increased heap space to 3 GB (with the JRockit VM under 32-bit
>> Windows with 4 GB RAM),
>> but under "heavy" inserts Cassandra is still crashing with an
>> OutOfMemory error after a GC storm.
>>
>> It sounds very similar to   
>> https://issues.apache.org/jira/browse/CASSANDRA-1177
>>
>> In our insert tests the average heap usage is slowly growing up to
>> the 3 GB limit (jconsole monitoring over 50 min:
>> http://oi51.tinypic.com/k12gzd.jpg), and the CompactionManager queue
>> is also constantly growing, up to about 50 pending jobs.
>>
>> We tried to decrease the CF memtable thresholds, but after about half
>> a million inserts it's over (OOM).
>>
>> - Cassandra 0.7.0 beta 3
>> - Single Node
>> - about 200 inserts/s, ~500 bytes - 1 KB each
>>
>>
>> Is there no other option besides slowing down the insert rate?
>>
>> What would be an indicator that a node can handle this amount of
>> inserts stably?
>>
>> Thank you for your answer,
>> Max
>>
>>
>> Aaron Morton <aaron@thelastpickle.com>:
>>
>>> Sounds like you need to increase the heap size and/or reduce
>>> memtable_throughput_in_mb and/or turn off the internal caches.
>>> Normally the binary memtable thresholds only apply to bulk load
>>> operations, and it's the per-CF memtable_* settings you want to
>>> change. I'm not familiar with Lucandra though.
>>>
>>> See the section on JVM Heap Size here 
>>> http://wiki.apache.org/cassandra/MemtableThresholds
>>>
>>> Bottom line is you will need more JVM heap memory.
>>>
>>> Hope that helps.
>>> Aaron
>>>
>>> On 29 Nov 2010, at 10:28 PM, cassandra@ajowa.de wrote:
>>>
>>> Hi community,
>>>
>>> during my tests I had several OOM crashes.
>>> Some hints on how to track down the problem would be nice.
>>>
>>> At first Cassandra crashed after about 45 min of the insert test
>>> script. During the following tests the time to OOM got shorter, until
>>> it started to crash even in "idle" mode.
>>>
>>> Here the facts:
>>> - Cassandra 0.7 beta 3
>>> - using Lucandra to index about 3 million files of ~1 KB of data each
>>> - inserting with one client to one Cassandra node at about 200 files/s
>>> - Cassandra data files for this keyspace grow to about 20 GB
>>> - the keyspace only contains the two Lucandra-specific CFs
>>>
>>> Cluster:
>>> - single Cassandra node on Windows 32-bit, Xeon 2.5 GHz, 4 GB RAM
>>> - Java JRE 1.6.0_22
>>> - heap space first 1 GB, later increased to 1.3 GB
>>>
>>> Cassandra.yaml:
>>> default + reduced "binary_memtable_throughput_in_mb" to 128
>>>
>>> CFs:
>>> default + reduced
>>> min_compaction_threshold: 4
>>> max_compaction_threshold: 8
>>>
>>>
>>> I think the problem always appears during compaction,
>>> and perhaps it is a result of large rows (some around 170 MB).
>>>
>>> Are there more options we could use to get by with less memory?
>>>
>>> Is it a problem of compaction?
>>> And how can we avoid it?
>>> Slower inserts? More memory?
>>> Even lower memtable_throughput or in_memory_compaction_limit?
>>> Continuous manual major compaction?
>>>
>>> I've read
>>> http://www.riptano.com/docs/0.6/troubleshooting/index#nodes-are-dying-with-oom-errors
>>> - row_size should be fixed since 0.7, and 200 MB is still far away from 2 GB
>>> - only the key cache is used, a little bit (3600/20000)
>>> - after a lot of writes Cassandra crashes even in idle mode
>>> - the memtable size was reduced and there are only 2 CFs
>>>
>>> Several heap dumps in MAT show 60-99% heap usage in the compaction thread.
