cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CASSANDRA-1177) OutOfMemory on heavy inserts
Date Wed, 09 Jun 2010 15:17:17 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-1177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877099#action_12877099 ]

Jonathan Ellis commented on CASSANDRA-1177:
-------------------------------------------

CSLM (ConcurrentSkipListMap) sucking up the memory sounds like you just have too much unflushed data in your memtables.

Do you have balanced tokens/"load" across the machines?  ("nodetool ring")

I would
 - balance nodes (with move) if necessary, as described at the top of http://wiki.apache.org/cassandra/Operations
 - increase heap size and/or decrease the memtable size and op count flush thresholds, or possibly, if you have some memtables way more active than others, leave the flush thresholds high but reduce MemtableFlushAfterMinutes to flush out the less frequently used ones instead.
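
As a rough illustration of the balancing step, the sketch below computes evenly spaced tokens for a cluster. It assumes RandomPartitioner, whose token space runs from 0 to 2**127; the function name balanced_tokens is made up for this example and is not part of Cassandra. Each resulting token would be the target for a move on the corresponding node, compared against what "nodetool ring" currently reports.

```python
# Sketch only: evenly spaced tokens, assuming RandomPartitioner
# (token space 0 .. 2**127). balanced_tokens is a hypothetical helper,
# not a Cassandra API.

def balanced_tokens(node_count):
    """Return one evenly spaced token per node, in ring order."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    token_space = 2 ** 127
    # Integer arithmetic keeps the full precision of the 127-bit values.
    return [i * token_space // node_count for i in range(node_count)]


if __name__ == "__main__":
    # Six nodes, matching the cluster size in the report below.
    for index, token in enumerate(balanced_tokens(6)):
        print("node %d: %d" % (index, token))
```

If the tokens shown by "nodetool ring" differ a lot from these values, the nodes own unequal shares of the ring, which could explain why only some of them run out of memory.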

> OutOfMemory on heavy inserts
> ----------------------------
>
>                 Key: CASSANDRA-1177
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1177
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.6.2
>         Environment: SunOS 5.10, x86 32bit, Java HotSpot Server VM 11.2-b01 mixed mode
> Sun SDK 1.6.0_12-b04
>            Reporter: Torsten Curdt
>            Priority: Critical
>         Attachments: bug report.zip
>
>
> We have a cluster of 6 Cassandra 0.6.2 nodes running under SunOS (see environment).
> On initial import (using the Thrift API) we see some weird behavior on half the cluster. While cas04-06 look fine, as you can see from the attached munin graphs, the other 3 nodes kept on GCing (see log file) until they became unreachable and went OOM. (This is also why the stats are so spotty: munin could no longer reach the boxes.) We have seen the same behavior on 0.6.2 and 0.6.1. This started after around 100 million inserts.
> Looking at the hprof (which is of course too big to attach) we see lots of ConcurrentSkipListMap$Node instances and quite a few Column objects. Please see the stats attached.
> This looks similar to https://issues.apache.org/jira/browse/CASSANDRA-1014 but we are not sure it really is the same.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

