cassandra-user mailing list archives

From Andrey Stepachev <oct...@gmail.com>
Subject Re: Cassandra OOM on repair.
Date Fri, 15 Jul 2011 15:02:38 GMT
It looks like the key indexes are eating all the memory:

http://paste.kde.org/97213/


2011/7/15 Andrey Stepachev <octo47@gmail.com>

> UPDATE:
>
> I found that:
> a) with a minimum of 10G of memory, Cassandra survives.
> b) I have ~1000 sstables.
> c) CompactionManager uses PrecompactedRows instead of LazilyCompactedRow.
>
> So I have two questions:
> a) If a row is bigger than 64 MB before compaction, why is it compacted in
> memory?
> b) If it is smaller, what eats so much memory?
>
> 2011/7/15 Andrey Stepachev <octo47@gmail.com>
>
>> Hi all.
>>
>> Cassandra constantly runs out of memory (OOM) during repair or compaction.
>> Increasing memory (6G) doesn't help.
>> I can give it more, but I don't think this is a normal situation. The
>> cluster has 4 nodes, RF=3.
>> Cassandra version is 0.8.1.
>>
>> Ring looks like this:
>> Address         DC          Rack   Status  State   Load       Owns    Token
>>                                                                       127605887595351923798765477786913079296
>> xxx.xxx.xxx.66  datacenter1 rack1  Up      Normal  176.96 GB  25.00%  0
>> xxx.xxx.xxx.69  datacenter1 rack1  Up      Normal  178.19 GB  25.00%  42535295865117307932921825928971026432
>> xxx.xxx.xxx.67  datacenter1 rack1  Up      Normal  178.26 GB  25.00%  85070591730234615865843651857942052864
>> xxx.xxx.xxx.68  datacenter1 rack1  Up      Normal  175.2 GB   25.00%  127605887595351923798765477786913079296
>>
>> About the schema:
>> I have big rows (>100k, up to several million). But as far as I know, that
>> is normal for Cassandra.
>> Everything works relatively well until I start long-running pre-production
>> tests. I load data, and after a while (~4 hours) the cluster begins to time
>> out and then some nodes die with OOM.
>> My app retries its sends, so after a short period all nodes go down. Very
>> nasty.
>>
>> But now I can OOM nodes simply by calling nodetool repair.
>> In the logs (http://paste.kde.org/96811/) it is clear how the heap rockets
>> up to the upper limit.
>> cfstats shows: http://paste.kde.org/96817/
>> The config is: http://paste.kde.org/96823/
>> The question is: does anybody know what this means? Why does Cassandra try
>> to load something big into memory at once?
>>
>> A.
>>
>
>
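
A side note on the ring quoted above: the four tokens look evenly spaced, which matches the 25.00% ownership per node. The sketch below is only a sanity check, assuming the cluster uses RandomPartitioner with a token space of 0 to 2**127 (the partitioner is not stated in this thread). The function name balanced_tokens is made up for illustration.

```python
# Assumption: RandomPartitioner, whose token space is [0, 2**127).
RING_SIZE = 2 ** 127


def balanced_tokens(node_count):
    """Return evenly spaced tokens for node_count nodes on the ring."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


# Tokens copied from the quoted nodetool ring output.
reported = [
    0,
    42535295865117307932921825928971026432,
    85070591730234615865843651857942052864,
    127605887595351923798765477786913079296,
]

if __name__ == "__main__":
    expected = balanced_tokens(4)
    for token in expected:
        print(token)
    # Compare the computed tokens with the ones shown in the ring.
    print("ring is balanced:", expected == reported)
```

If the comparison holds, uneven token assignment is unlikely to be the reason a single node runs out of memory, which is consistent with the nearly equal Load values in the ring.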
