incubator-cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: Cassandra OOM on repair.
Date Sun, 17 Jul 2011 20:31:56 GMT
Can't think of any.

On Sun, Jul 17, 2011 at 1:27 PM, Andrey Stepachev <octo47@gmail.com> wrote:
> Looks like the problem is in this code:
>     public IndexSummary(long expectedKeys)
>     {
>         long expectedEntries = expectedKeys / DatabaseDescriptor.getIndexInterval();
>         if (expectedEntries > Integer.MAX_VALUE)
>             // TODO: that's a _lot_ of keys, or a very low interval
>             throw new RuntimeException("Cannot use index_interval of "
>                                        + DatabaseDescriptor.getIndexInterval()
>                                        + " with " + expectedKeys + " (expected) keys.");
>         indexPositions = new ArrayList<KeyPosition>((int)expectedEntries);
>     }
> I have too many keys and too small an index interval (a rough sketch of the
> numbers is below).
> To fix this, I can:
> 1) reduce the number of keys - rewrite the app and sacrifice balance
> 2) increase index_interval - which hurts other column families
> A question:
> Are there any drawbacks to using a different indexInterval for different
> column families in a keyspace? (suppose I write a patch)
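>
> (To make the scale concrete, a rough back-of-the-envelope sketch of the
> summary's heap cost. The key count, interval, and per-entry byte figure below
> are made-up illustrative numbers, not measurements from this cluster:)
>
>     // Illustrative only: estimate the heap held by index summary entries.
>     // All three inputs are assumptions chosen just to show the arithmetic.
>     public class IndexSummaryMath
>     {
>         public static void main(String[] args)
>         {
>             long expectedKeys = 200000000L; // assumed keys on one node
>             int indexInterval = 16;         // assumed; the default is 128
>             int bytesPerEntry = 100;        // assumed key bytes + object overhead + position
>
>             long entries = expectedKeys / indexInterval;     // one summary entry per interval
>             long heapBytes = entries * (long) bytesPerEntry; // resident while sstables are open
>             System.out.println(entries + " summary entries ~= "
>                                + (heapBytes / (1024 * 1024)) + " MB of heap");
>         }
>     }
>
> With those assumed numbers the summaries alone would hold roughly 1.2 GB of
> heap, which is why a smaller interval multiplies the pressure.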
> 2011/7/15 Andrey Stepachev <octo47@gmail.com>
>>
>> Looks like the key indexes are eating all the memory:
>> http://paste.kde.org/97213/
>>
>> 2011/7/15 Andrey Stepachev <octo47@gmail.com>
>>>
>>> UPDATE:
>>> I found that:
>>> a) with a 10G minimum heap, Cassandra survives
>>> b) I have ~1000 sstables
>>> c) CompactionManager uses PrecompactedRow instead of LazilyCompactedRow
>>> So I have a question:
>>> a) if a row is bigger than 64 MB before compaction, why is it compacted in
>>> memory?
>>> b) if it is smaller, what eats so much memory?
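>>>
>>> (For the record, the cutoff in 0.8 is in_memory_compaction_limit_in_mb,
>>> default 64. The snippet below is only shorthand for that decision, not the
>>> real CompactionManager code; the class, method and parameter names are made
>>> up for illustration:)
>>>
>>>     // Hypothetical sketch of the choice between the two row-merge paths.
>>>     public class CompactionPathSketch
>>>     {
>>>         static boolean useInMemoryMerge(long expectedRowSizeBytes, long inMemoryLimitBytes)
>>>         {
>>>             // Rows expected to stay under the limit are built fully in memory
>>>             // (PrecompactedRow); bigger rows fall back to the incremental
>>>             // path (LazilyCompactedRow).
>>>             return expectedRowSizeBytes < inMemoryLimitBytes;
>>>         }
>>>
>>>         public static void main(String[] args)
>>>         {
>>>             long limit = 64L * 1024 * 1024; // in_memory_compaction_limit_in_mb = 64
>>>             System.out.println(useInMemoryMerge(10L * 1024 * 1024, limit));  // true  -> PrecompactedRow
>>>             System.out.println(useInMemoryMerge(200L * 1024 * 1024, limit)); // false -> LazilyCompactedRow
>>>         }
>>>     }
>>>
>>> So if the log shows PrecompactedRow, the estimated merged row size was under
>>> that limit, and the memory presumably goes somewhere else.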
>>> 2011/7/15 Andrey Stepachev <octo47@gmail.com>
>>>>
>>>> Hi all.
>>>> Cassandra constantly OOMs on repair or compaction. Increasing memory (to
>>>> 6G) doesn't help.
>>>> I can give it more, but I think this is not a normal situation.
>>>> The cluster has 4 nodes, RF=3.
>>>> Cassandra version 0.8.1.
>>>> The ring looks like this:
>>>> Address         DC          Rack   Status State   Load       Owns    Token
>>>>                                                                      127605887595351923798765477786913079296
>>>> xxx.xxx.xxx.66  datacenter1 rack1  Up     Normal  176.96 GB  25.00%  0
>>>> xxx.xxx.xxx.69  datacenter1 rack1  Up     Normal  178.19 GB  25.00%  42535295865117307932921825928971026432
>>>> xxx.xxx.xxx.67  datacenter1 rack1  Up     Normal  178.26 GB  25.00%  85070591730234615865843651857942052864
>>>> xxx.xxx.xxx.68  datacenter1 rack1  Up     Normal  175.2 GB   25.00%  127605887595351923798765477786913079296
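>>>>
>>>> (Side note: the tokens above are exactly i * 2**127 / 4, i.e. the ring is
>>>> evenly balanced over the RandomPartitioner token range. A quick sketch that
>>>> reproduces them:)
>>>>
>>>>     import java.math.BigInteger;
>>>>
>>>>     // Prints the balanced tokens for a 4-node ring over 0..2**127.
>>>>     public class BalancedTokens
>>>>     {
>>>>         public static void main(String[] args)
>>>>         {
>>>>             BigInteger ringSize = BigInteger.valueOf(2).pow(127);
>>>>             int nodeCount = 4;
>>>>             for (int i = 0; i < nodeCount; i++)
>>>>                 System.out.println(ringSize.multiply(BigInteger.valueOf(i))
>>>>                                            .divide(BigInteger.valueOf(nodeCount)));
>>>>         }
>>>>     }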
>>>> About the schema:
>>>> I have big rows (>100k, up to several million). But as far as I know, that
>>>> is normal for Cassandra.
>>>> Everything works relatively well until I start long-running pre-production
>>>> tests. I load data, and after a while (~4 hours) the cluster begins to time
>>>> out and then some nodes die with OOM.
>>>> My app retries the sends, so after a short period all nodes are down. Very
>>>> nasty.
>>>> But now I can OOM nodes simply by calling nodetool repair.
>>>> In the logs (http://paste.kde.org/96811/) it is clear how the heap rockets
>>>> up to the upper limit.
>>>> cfstats shows: http://paste.kde.org/96817/
>>>> The config is: http://paste.kde.org/96823/
>>>> The question is: does anybody know what this means? Why does Cassandra try
>>>> to load something big into memory all at once?
>>>> A.
>>
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
