cassandra-user mailing list archives

From Ran Tavory <ran...@gmail.com>
Subject Re: oom in ROW-MUTATION-STAGE
Date Sun, 23 May 2010 17:50:30 GMT
I am disk bound, certainly. I'll try adding more key and row caching, but I
suspect it's a short blanket: if I add more caching I'll have less free
memory and so more chance to OOM again. (Does the cache use soft references
so it won't take memory away from real objects?)
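
(For reference, this is roughly what I'm planning to try; the attribute
names below are just my reading of the 0.6 storage-conf.xml format and the
row cache number is a guess, so correct me if I'm off:

    <!-- keep the existing key cache, add a modest row cache on the hot CF -->
    <ColumnFamily Name="KvImpressions"
                  CompareWith="BytesType"
                  KeysCached="200000"
                  RowsCached="10000"/>

and I'm confirming the disk-bound part with something like "iostat -x 5",
watching %util and await on the data volume.)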

On Sun, May 23, 2010 at 8:15 PM, Jonathan Ellis <jbellis@gmail.com> wrote:

> On Sun, May 23, 2010 at 10:59 AM, Ran Tavory <rantav@gmail.com> wrote:
> > Is there another solution except adding capacity?
>
> Either you need to get more performance/node or increase node count. :)
>
> > How does ConcurrentReads (default 8) affect that? If I expect to have a
> > similar number of reads and writes, should I set ConcurrentReads equal
> > to ConcurrentWrites (default 32)?
>
> You should figure out where the bottleneck is, before tweaking things:
> http://spyced.blogspot.com/2010/01/linux-performance-basics.html
>
> Increasing CR will only help if you are (a) cpu bound and (b) have so
> many cores that 8 threads isn't saturating them.
>
> Sight unseen, my guess is you are disk bound.  iostat can confirm this.
>
> If that's the case then you can try to reduce the disk load w/ row
> cache or key cache.
>
> > On Sun, May 23, 2010 at 5:43 PM, Jonathan Ellis <jbellis@gmail.com>
> wrote:
> >>
> >> looks like reads are backing up, which in turn is making deserialize
> >> back up
> >>
> >> On Sun, May 23, 2010 at 4:25 AM, Ran Tavory <rantav@gmail.com> wrote:
> >> > Here's tpstats on a server with traffic that I think will OOM
> >> > shortly.
> >> > We have 4k pending reads and 123k pending at MESSAGE-DESERIALIZER-POOL.
> >> > Is there something I can do to prevent that? (other than adding
> >> > RAM...)
> >> > Pool Name                    Active   Pending      Completed
> >> > FILEUTILS-DELETE-POOL             0         0             55
> >> > STREAM-STAGE                      0         0              6
> >> > RESPONSE-STAGE                    0         0              0
> >> > ROW-READ-STAGE                    8      4088        7537229
> >> > LB-OPERATIONS                     0         0              0
> >> > MESSAGE-DESERIALIZER-POOL         1    123799       22198459
> >> > GMFD                              0         0         471827
> >> > LB-TARGET                         0         0              0
> >> > CONSISTENCY-MANAGER               0         0              0
> >> > ROW-MUTATION-STAGE                0         0       14142351
> >> > MESSAGE-STREAMING-POOL            0         0             16
> >> > LOAD-BALANCER-STAGE               0         0              0
> >> > FLUSH-SORTER-POOL                 0         0              0
> >> > MEMTABLE-POST-FLUSHER             0         0            128
> >> > FLUSH-WRITER-POOL                 0         0            128
> >> > AE-SERVICE-STAGE                  1         1              8
> >> > HINTED-HANDOFF-POOL               0         0             10
> >> >
> >> > On Sat, May 22, 2010 at 11:05 PM, Ran Tavory <rantav@gmail.com> wrote:
> >> >>
> >> >> The message deserializer has 10m pending tasks before the OOM. What
> >> >> do you think makes the message deserializer blow up? I suspect that
> >> >> when it goes up to 10m pending tasks (I don't know how much memory a
> >> >> task actually takes up) they may consume a lot of memory. Is there a
> >> >> setting I need to tweak? (Or am I barking up the wrong tree?)
> >> >> I'll add the counters from
> >> >> http://github.com/jbellis/cassandra-munin-plugins but I already have
> >> >> most of them monitored, so I attached the graphs of the ones that
> >> >> seemed the most suspicious in the previous email.
> >> >> The system keyspace and HH CF don't look too bad, I think; here they
> >> >> are:
> >> >> Keyspace: system
> >> >>         Read Count: 154
> >> >>         Read Latency: 0.875012987012987 ms.
> >> >>         Write Count: 9
> >> >>         Write Latency: 0.20055555555555554 ms.
> >> >>         Pending Tasks: 0
> >> >>                 Column Family: LocationInfo
> >> >>                 SSTable count: 1
> >> >>                 Space used (live): 2714
> >> >>                 Space used (total): 2714
> >> >>                 Memtable Columns Count: 0
> >> >>                 Memtable Data Size: 0
> >> >>                 Memtable Switch Count: 3
> >> >>                 Read Count: 2
> >> >>                 Read Latency: NaN ms.
> >> >>                 Write Count: 9
> >> >>                 Write Latency: 0.011 ms.
> >> >>                 Pending Tasks: 0
> >> >>                 Key cache capacity: 1
> >> >>                 Key cache size: 1
> >> >>                 Key cache hit rate: NaN
> >> >>                 Row cache: disabled
> >> >>                 Compacted row minimum size: 203
> >> >>                 Compacted row maximum size: 397
> >> >>                 Compacted row mean size: 300
> >> >>                 Column Family: HintsColumnFamily
> >> >>                 SSTable count: 1
> >> >>                 Space used (live): 1457
> >> >>                 Space used (total): 4371
> >> >>                 Memtable Columns Count: 0
> >> >>                 Memtable Data Size: 0
> >> >>                 Memtable Switch Count: 0
> >> >>                 Read Count: 152
> >> >>                 Read Latency: 0.369 ms.
> >> >>                 Write Count: 0
> >> >>                 Write Latency: NaN ms.
> >> >>                 Pending Tasks: 0
> >> >>                 Key cache capacity: 1
> >> >>                 Key cache size: 1
> >> >>                 Key cache hit rate: 0.07142857142857142
> >> >>                 Row cache: disabled
> >> >>                 Compacted row minimum size: 829
> >> >>                 Compacted row maximum size: 829
> >> >>                 Compacted row mean size: 829
> >> >>
> >> >>
> >> >>
> >> >>
> >> >> On Sat, May 22, 2010 at 4:14 AM, Jonathan Ellis <jbellis@gmail.com>
> >> >> wrote:
> >> >>>
> >> >>> Can you monitor cassandra-level metrics like the ones in
> >> >>> http://github.com/jbellis/cassandra-munin-plugins ?
> >> >>>
> >> >>> the usual culprit is compaction but your compacted row size is
> >> >>> small.  nothing else really comes to mind.
> >> >>>
> >> >>> (you should check system keyspace too tho, HH rows can get large)
> >> >>>
> >> >>> On Fri, May 21, 2010 at 2:36 PM, Ran Tavory <rantav@gmail.com> wrote:
> >> >>> > I see some OOM on one of the hosts in the cluster and I wonder
> >> >>> > if there's a formula that'll help me calculate the required
> >> >>> > memory setting given the parameters x,y,z...
> >> >>> > In short, I need advice on:
> >> >>> > 1. How to set up proper heap space and which parameters I should
> >> >>> > look at when doing so.
> >> >>> > 2. Help setting up an alert policy and defining some
> >> >>> > countermeasures or SOS steps an admin can take to prevent further
> >> >>> > degradation of service when alerts fire.
> >> >>> > The OOM is at the row mutation stage and it happens after
> >> >>> > extensive GC activity. (log tail below)
> >> >>> > The server has 16G physical RAM and a 4G Java heap. No other
> >> >>> > significant processes run on the same server. I actually upped the
> >> >>> > Java heap space to 8G but it OOMed again...
> >> >>> > Most of my settings are the defaults, with a few keyspaces and a
> >> >>> > few CFs in each KS. Here's the output of cfstats for the largest
> >> >>> > and most heavily used CF. (Currently reads/writes are stopped but
> >> >>> > the data is there.)
> >> >>> > Keyspace: outbrain_kvdb
> >> >>> >         Read Count: 3392
> >> >>> >         Read Latency: 160.33135908018866 ms.
> >> >>> >         Write Count: 2005839
> >> >>> >         Write Latency: 0.029233923061621595 ms.
> >> >>> >         Pending Tasks: 0
> >> >>> >                 Column Family: KvImpressions
> >> >>> >                 SSTable count: 8
> >> >>> >                 Space used (live): 21923629878
> >> >>> >                 Space used (total): 21923629878
> >> >>> >                 Memtable Columns Count: 69440
> >> >>> >                 Memtable Data Size: 9719364
> >> >>> >                 Memtable Switch Count: 26
> >> >>> >                 Read Count: 3392
> >> >>> >                 Read Latency: NaN ms.
> >> >>> >                 Write Count: 1998821
> >> >>> >                 Write Latency: 0.018 ms.
> >> >>> >                 Pending Tasks: 0
> >> >>> >                 Key cache capacity: 200000
> >> >>> >                 Key cache size: 11661
> >> >>> >                 Key cache hit rate: NaN
> >> >>> >                 Row cache: disabled
> >> >>> >                 Compacted row minimum size: 302
> >> >>> >                 Compacted row maximum size: 22387
> >> >>> >                 Compacted row mean size: 641
> >> >>> > I'm also attaching a few graphs of "the incident"; I hope they
> >> >>> > help. From the graphs it looks like:
> >> >>> > 1. The message deserializer pool is behind, so maybe it's taking
> >> >>> > too much memory. If the graphs are correct, it gets as high as 10m
> >> >>> > pending before the crash.
> >> >>> > 2. row-read-stage has a high number of pending tasks (4k), so
> >> >>> > first of all this isn't good for performance whether it caused the
> >> >>> > OOM or not, and second, it may also have taken up heap space and
> >> >>> > caused the crash.
> >> >>> > Thanks!
> >> >>> >  INFO [GC inspection] 2010-05-21 00:53:25,885 GCInspector.java (line 110)
> >> >>> > GC for ConcurrentMarkSweep: 10819 ms, 939992 reclaimed leaving 4312064504
> >> >>> > used; max is 4431216640
> >> >>> >  INFO [GC inspection] 2010-05-21 00:53:44,605 GCInspector.java (line 110)
> >> >>> > GC for ConcurrentMarkSweep: 9672 ms, 673400 reclaimed leaving 4312337208
> >> >>> > used; max is 4431216640
> >> >>> >  INFO [GC inspection] 2010-05-21 00:54:23,110 GCInspector.java (line 110)
> >> >>> > GC for ConcurrentMarkSweep: 9150 ms, 402072 reclaimed leaving 4312609776
> >> >>> > used; max is 4431216640
> >> >>> > ERROR [ROW-MUTATION-STAGE:19] 2010-05-21 01:55:37,951 CassandraDaemon.java
> >> >>> > (line 88) Fatal exception in thread Thread[ROW-MUTATION-STAGE:19,5,main]
> >> >>> > java.lang.OutOfMemoryError: Java heap space
> >> >>> > ERROR [Thread-10] 2010-05-21 01:55:37,951 CassandraDaemon.java (line 88)
> >> >>> > Fatal exception in thread Thread[Thread-10,5,main]
> >> >>> > java.lang.OutOfMemoryError: Java heap space
> >> >>> > ERROR [CACHETABLE-TIMER-2] 2010-05-21 01:55:37,951 CassandraDaemon.java
> >> >>> > (line 88) Fatal exception in thread Thread[CACHETABLE-TIMER-2,5,main]
> >> >>> > java.lang.OutOfMemoryError: Java heap space
> >> >>> >
> >> >>>
> >> >>>
> >> >>>
> >> >>> --
> >> >>> Jonathan Ellis
> >> >>> Project Chair, Apache Cassandra
> >> >>> co-founder of Riptano, the source for professional Cassandra support
> >> >>> http://riptano.com
> >> >>
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Jonathan Ellis
> >> Project Chair, Apache Cassandra
> >> co-founder of Riptano, the source for professional Cassandra support
> >> http://riptano.com
> >
> >
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>
