cassandra-user mailing list archives

From James Golick <jamesgol...@gmail.com>
Subject Re: Instability and memory problems
Date Sun, 20 Jun 2010 21:23:58 GMT
I opened #1214 about this. I hope people will take a look and provide their
feedback.

https://issues.apache.org/jira/browse/CASSANDRA-1214

Thanks.

On Sun, Jun 20, 2010 at 3:58 PM, James Golick <jamesgolick@gmail.com> wrote:

> uh. wow. I just read up on all this again, and read the code, and I'm a
> little surprised, to be honest.
>
> There's no attempt to manage the total size of the mmap()'d IO, and the
> default buffer allocation is quite sizeable. So, basically, if you have any
> data, over time, you will run out of memory, and there's no way at all to
> control it.
>
> Can we consider changing the default?
>
>
> On Sun, Jun 20, 2010 at 3:37 PM, James Golick <jamesgolick@gmail.com> wrote:
>
>> Thanks for your thoughts. Answers below:
>>
>> On Sun, Jun 20, 2010 at 2:21 PM, Peter Schuller <
>> peter.schuller@infidyne.com> wrote:
>>
>>> > The memory problems I've posted about before have gotten much worse and our
>>> > nodes are becoming incredibly slow/unusable every 24 hours or so. Basically,
>>> > the JVM reports that only 14GB is committed, but the RSS of the process is
>>> > 22GB, and cassandra is completely unresponsive, but still having requests
>>> > routed to it internally, so it completely destroys performance.
>>> > I'm at a loss for how to diagnose this issue.
>>>
>>> Sorry, I don't know the history of this (you mentioned you've alluded
>>> to the problems before), so maybe I am being redundant or missing
>>> something, but:
>>>
>>> (1) Is the machine swapping? (Actively swapping in/out as reported by
>>> e.g. vmstat)
>>>
>>
>> Yes, somewhat, although swappiness is set to 0.
>>
>>
>>> (2) Do the logs indicate that GC is running excessively, thus
>>> indicating an almost-out-of-heap condition?
>>>
>>
>> It runs, but I wouldn't say excessively.
>>
>>
>>> (3) mmap():ed memory that is currently resident will count towards
>>> RSS; if you're using mmap():ed I/O (the default), that is to be
>>> expected.
>>>
>>
>> This is where I'm a little confused. I thought that mmap()'d IO didn't
>> actually allocate memory. I thought it was just IO through a faster code
>> path.
>>
>>
>>> (4) If you are using mmap():ed I/O, that is also in and of itself
>>> something which can cause trouble if the operating system decides to
>>> swap your application out in favor of the mmap()
>>>
>>> (5) If you are swapping (see (1)), try switching from mmap():ed to
>>> standard I/O (due to (4)), and/or try decreasing the swappiness if
>>> you're on Linux (see /proc/sys/vm/swappiness).
>>>
>>
>> I tried switching to standard IO mode, but it was very, very slow. What
>> I'm confused about here is that if mmap()'d IO actually allocates memory
>> that can put pressure on other processes' memory, is there no way to bound
>> that? If not, how can anybody safely use mmap()'d IO on the JVM without
>> risking pushing their process's important pages out of memory?
>>
>> swappiness is already at 0.
>>
>>
>>> (6) Is Cassandra CPU bound or disk bound in general, regardless of
>>> swapping?
>>>
>>
>> Hard to tell because of the paging.
>>
>>
>>>
>>> --
>>> / Peter Schuller
>>>
>>
>>
>
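One way to check point (3) in the quoted thread, that resident mmap()'d pages count towards RSS, is to split a process's RSS by what backs each mapping. The following is a minimal sketch, assuming a Linux host where /proc/<pid>/smaps is readable; the function names rss_by_backing and split_rss are illustrative and not part of Cassandra or any existing tool.

```python
import re
from collections import defaultdict

# Mapping header line in smaps: "start-end perms offset dev inode [path]"
_HEADER = re.compile(
    r"^[0-9a-f]+-[0-9a-f]+\s+\S+\s+[0-9a-f]+\s+\S+\s+\d+\s*(?P<path>.*)$"
)
# Per-mapping resident size line, e.g. "Rss:    40 kB"
_RSS = re.compile(r"^Rss:\s+(\d+)\s+kB$")


def rss_by_backing(smaps_text):
    """Sum resident kB per backing file.

    Mappings without a filesystem path (heap, stack, plain anonymous
    memory) are grouped under the key "[anonymous]".
    """
    totals = defaultdict(int)
    current = None
    for line in smaps_text.splitlines():
        header = _HEADER.match(line)
        if header:
            path = header.group("path").strip()
            current = path if path.startswith("/") else "[anonymous]"
            continue
        rss = _RSS.match(line)
        if rss and current is not None:
            totals[current] += int(rss.group(1))
    return dict(totals)


def split_rss(smaps_text):
    """Return (file_backed_kb, anonymous_kb) for the whole process."""
    totals = rss_by_backing(smaps_text)
    anonymous_kb = totals.get("[anonymous]", 0)
    file_kb = sum(kb for key, kb in totals.items() if key != "[anonymous]")
    return file_kb, anonymous_kb
```

Feeding it the contents of /proc/<pid>/smaps for the Cassandra process would show how much of the 22GB RSS is file-backed (for example, mmap()'d data files) and how much is anonymous memory such as the JVM heap. File-backed pages that have not been modified can be dropped by the kernel without being written to swap, whereas anonymous pages have to be swapped out.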
