Date: Wed, 9 Feb 2011 18:57:25 +0100
Message-ID: <AANLkTimm1YFLnHs5OwaRPKdPeQT0366gf7HdQZHOZWYs@mail.gmail.com>
Subject: Re: Out of control memory consumption
From: Peter Schuller <peter.schuller@infidyne.com>
To: user@cassandra.apache.org

> We are a 12-server cluster. We use the random partitioner with manually
> generated server tokens. Memory usage on one server keeps growing out of
> control. We ran flush, cleared the key and row caches, and ran GC, but heap
> memory usage won't go down. The only way to get heap memory usage to go
> down is to restart Cassandra. We have to do this once a day. All other
> servers have heap memory usage less than 500MB. This issue happened on
> both Cassandra 0.6.6 and 0.6.11.

To be clear: You are not talking about the size of the Java process in
top, but the actual amount of heap used as reported by the JVM via
jmx/jconsole/etc?
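
To make the distinction concrete, here is a minimal sketch of reading the
heap figures the JVM itself reports, through the standard
java.lang.management API (the same data jconsole shows over JMX). The class
name HeapUsageReport and its describe helper are hypothetical, and this
version only inspects the JVM it runs in; pointing it at a remote Cassandra
node over a JMX connection is left out.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper: prints heap usage as the JVM reports it,
// which is not the same thing as the process size shown by top.
final class HeapUsageReport {
    private static final long MB = 1024L * 1024L;

    // Formats a MemoryUsage snapshot in whole megabytes.
    static String describe(MemoryUsage heap) {
        // getMax() returns -1 when the maximum is undefined.
        String max = heap.getMax() < 0 ? "undefined" : (heap.getMax() / MB) + " MB";
        return "used=" + (heap.getUsed() / MB) + " MB"
                + ", committed=" + (heap.getCommitted() / MB) + " MB"
                + ", max=" + max;
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        System.out.println(describe(memory.getHeapMemoryUsage()));
    }
}
```

Note that "used" here includes garbage that has not been collected yet, so
on its own it says little about how much data is actually live.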

Is the amount of memory that you consider high the heap size just
after a concurrent mark/sweep?

Are you actually seeing OOMs, or are you restarting the node
pre-emptively in response to seeing heap usage go up?


> And JVM memory allocation: -Xms3G -Xmx3G

Just FYI: it is entirely expected that the JVM process will be 3G (or a
bit higher) in size, even with standard I/O, and further that heap
usage will approach 3G. The concurrent mark/sweep GC won't trigger
until heap occupancy reaches the initiating occupancy threshold
(assuming a modern Cassandra with default settings).
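
As a rough illustration of that trigger point: on HotSpot the threshold is
set with -XX:CMSInitiatingOccupancyFraction (usually together with
-XX:+UseCMSInitiatingOccupancyOnly) and applies to the old generation, not
the whole heap. The sketch below just does the arithmetic. The 75% value
and the 2 GB old generation are assumptions picked for the example, not
numbers taken from this cluster, and CmsTriggerEstimate is a hypothetical
name.

```java
// Hypothetical helper: estimates the old-generation occupancy at which
// a CMS cycle would be initiated for a given occupancy percentage.
final class CmsTriggerEstimate {
    static long triggerBytes(long oldGenBytes, int initiatingOccupancyPercent) {
        if (oldGenBytes < 0 || initiatingOccupancyPercent < 0
                || initiatingOccupancyPercent > 100) {
            throw new IllegalArgumentException("size must be >= 0 and percent in 0..100");
        }
        return oldGenBytes * initiatingOccupancyPercent / 100;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Assumed for illustration: 2 GB of a 3 GB heap is old generation,
        // and the initiating occupancy fraction is 75.
        long trigger = triggerBytes(2048L * mb, 75);
        System.out.println("CMS would start at about " + (trigger / mb) + " MB of old gen");
    }
}
```

Under those assumptions heap usage can climb well past 1.5 GB before a
concurrent collection even starts, so a node sitting far above 500 MB is
not by itself a sign of a leak.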

If you've got a 3 GB heap and the other nodes stay at 500 MB, the
question is why they *don't* increase in heap usage, unless your
500 MB figure is the actual live data set as evidenced by post-CMS
heap usage.
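
One way to approximate the post-collection figure is
MemoryPoolMXBean.getCollectionUsage(), which reports a pool's usage as of
its most recent collection. The sketch below sums that over the heap pools
of the JVM it runs in. PostGcHeapUsage is a hypothetical name, and because
the pools are collected at different times the sum is only an
approximation of the live data set.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Hypothetical helper: approximates the live data set by summing each
// heap pool's usage as measured right after its most recent collection.
final class PostGcHeapUsage {
    static long postCollectionUsedBytes() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as the code cache
            }
            // May be null if the pool does not support this measurement.
            MemoryUsage afterGc = pool.getCollectionUsage();
            if (afterGc != null) {
                total += afterGc.getUsed();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.gc(); // only a request; the JVM is free to ignore it
        long mb = 1024L * 1024L;
        System.out.println("approx. live data: "
                + (postCollectionUsedBytes() / mb) + " MB");
    }
}
```

If this number stays near 500 MB on the healthy nodes but keeps growing on
the problem node, that points at real retained data rather than garbage
waiting for the next CMS cycle.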

-- 
/ Peter Schuller