Connect with jconsole and watch the memory consumption
graph. Click the force-GC button and watch where the low point is: that
is how much memory is being used for persistent data; the rest
is garbage generated while satisfying queries. Then run a query and watch how
far the graph spikes up; that is how much is needed for the
query. Like others have said, Cassandra isn't using 600 MB of RAM, the Java
Virtual Machine is using 600 MB of RAM, because your settings told it it
could. The JVM will use as much memory as your settings allow it to.
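Roughly the same measurement can also be scripted instead of clicking around in
jconsole; the sketch below is only an illustration, with a placeholder host and
port (Cassandra exposes JMX on 8080 in older releases and 7199 in newer ones):

    import java.lang.management.ManagementFactory;
    import java.lang.management.MemoryMXBean;
    import javax.management.MBeanServerConnection;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class HeapLowPoint {
        public static void main(String[] args) throws Exception {
            // Placeholder host/port: point this at your node's JMX endpoint.
            String url = "service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi";
            JMXConnector connector = JMXConnectorFactory.connect(new JMXServiceURL(url));
            try {
                MBeanServerConnection mbsc = connector.getMBeanServerConnection();
                MemoryMXBean memory = ManagementFactory.newPlatformMXBeanProxy(
                        mbsc, ManagementFactory.MEMORY_MXBEAN_NAME, MemoryMXBean.class);

                long mb = 1024 * 1024;
                System.out.println("heap used before GC: "
                        + memory.getHeapMemoryUsage().getUsed() / mb + " MB");

                // Same effect as jconsole's force-GC button.
                memory.gc();

                System.out.println("heap used after GC (the low point): "
                        + memory.getHeapMemoryUsage().getUsed() / mb + " MB");
                System.out.println("heap max (what your settings allow): "
                        + memory.getHeapMemoryUsage().getMax() / mb + " MB");
            } finally {
                connector.close();
            }
        }
    }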
If you really are putting that little data into your test server, you should be
able to tune everything down to only 256 MB easily (I do this for test instances
of Cassandra that I spin up to run some tests on).
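If you do cap a test instance that low, a quick way to confirm what the JVM was
actually allowed (just an illustrative snippet, nothing Cassandra-specific) is:

    public class HeapCap {
        public static void main(String[] args) {
            long mb = 1024 * 1024;
            Runtime rt = Runtime.getRuntime();
            // maxMemory() reflects the -Xmx cap the JVM was started with;
            // totalMemory() is what it has actually claimed from the OS so far.
            System.out.println("max heap:  " + rt.maxMemory() / mb + " MB");
            System.out.println("committed: " + rt.totalMemory() / mb + " MB");
            System.out.println("used:      " + (rt.totalMemory() - rt.freeMemory()) / mb + " MB");
        }
    }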
Thank you for the tip. The random port assignment
behaviour of JMX was really driving me mad! Good to know there is a solution for
that.
Concerning the rest of the conversation, my only concern is
that, as an administrator and a student, it is hard to constantly watch
Cassandra instances to make sure they don't crash. As much as I love the principle of
Cassandra, being constantly afraid of memory consumption is an issue in my
opinion. That being said, I got a new server with 16 GB of RAM today, but I don't want
Cassandra to eat up everything when it is not needed, because Cassandra will have
some neighbors such as Tomcat and Solr on this server.
And it seems very
weird to me that on my small instance, where I put a lot of constraints in place
(such as setting throughput_memtableInMb to 6), Cassandra uses 600 MB of RAM for 6 MB of data. It
seems like a bit of overkill to me... And so far I have failed to find any
information on what this massive overhead can be...
Thank you for your
answers and for taking the time to answer my questions.
2011/4/6 Paul Choi <email@example.com>
You can use JMX over ssh by doing this:
Basically, you use SSH -D to set up dynamic application-level port forwarding (a SOCKS proxy).
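For example (placeholder names, and the exact ports depend on your setup): run
something like "ssh -D 9999 you@cassandra-node" to open a local SOCKS proxy,
then point your JMX client through it. With jconsole that means adding
-J-DsocksProxyHost=localhost -J-DsocksProxyPort=9999; a rough programmatic
equivalent is:

    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class JmxOverSshTunnel {
        public static void main(String[] args) throws Exception {
            // Assumes "ssh -D 9999 you@cassandra-node" is already running,
            // providing a SOCKS proxy on localhost:9999 into the node's network.
            System.setProperty("socksProxyHost", "localhost");
            System.setProperty("socksProxyPort", "9999");

            // Both the JMX registry port and the randomly assigned second RMI
            // port are reached through the tunnel, which is what makes this
            // usable across a firewall.
            String url = "service:jmx:rmi:///jndi/rmi://cassandra-node:8080/jmxrmi";
            JMXConnector connector = JMXConnectorFactory.connect(new JMXServiceURL(url));
            try {
                System.out.println("connected: " + connector.getConnectionId());
            } finally {
                connector.close();
            }
        }
    }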
In terms of scaling, you'll be able to afford 120 GB of RAM per node in 3 years
if you're successful. Or a machine with much less RAM and flash-based storage.
Seriously, though, the formula in the tuning guidelines is just a guideline.
You can probably get acceptable performance with much less. If not, you can
shard your app such that you host a few CFs per cluster. I doubt you'll need to, though.
Okay, I see. But isn't there a big issue for scaling here?
Imagine that I am the developer of a certain very successful website: in
year 1 I need 20 CFs and might need 8 GB of RAM. In year 2 I need 50 CFs
because I added functionality to my wonderful website; will I then need 20 GB of
RAM? And if in year 3 I have 300 column families, will I need 120 GB of
RAM per node? Or did I miss something about memory consumption?
Thank you very much,
2011/4/4 Peter Schuller <firstname.lastname@example.org>
> And about production: 7 GB of RAM is sufficient? Or is 11 GB the minimum?

Production mem reqs are mostly dependent on

> Thank you for your inputs for the JVM, I'll try to tune that.

If you enable key caching or row caching, you will have to size memory
accordingly as well.