Hello Paul,

Thank you for the tip. JMX's random port allocation was really driving me mad! Good to know there is a solution to that problem.

Concerning the rest of the conversation, my only concern is that, as both an administrator and a student, it is hard to constantly watch Cassandra instances so that they don't crash. As much as I love the principle of Cassandra, being constantly afraid of memory consumption is an issue in my opinion. That being said, I got a new 16 GB server today, but I don't want Cassandra to eat up everything if it is not needed, because it will have neighbors such as Tomcat and Solr on this server.
It also seems very strange to me that on my small instance, where I set a lot of constraints such as throughput_memtableInMb to 6, Cassandra uses 600 MB of RAM for 6 MB of data. That seems like overkill to me, and so far I have failed to find any information on what this massive overhead could be.

Thank you for your answers and for taking the time to answer my questions.

2011/4/6 Paul Choi <paulchoi@plaxo.com>
You can use JMX over ssh by doing this:
Basically, you use ssh -D to set up dynamic application-level port forwarding.
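
As a rough sketch of that approach: ssh -D opens a local SOCKS proxy, and the JMX client's JVM is then told to send its connections through it. The helper names, host names, and port numbers below are placeholders chosen for illustration, not values from this thread; the code only builds the two command lines so they can be inspected before running them.

```python
import shlex


def socks_tunnel_command(ssh_host, socks_port=1080):
    """Build the ssh command that opens a local SOCKS proxy.

    -D <port> asks ssh for dynamic (SOCKS) forwarding on localhost:<port>;
    -N tells ssh not to run a remote command.
    """
    return ["ssh", "-N", "-D", str(socks_port), ssh_host]


def jconsole_command(jmx_host, jmx_port, socks_port=1080):
    """Build a jconsole command that connects through the SOCKS proxy.

    -J passes an option to the JVM running jconsole; socksProxyHost and
    socksProxyPort are the standard Java networking properties.
    """
    url = "service:jmx:rmi:///jndi/rmi://%s:%d/jmxrmi" % (jmx_host, jmx_port)
    return [
        "jconsole",
        "-J-DsocksProxyHost=localhost",
        "-J-DsocksProxyPort=%d" % socks_port,
        url,
    ]


if __name__ == "__main__":
    # Placeholder host and JMX port: substitute your own node and settings.
    print(" ".join(shlex.quote(a) for a in socks_tunnel_command("admin@cassandra1.example.com")))
    print(" ".join(shlex.quote(a) for a in jconsole_command("cassandra1.example.com", 8080)))
```

Run the ssh command first and leave it open, then start jconsole from a second terminal. Because all RMI traffic goes through the proxy, the randomly assigned second port should not need its own firewall rule.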

In terms of scaling, you'll be able to afford 120GB RAM/node in 3 years if you're successful. Or, a machine with much less RAM and flash-based storage. :)
Seriously, though, the formula in the tuning guidelines is only a guideline. You can probably get acceptable performance with much less. If not, you can shard your app so that each cluster hosts only a few CFs. I doubt you'll need to, though.

From: openvictor Open <openvictor@gmail.com>
Reply-To: <user@cassandra.apache.org>
Date: Mon, 4 Apr 2011 18:24:25 -0400
To: <user@cassandra.apache.org>
Subject: Re: Abnormal memory consumption

Okay, I see. But isn't there a big issue for scaling here?
Imagine that I am the developer of a certain very successful website. In year 1 I need 20 CFs, so I might need 8 GB of RAM. In year 2 I need 50 CFs because I added functionality to my wonderful website: will I need 20 GB of RAM? And if in year 3 I have 300 column families, will I need 120 GB of RAM per node? Or did I miss something about memory consumption?
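
The extrapolation behind those figures can be sketched as below. This is only the linear arithmetic implied by the numbers in the question (8 GB for 20 CFs), not Cassandra's actual sizing rule; the function name and its parameters are made up for illustration.

```python
def ram_for_column_families(cf_count, baseline_cfs=20, baseline_gb=8.0):
    """Scale RAM linearly with the number of column families.

    Assumes memory grows in direct proportion to CF count, starting
    from the year-1 baseline of 8 GB for 20 CFs.
    """
    return baseline_gb * cf_count / baseline_cfs


if __name__ == "__main__":
    for year, cfs in [(1, 20), (2, 50), (3, 300)]:
        print("year %d: %d CFs -> %.0f GB" % (year, cfs, ram_for_column_families(cfs)))
```

With that assumption each CF costs 0.4 GB, which gives the 20 GB and 120 GB figures for 50 and 300 CFs.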

Thank you very much,


2011/4/4 Peter Schuller <peter.schuller@infidyne.com>
> And about production: is 7 GB of RAM sufficient, or is 11 GB the minimum?
> Thank you for your input on the JVM; I'll try to tune that.

Production mem reqs are mostly dependent on memtable thresholds:


If you enable key caching or row caching, you will have to adjust
accordingly as well.

/ Peter Schuller