Hi Maxim,
Sorry for the late reply; I was away on a course. Lower memtable_flush_after_mins for your low-traffic CFs. If you have upgraded to 1.0 in the meantime (by the way, 1.0.3 ended up not working for me, after I had converted a lot of data to it), I think there was a discussion about this that you sent me on the group. I never experimented with the new commitlog setting in 1.0, but the DataStax website http://www.datastax.com/docs/1.0/configuration/node_configuration#commitlog-total-space-in-mb says this:


When the commitlog size on a node exceeds this threshold, Cassandra will flush memtables to disk for the oldest commitlog segments, thus allowing those log segments to be removed. This reduces the amount of data to replay on startup, and prevents infrequently-updated column families from keeping commit log segments around indefinitely. This replaces the per-column family storage setting memtable_flush_after_mins.
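
The behaviour described in the quoted paragraph can be pictured with a toy model. This sketch is not Cassandra code: retained_segments, segment_capacity, flush_after_ops and max_segments are hypothetical names, segments are measured in writes rather than megabytes, and max_segments only stands in for commitlog_total_space_in_mb. It assumes a segment can be deleted only once every column family with unflushed writes in it has been flushed.

```python
def retained_segments(writes, segment_capacity, flush_after_ops, max_segments=None):
    """Toy model: number of commit log segments left on disk after `writes`.

    writes           -- sequence of column family names, one per write
    segment_capacity -- writes per segment (stand-in for the segment size)
    flush_after_ops  -- a CF's memtable is flushed after this many writes
    max_segments     -- optional cap (stand-in for commitlog_total_space_in_mb)
    """
    if max_segments is not None and max_segments < 1:
        raise ValueError("max_segments must be at least 1")

    segments = [set()]  # each segment: the CFs with unflushed writes in it
    used = 0            # writes stored in the active (last) segment
    pending = {}        # unflushed writes per CF

    def flush(cf):
        # Flushing a CF makes every segment clean for that CF.
        pending[cf] = 0
        for dirty in segments:
            dirty.discard(cf)
        # Older segments that are now clean can be deleted; keep the active one.
        segments[:-1] = [d for d in segments[:-1] if d]

    for cf in writes:
        if used == segment_capacity:
            if segments[-1]:           # full and still dirty: keep it, start a new one
                segments.append(set())
            used = 0                   # a full but clean segment is simply replaced
        segments[-1].add(cf)
        used += 1
        pending[cf] = pending.get(cf, 0) + 1
        if pending[cf] >= flush_after_ops:
            flush(cf)
        # Cap exceeded: force-flush whatever keeps the oldest segment dirty.
        while max_segments is not None and len(segments) > max_segments:
            for dirty_cf in list(segments[0]):
                flush(dirty_cf)
    return len(segments)


# One busy CF ("data") and one rarely written CF ("meta").
writes = ["meta" if i % 100 == 0 else "data" for i in range(1000)]
uncapped = retained_segments(writes, segment_capacity=50, flush_after_ops=200)
capped = retained_segments(writes, segment_capacity=50, flush_after_ops=200, max_segments=4)
print(uncapped, capped)
```

Without the cap, the rarely written CF never reaches its flush threshold, so every segment it touched stays on disk. With the cap, forced flushes clean the oldest segments and the count stays bounded.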


Tell me if it worked,

On Wed, Dec 14, 2011 at 5:33 PM, Maxim Potekhin <potekhin@bnl.gov> wrote:
Alexandru, Jeremiah --

what setting needs to be tweaked, and what's the recommended value?

I observed similar behavior this morning.


On 11/28/2011 2:53 PM, Jeremiah Jordan wrote:
Yes, the low volume memtables are causing the problem.  Lower the thresholds for those tables if you don't want the commit logs to go crazy.


On 11/28/2011 11:11 AM, Alexandru Dan Sicoe wrote:
Hello everyone,

4-node Cassandra 0.8.5 cluster with RF=2, replica placement strategy = SimpleStrategy, write consistency level = ANY, memtable_flush_after_mins = 1440, memtable_operations_in_millions = 0.1, memtable_throughput_in_mb = 40, max_compaction_threshold = 32, min_compaction_threshold = 4.
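
To get a feel for what these memtable settings imply, here is a rough back-of-the-envelope sketch. It assumes a memtable is flushed as soon as the first of the three thresholds is reached, that writes arrive at a steady rate, and that 1 MB means 2^20 bytes. The function name and the write rates and sizes are made up for illustration, and Cassandra's real size accounting is more involved than bytes per write.

```python
def minutes_until_flush(writes_per_min, bytes_per_write,
                        flush_after_mins=1440,
                        operations_in_millions=0.1,
                        throughput_in_mb=40):
    """Minutes until the first memtable threshold trips (simplified model)."""
    if writes_per_min <= 0 or bytes_per_write <= 0:
        raise ValueError("rates and sizes must be positive")
    # Time to reach the operation-count threshold.
    by_ops = operations_in_millions * 1_000_000 / writes_per_min
    # Time to reach the size threshold (assumes 1 MB = 2**20 bytes).
    by_size = throughput_in_mb * 1024 * 1024 / (writes_per_min * bytes_per_write)
    # Whichever threshold is reached first triggers the flush.
    return min(flush_after_mins, by_ops, by_size)


# Hypothetical rates: a busy data CF and a quiet metadata CF.
busy = minutes_until_flush(writes_per_min=50_000, bytes_per_write=200)
quiet = minutes_until_flush(writes_per_min=10, bytes_per_write=200)
print(busy, quiet)
```

Under these assumed rates the busy CF flushes within minutes on the operations threshold, while the quiet CF only flushes when the 1440-minute timer expires, so its unflushed writes can hold commit log segments for up to a day.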

I have one keyspace with 1 CF for all the data and 3 other small CFs for metadata. I am using Datastax OpsCenter to monitor my cluster so there is another keyspace for monitoring.

Everything works OK. The only thing I've noticed is that this morning the commitlog of one node was 52 GB, one was 25 GB, and the others were around 3 GB. I left everything untouched and looked again a couple of hours later: the 52 GB one is now about 3 GB, the 25 GB one is now 29 GB, and the other two are about the same as before.

Are my commit logs growing because of small memtables which don't get flushed because they don't reach the operations and throughput limits? Then why do only some nodes exhibit this behaviour?

It would be interesting to understand how to control the size of the commitlog, and also how to size my commitlog disks!


Alexandru Dan Sicoe
MEng, CERN Marie Curie ACEOLE Fellow