cassandra-user mailing list archives

From Radim Kolar <>
Subject Re: live ratio counting
Date Tue, 15 May 2012 08:53:05 GMT

> Try reducing memtable_total_space_in_mb config setting. If the problem 
> is incorrect memory metering that should help.
It does not help much, because the difference between the correct 
calculation and the one Cassandra assumes is far too large. It would 
require me to shrink memtables to about 10% of their correct size, 
leading to too much

>> i have 3 workload types running in batch. Delete only workload, 
>> insert only and heavy update (lot of overwrites)
> Are you saying you do a lot of deletes, followed by a lot of inserts 
> and then updates all for the same CF ?
No. The most common workload type is insert only. From time to time a 
batch job does a lot of overwrites in memtables, and occasionally 
cleanup jobs do only deletes. This breaks the liveRatio calculation too, 
because Cassandra assumes not only that the average column size stored 
in a memtable is constant, but also that the overwrite ratio in the 
memtable is constant. If you overwrite too much, Cassandra starts to 
make very tiny sstables; if you delete too much, there is a risk of OOM.
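
The effect described above can be illustrated with a simplified model in which a memtable's heap usage is estimated as serialized bytes multiplied by liveRatio, and a flush is triggered once that estimate reaches a threshold. This is only a sketch of the reasoning, not Cassandra's code: the function names and the 64 MB threshold are hypothetical, and the ratios are borrowed from figures quoted in this thread purely as illustrative values.

```python
# Simplified model (an assumption, not Cassandra's implementation):
#   estimated_heap = serialized_bytes * assumed_ratio
# and a flush happens when estimated_heap reaches the threshold.

MB = 1024 * 1024


def serialized_bytes_at_flush(threshold_bytes, assumed_ratio):
    """Serialized bytes written when the estimate reaches the threshold."""
    return threshold_bytes / assumed_ratio


def real_heap_at_flush(threshold_bytes, assumed_ratio, true_ratio):
    """Heap actually occupied at flush time if the real ratio differs."""
    written = serialized_bytes_at_flush(threshold_bytes, assumed_ratio)
    return written * true_ratio


threshold = 64 * MB  # hypothetical flush threshold

# Overwrite-heavy batch: stale ratio 64 is assumed, the real one is about 4.6.
# The memtable is flushed while it is still small (tiny sstables).
overwrite_heap_mb = real_heap_at_flush(threshold, 64.0, 4.6) / MB

# Opposite case: ratio 10 is assumed, the real one is 80.
# The memtable holds far more heap than estimated (risk of OOM).
underestimated_heap_mb = real_heap_at_flush(threshold, 10.0, 80.0) / MB

print(f"overwrite-heavy: flushed at about {overwrite_heap_mb:.1f} MB of real heap")
print(f"underestimated:  flushed at about {underestimated_heap_mb:.1f} MB of real heap")
```

In this model the same 64 MB threshold yields a flush at roughly 4.6 MB of real heap in the first case and roughly 512 MB in the second, which is why one fixed ratio cannot serve both workloads.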

>> yes. Record is about 120, but it is rare. 80 should be good enough. 
>> Default 10 (if not using jamm) is way too low.
> Can you provide some information on what is stored in the CF and what 
> sort of workload. It would be interesting to understand why the real 
> memory usage is 120 times the serialised size.
super column family:

   and column_metadata = [
     {column_name : 'crc32',
     validation_class : LongType},
     {column_name : 'id',
     validation_class : LongType},
     {column_name : 'name',
     validation_class : AsciiType},
     {column_name : 'size',
     validation_class : LongType}];

But this is not important. The problem is that you do not calculate the 
live ratio frequently enough. If the workload changes, the ratio looks 
like this:

  INFO [MemoryMeter:1] 2012-05-12 21:11:51,649 (line 186) 
CFS(Keyspace='dedup', ColumnFamily='resultcache') liveRatio is 64.0 
(just-counted was 4.633391051722882).  calculation took 111ms for 4465 
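
To see how far apart the two numbers in that log line are, one can extract them and divide. The sketch below assumes the message format is exactly as shown above (joined onto one line, with the truncated tail left as it is); the regular expression and the function name `ratio_mismatch` are illustrative, not part of any Cassandra tooling.

```python
import re

# The log line from above, joined into one string. The tail is truncated
# in the original message and is left as it is.
LOG_LINE = (
    "INFO [MemoryMeter:1] 2012-05-12 21:11:51,649 (line 186) "
    "CFS(Keyspace='dedup', ColumnFamily='resultcache') liveRatio is 64.0 "
    "(just-counted was 4.633391051722882).  calculation took 111ms for 4465"
)

# Assumed message format; adjust if your log lines differ.
PATTERN = re.compile(r"liveRatio is ([0-9.]+) \(just-counted was ([0-9.]+)\)")


def ratio_mismatch(line):
    """Return (ratio_in_use, just_counted, factor), or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    in_use = float(match.group(1))
    counted = float(match.group(2))
    return in_use, counted, in_use / counted


in_use, counted, factor = ratio_mismatch(LOG_LINE)
print(f"ratio in use {in_use}, just counted {counted:.2f}, factor {factor:.1f}x")
```

For this line the ratio in use is about 13.8 times the freshly counted one.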

Why not recalculate it every 5 or 10 minutes? The calculation takes just a few
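
A time-based trigger of the kind suggested here could look roughly like the following. This is a minimal sketch of the idea only, not how Cassandra schedules its MemoryMeter; the class name `PeriodicRecalculation`, its `due` method and the injectable clock are all hypothetical.

```python
import time


class PeriodicRecalculation:
    """Reports when a fixed interval has passed since the last recalculation."""

    def __init__(self, interval_seconds, clock=time.monotonic):
        self.interval_seconds = interval_seconds
        self.clock = clock      # injectable so the logic can be tested
        self.last_run = None    # no recalculation has happened yet

    def due(self):
        """Return True, and record the time, if a recalculation should run now."""
        now = self.clock()
        if self.last_run is None or now - self.last_run >= self.interval_seconds:
            self.last_run = now
            return True
        return False


# Usage: recalculate at most once every 5 minutes.
trigger = PeriodicRecalculation(interval_seconds=5 * 60)
if trigger.due():
    pass  # the live ratio would be measured again here
```

With a trigger like this the ratio follows a changed workload within one interval, at the cost of one measurement per interval.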
