incubator-cassandra-user mailing list archives

From Hefeng Yuan <hfy...@rhapsody.com>
Subject Re: Calculate number of nodes required based on data
Date Wed, 07 Sep 2011 18:55:46 GMT
Adi, just to make sure my calculation is correct: the configured ops threshold is ~2m and we have 6 nodes, so does that mean each node's threshold is around 300k? I do see that when flushing happens, ops is about 300k, with several at 500k. It seems like the ops threshold is throttling us.
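
For reference, those per-flush numbers can be pulled straight out of system.log from the flush lines (same format as the Memtable.java line Adi quoted below). A minimal Python sketch, assuming a 0.8-style log; the path is just an example:

    import re
    import sys

    # Matches e.g. "Writing Memtable-MyCF@1151031968(67138588 bytes, 47430 operations)"
    FLUSH = re.compile(r"Writing Memtable-([^@]+)@\d+\((\d+) bytes, (\d+) operations\)")

    with open(sys.argv[1]) as log:    # e.g. /var/log/cassandra/system.log
        for line in log:
            m = FLUSH.search(line)
            if m:
                cf, nbytes, ops = m.group(1), int(m.group(2)), int(m.group(3))
                print(f"{cf}\t{nbytes / 2**20:.1f} MB\t{ops} ops")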

On Sep 7, 2011, at 11:31 AM, Adi wrote:

> On Wed, Sep 7, 2011 at 2:09 PM, Hefeng Yuan <hfyuan@rhapsody.com> wrote:
> We didn't change MemtableThroughputInMB/min/maxCompactionThreshold; they're 499/4/32.
> As for why we're flushing at ~9m, I guess it has to do with this: http://thelastpickle.com/2011/05/04/How-are-Memtables-measured/
> The only parameter I tried to play with is compaction_throughput_mb_per_sec; I tried cutting it in half and doubling it, but neither seems to help avoid the simultaneous compactions on nodes.
> 
> I agree that we don't necessarily need to add nodes, as long as we have a way to avoid simultaneous compaction on 4+ nodes.
> 
> Thanks,
> Hefeng
> 
> Can you check in the logs for something like this
> ...... Memtable.java (line 157) Writing Memtable-<ColumnFamilyName>@1151031968(67138588 bytes, 47430 operations)
> to see the bytes/operations at which the column family gets flushed. In case you are hitting the operations threshold, you can try increasing it to a high number. The operations threshold is getting hit at less than 2% of the size threshold, so I would try bumping up memtable_operations substantially. The default is 1.1624999999999999 (in millions). Try 10 or 20 and see if your CF flushes at a higher size. Keep adjusting it until the frequency/size of flushing becomes satisfactory and hopefully reduces the compaction overhead.
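> Rough arithmetic with the numbers from your earlier mail (a minimal sketch in Python; the ~9m bytes / ~300k ops figures are yours, and bytes-per-op is a crude average):
> 
>     MB = 1024 * 1024
> 
>     memtable_throughput_mb = 499      # your configured size threshold
>     flush_bytes = 9 * MB              # observed flush size (~9m serialized bytes)
>     flush_ops = 300_000               # observed operations per flush
> 
>     # Flushes happen at a tiny fraction of the size threshold, so the
>     # operations threshold must be the trigger.
>     print(f"{flush_bytes / (memtable_throughput_mb * MB):.1%} of size threshold")  # ~1.8%
> 
>     # If writes stay ops-bound, flush size scales with memtable_operations:
>     bytes_per_op = flush_bytes / flush_ops       # ~31 bytes/op
>     for ops_millions in (10, 20):
>         est_mb = ops_millions * 1e6 * bytes_per_op / MB
>         print(f"memtable_operations={ops_millions}: ~{est_mb:.0f} MB per flush")
>     # at 20, the 499 MB size threshold would trigger first, which is the point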
> 
> -Adi
> 
> On Sep 7, 2011, at 10:51 AM, Adi wrote:
> 
>> 
>> On Wed, Sep 7, 2011 at 1:09 PM, Hefeng Yuan <hfyuan@rhapsody.com> wrote:
>> Adi,
>> 
>> The reason we're attempting to add more nodes is to try to solve the long/simultaneous compactions, i.e. the performance issue, not the storage issue yet.
>> We have RF 5 and CL QUORUM for reads and writes, and currently 6 nodes; when 4 nodes are doing compaction in the same period we're screwed, especially on reads, since with only 2 non-compacting nodes, any quorum of 3 replicas must include at least one compacting node.
>> My assumption is that if we add more nodes, each node will have less load and therefore need less compaction, and will probably compact faster, eventually avoiding 4+ nodes compacting simultaneously.
>> 
>> Any suggestions on how to calculate how many more nodes to add? Or, more generally, how to plan the number of nodes required from a performance perspective?
>> 
>> Thanks,
>> Hefeng
>> 
>> Adding nodes to delay and reduce compaction is an interesting performance use case :-) I am thinking you can find a smarter/cheaper way to manage that.
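>> To put a rough number on the "more nodes" idea: assuming (crudely) that a row's 5 replicas land on a uniformly random set of nodes (real ring placement is not random, so this is only a back-of-the-envelope model), you can estimate the chance that a QUORUM read has 3 non-compacting replicas available:
>> 
>>     from math import comb
>> 
>>     RF, QUORUM = 5, 3
>> 
>>     def p_clean_quorum(nodes, compacting):
>>         # P(at least QUORUM of a row's RF replicas sit on non-compacting
>>         # nodes), with replicas modeled as a uniform random node subset.
>>         healthy = nodes - compacting
>>         hits = sum(comb(healthy, h) * comb(compacting, RF - h)
>>                    for h in range(QUORUM, RF + 1))
>>         return hits / comb(nodes, RF)
>> 
>>     for n in (6, 9, 12, 18):
>>         print(n, "nodes:", f"{p_clean_quorum(n, 4):.0%}")   # 0%, 64%, 85%, 96%
>> 
>> With 6 nodes and 4 compacting you get 0%, which matches what you are seeing; the curve also shows you need quite a few nodes before the odds get comfortable, which is why the knobs below are probably the cheaper fix.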
>> Have you looked at:
>> a) increasing memtable throughput
>> What is the nature of your writes? Is it mostly inserts, or does it also have a lot of quick updates of recently inserted data? Increasing memtable_throughput can delay and maybe reduce the compaction cost if you have lots of updates to the same data. You will have to provide for the extra memory if you try this.
>> When you mentioned "with ~9m serialized bytes", is that the memtable throughput? That is quite a low threshold, which will result in a large number of SSTables needing to be compacted. I think the default is 256 MB, and on the lower end I have seen values of 64 MB or maybe 32 MB.
>> 
>> 
>> b) tweaking min_compaction_threshold and max_compaction_threshold
>> - increasing min_compaction_threshold will delay compactions
>> - decreasing max_compaction_threshold will reduce the number of sstables per compaction cycle
>> Are you using the defaults (4/32), or are you trying different values? (A sketch of setting these per CF follows after item c.)
>> 
>> c) splitting column families
>> Splitting column families can also help, because compactions occur serially, one CF at a time, which spreads your compaction cost out over time and across column families. It requires a change in app logic, though.
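>> If you want to experiment with (a) and (b) per column family, one way is pycassa's SystemManager. A rough sketch, where the host, keyspace, and CF names are placeholders and the keyword names assume the 0.7/0.8 thrift CfDef attributes, so double-check against your version:
>> 
>>     from pycassa.system_manager import SystemManager
>> 
>>     sm = SystemManager('localhost:9160')        # placeholder host
>>     sm.alter_column_family(
>>         'MyKeyspace', 'MyCF',                   # placeholder names
>>         memtable_operations_in_millions=10,     # raise the ops flush trigger
>>         min_compaction_threshold=6,             # delay minor compactions
>>         max_compaction_threshold=16,            # fewer sstables per cycle
>>     )
>>     sm.close()
>> 
>> The same per-CF attributes can also be changed from cassandra-cli if you would rather not script it.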
>> 
>> -Adi
>> 

