cassandra-user mailing list archives

From Antoine Bonavita <anto...@stickyads.tv>
Subject Help diagnosing performance issue
Date Mon, 16 Nov 2015 09:04:16 GMT
Hello,

We have a performance problem when trying to ramp up Cassandra (as a
MongoDB replacement) on a very specific use case. We store a blob indexed
by a key and expire it after a few days:

CREATE TABLE views.views (
     viewkey text PRIMARY KEY,
     value blob
) WITH bloom_filter_fp_chance = 0.01
     AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
     AND comment = ''
     AND compaction = {'max_sstable_age_days': '10', 'class': 
'org.apache.cassandra.db.compaction.DateTieredCompactionStrategy'}
     AND compression = {'sstable_compression': 
'org.apache.cassandra.io.compress.LZ4Compressor'}
     AND dclocal_read_repair_chance = 0.1
     AND default_time_to_live = 432000
     AND gc_grace_seconds = 172800
     AND max_index_interval = 2048
     AND memtable_flush_period_in_ms = 0
     AND min_index_interval = 128
     AND read_repair_chance = 0.0
     AND speculative_retry = '99.0PERCENTILE';

Our workload is mostly writes (approx. 96 writes for every 4 reads). Each
value is about 3 kB. Reads are mostly for "fresh" data (i.e. data that was
written recently).
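
For reference, the writes are plain single-row inserts against the schema
above; the key and blob value below are made up for illustration. Expiry
comes from the table's default_time_to_live (432000 s = 5 days), so no
explicit TTL is set per query:

```cql
-- Illustrative write (hypothetical key/value), relying on the table's
-- default_time_to_live of 432000 seconds for expiry:
INSERT INTO views.views (viewkey, value)
VALUES ('some-view-id', 0x0123abcd);
```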

I have a 4-node cluster with spinning disks and a replication factor of
3. For historical reasons, 2 of the machines have 32G of RAM and the
other 2 have 64G.

So much for the context.

Now, when I run this cluster at about 600 writes per second per node,
everything is fine, but when I ramp it up (1200 writes per second per
node), read latencies stay fine on the 64G machines but go crazy on the
32G machines. Looking at disk iops, the correlation is clear:
* On 32G machines, read iops go from 200 to 1400.
* On 64G machines, read iops go from 10 to 20.

So I thought this was related to the memtables being flushed "too early"
on the 32G machines. I increased memtable_heap_space_in_mb to 4G on the
32G machines, but it did not change anything.
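
Concretely, the change was made in cassandra.yaml on the 32G nodes; this is
a sketch of the relevant fragment, not the full file, and the
memtable_allocation_type line is the default, shown only for context:

```yaml
# cassandra.yaml fragment (32G nodes) -- illustrative sketch.
# Default is 1/4 of the heap; raised here to 4 GB:
memtable_heap_space_in_mb: 4096
# Default allocation mode, unchanged:
memtable_allocation_type: heap_buffers
```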

At this point I'm kind of lost and could use any help in understanding
why I'm generating so many read iops on the 32G machines compared to the
64G ones, and why it goes crazy (x7) when I merely double the load.

Thanks,

A.

-- 
Antoine Bonavita (antoine@stickyads.tv) - CTO StickyADS.tv
Tel: +33 6 34 33 47 36/+33 9 50 68 21 32
NEW YORK | LONDON | HAMBURG | PARIS | MONTPELLIER | MILAN | MADRID
