cassandra-user mailing list archives

From: Antoine Bonavita <anto...@stickyads.tv>
Subject: Re: Help diagnosing performance issue
Date: Tue, 17 Nov 2015 18:33:54 GMT
Hello,

Since I have not heard from anybody on the list, I guess I either did
not provide the right kind of information or did not ask the right
question.

A few things I forgot to mention in my previous email:
* I checked the logs without noticing anything out of the ordinary;
memtable flushes occur every few minutes.
* Compaction is configured to allow only one compaction at a time, and
compaction throughput is left at the default (see the commands below
for how I verified this).
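
For reference, this is roughly how I checked those settings (a sketch;
it assumes nodetool is available on each node and that concurrent
compactions are capped via concurrent_compactors in cassandra.yaml):

    # Running and pending compactions on this node
    nodetool compactionstats

    # Effective compaction throughput cap (the default is 16 MB/s)
    nodetool getcompactionthroughput

    # In cassandra.yaml, compactions limited to one at a time:
    # concurrent_compactors: 1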

My question is really: where should I look to investigate further?
I have done a lot of reading and watched DataStax videos over the past
week, and I still don't understand what could explain this behavior.
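
In case it helps anyone point me in the right direction, this is the
kind of output I can capture on one 32G node and one 64G node and post
here (a sketch; cfstats/cfhistograms are the 2.1-era command names, and
views.views is our table):

    # Pending/blocked tasks per thread pool (flushes, reads, writes)
    nodetool tpstats

    # Per-table latencies, sstable count, bloom filter false positives
    nodetool cfstats views.views

    # How many sstables each read touches
    nodetool cfhistograms views views

    # Raw disk utilization, 5-second samples
    iostat -x 5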

Or maybe my expectations are too high. But I was under the impression
that this kind of write-heavy workload is the sweet spot for Cassandra
and that a node should be able to sustain 10K writes per second without
breaking a sweat.

Any help is appreciated, as is any direction on what I should do to get
help.

Thanks,

Antoine.

On 11/16/2015 10:04 AM, Antoine Bonavita wrote:
> Hello,
>
> We have a performance problem when trying to ramp up Cassandra (as a
> MongoDB replacement) on a very specific use case. We store a blob
> indexed by a key and expire it after a few days:
>
> CREATE TABLE views.views (
>      viewkey text PRIMARY KEY,
>      value blob
> ) WITH bloom_filter_fp_chance = 0.01
>      AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
>      AND comment = ''
>      AND compaction = {'max_sstable_age_days': '10', 'class':
> 'org.apache.cassandra.db.compaction.DateTieredCompactionStrategy'}
>      AND compression = {'sstable_compression':
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
>      AND dclocal_read_repair_chance = 0.1
>      AND default_time_to_live = 432000
>      AND gc_grace_seconds = 172800
>      AND max_index_interval = 2048
>      AND memtable_flush_period_in_ms = 0
>      AND min_index_interval = 128
>      AND read_repair_chance = 0.0
>      AND speculative_retry = '99.0PERCENTILE';
>
> Our workload is mostly writes (approx. 96 writes for every 4 reads).
> Each value is about 3 kB. Reads are mostly for "fresh" data (i.e. data
> that was written recently).
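>
> The write path is essentially a single statement (a simplified sketch;
> in reality it is a prepared statement, and the key and blob below are
> illustrative):
>
> -- No explicit TTL is set, so the table's default_time_to_live
> -- (432000 s = 5 days) applies and rows expire on their own.
> INSERT INTO views.views (viewkey, value)
> VALUES ('some-view-key', 0xcafebabe);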
>
> I have a 4-node cluster with spinning disks and a replication factor
> of 3. For some historical reason, two of the machines have 32G of RAM
> and the other two have 64G.
>
> So much for the context.
>
> Now, when I run this cluster at about 600 writes per second per node,
> everything is fine. But when I try to ramp it up (1200 writes per
> second per node), read latencies remain fine on the 64G machines but
> go crazy on the 32G machines. Looking at disk iops, the two are
> clearly correlated:
> * On the 32G machines, read iops go from 200 to 1400.
> * On the 64G machines, read iops go from 10 to 20.
>
> So I thought this was related to memtables being flushed "too early"
> on the 32G machines. I increased memtable_heap_space_in_mb to 4G on
> the 32G machines, but it did not change anything.
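>
> For the record, the only change was this one line in cassandra.yaml on
> the 32G machines, followed by a restart (the setting is only read at
> startup):
>
> # Heap space reserved for memtables; the default is 1/4 of the heap
> memtable_heap_space_in_mb: 4096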
>
> At this point I'm kind of lost and could use any help understanding
> why I'm generating so many read iops on the 32G machines compared to
> the 64G ones, and why it goes crazy (x7) when I merely double the
> load.
>
> Thanks,
>
> A.
>

-- 
Antoine Bonavita (antoine@stickyads.tv) - CTO StickyADS.tv
Tel: +33 6 34 33 47 36/+33 9 50 68 21 32
NEW YORK | LONDON | HAMBURG | PARIS | MONTPELLIER | MILAN | MADRID
