cassandra-user mailing list archives

From Alexander Dejanovski <a...@thelastpickle.com>
Subject Re: High IO Util using TimeWindowCompaction
Date Wed, 15 Nov 2017 10:24:44 GMT
Hi Kurt,

It seems highly unlikely that TWCS is responsible for your problems, since
you're throttling compaction way below what i3 instances can provide.
For such instances, we would advise using 8 concurrent compactors with a
high compaction throughput (>200MB/s, if not unthrottled).
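For reference, that would look something like the following in cassandra.yaml (a sketch; the exact values are suggestions, not a prescription, and changing concurrent_compactors in the yaml requires a restart):

```yaml
# cassandra.yaml - sketch for i3-class NVMe instances
concurrent_compactors: 8
# 0 disables throttling entirely; otherwise pick something >= 200 MB/s
compaction_throughput_mb_per_sec: 0
```

Compaction throughput can also be changed live, without a restart, via `nodetool setcompactionthroughput 0`.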

We've had reports and observed some inconsistent I/O behavior with some i3
instances (not much lately, though), so it could be what's biting you.
It would be helpful to provide a bit more info here to troubleshoot this:

   - The output of the following command during one of the 100% util
   phases: iostat -dmx 2 50
   - The output of: nodetool tablehistograms prod_dedupe event_hashes
   - The output of the following command during one of the 100% util
   phases: nodetool compactionstats -H
   - The output of: nodetool tpstats
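If it helps, a small wrapper along these lines (a sketch, assuming nodetool and iostat are on the PATH and using the keyspace/table names from your schema) can capture all four outputs into timestamped files while the spike is happening:

```shell
# Run on an affected node while util sits at 100%.
TS=$(date +%Y%m%d-%H%M%S)
iostat -dmx 2 50 > iostat-$TS.log &   # samples in the background for ~100s
nodetool compactionstats -H                       > compactionstats-$TS.log
nodetool tablehistograms prod_dedupe event_hashes > tablehistograms-$TS.log
nodetool tpstats                                  > tpstats-$TS.log
wait
```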


Since you have very tiny partitions, we would advise lowering or disabling
readahead, though you're not performing reads on that cluster anyway.
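Checking and lowering readahead is a one-liner with blockdev (a sketch; /dev/md0 is a guess for your RAID0 array, so substitute your actual device, and note the setting does not persist across reboots unless you add it to a boot script):

```shell
# Readahead is expressed in 512-byte sectors.
blockdev --getra /dev/md0         # show the current readahead value
sudo blockdev --setra 8 /dev/md0  # 8 sectors = 4KB, matching the table's 4KB compression chunks
```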

It would be good to check how 3.11 with TWCS performs on the same hardware
as the 3.7 cluster (c3.4xl) to narrow down the suspect list. Any chance you
can test this?
Also, which OS are you using on the i3 instances?

Thanks



On Mon, Nov 13, 2017 at 11:51 PM Kurtis Norwood <kurt@amplitude.com> wrote:

> I've been testing out cassandra 3.11 (currently using 3.7) and have been
> observing really high io util occasionally that sometimes results in
> temporary flatlining at 100% io util for an extended period. I think my use
> case is pretty simple and currently only testing part of it on this new
> version so looking for advice on what might be going wrong.
>
> Use Case: I am using cassandra as basically a large "set", my table schema
> is incredibly simple, just a primary key. Records are all written with the
> same TTL (7 days). Only queries are inserting a key (which we expect to
> only happen once) and checking whether that key exists in the table. In my
> 3.7 cluster I am using DateTieredCompaction and running on c3.4xlarge (x30)
> in AWS. I've been experimenting with i3.4xlarge and wanted to also try
> TimeWindowCompaction to see if we could get better performance when adding
> machines to the cluster, that was always a really painful experience in 3.7
> with DateTieredCompaction and the docs say TimeWindowCompaction is ideal
> for my use case.
>
> Right now I am running a new cluster with 3.11 and TimeWindowCompaction
> alongside the old cluster and doing writes to both. Only reads go to the
> old cluster while I go through this preliminary testing. So the 3.11
> cluster receives between 90K to 150K writes/second and no reads. This
> morning for a period of about 30 minutes the cluster was at 100% ioutil and
> eventually recovered from this state. At that time it was only receiving
> ~100K writes/second. I don't see anything interesting in the logs that
> indicate what is going on, and I don't think a sudden compaction is the
> issue since I have limits on compaction throughput.
>
> Staying on 3.7 would be a major bummer so looking for advice.
>
> Some information that might be useful:
>
> compaction throughput - 16MB/s
> concurrent compactors - 4
> machine type - i3.4xlarge (x20)
> disk - RAID0 across 2 NVMe SSDs
>
> Table Schema looks like this:
>
> CREATE TABLE prod_dedupe.event_hashes (
>     app int,
>     hash_value blob,
>     PRIMARY KEY ((app, hash_value))
> ) WITH bloom_filter_fp_chance = 0.01
>     AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>     AND comment = 'For deduping'
>     AND compaction = {'class':
> 'org.apache.cassandra.db.compaction.TimeWindowCompactionStrategy',
> 'compaction_window_size': '4', 'compaction_window_unit': 'HOURS',
> 'max_threshold': '64', 'min_threshold': '4'}
>     AND compression = {'chunk_length_in_kb': '4', 'class':
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
>     AND crc_check_chance = 1.0
>     AND dclocal_read_repair_chance = 0.0
>     AND default_time_to_live = 0
>     AND gc_grace_seconds = 3600
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair_chance = 0.0
>     AND speculative_retry = 'NONE';
>
>
> Thanks,
> Kurt
>
-- 
-----------------
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
