cassandra-commits mailing list archives

From Björn Hegerfors (JIRA) <>
Subject [jira] [Commented] (CASSANDRA-8371) DateTieredCompactionStrategy is always compacting
Date Tue, 25 Nov 2014 11:01:13 GMT


Björn Hegerfors commented on CASSANDRA-8371:

Could you try with a lower base_time_seconds? I have a feeling that I set the default too
high at 1 hour. That's a setting that I didn't actually benchmark in my testing; I just
set it rather arbitrarily. You also need to make sure that timestamp_resolution is set correctly
(it should match the unit of your write timestamps).

You should think of base_time_seconds as DTCS's equivalent of min_sstable_size in STCS. min_sstable_size
is 50 MB by default, so you probably want to set base_time_seconds to however long it takes
you to write 50 MB, on average. I suspect that will be a lot less than 1 hour. You could also
try STCS with min_sstable_size set to the amount that you write in an hour, to see whether it
compacts just as much.
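
To make the rule of thumb above concrete, here is a rough sketch of the arithmetic in Python. The function names and the 0.5 MB/s write rate are hypothetical, picked only for illustration; the 50 MB target is the STCS min_sstable_size default mentioned above, and the option names are the DTCS compaction options discussed in this comment. Treat it as a starting point for experimentation, not a recommended setting.

```python
import math


def base_time_seconds_for(write_mb_per_s: float, target_mb: float = 50.0) -> int:
    """Seconds needed to write target_mb at the given average rate, rounded up.

    target_mb defaults to 50 MB, the STCS min_sstable_size default.
    """
    if write_mb_per_s <= 0:
        raise ValueError("write rate must be positive")
    return max(1, math.ceil(target_mb / write_mb_per_s))


def alter_table_cql(table: str, base_time_seconds: int,
                    timestamp_resolution: str = "MICROSECONDS") -> str:
    """Build an ALTER TABLE statement that sets the two DTCS options."""
    return (
        f"ALTER TABLE {table} WITH compaction = {{"
        f"'class': 'DateTieredCompactionStrategy', "
        f"'base_time_seconds': '{base_time_seconds}', "
        f"'timestamp_resolution': '{timestamp_resolution}'}};"
    )


# Assumed average write rate of 0.5 MB/s (hypothetical, measure your own).
seconds = base_time_seconds_for(0.5)
print(seconds)
print(alter_table_cql("search", seconds))
```

With the assumed 0.5 MB/s, 50 MB takes 100 seconds to write, far below the 1 hour default. The generated statement targets the `search` table from the issue description; check timestamp_resolution against the unit your clients actually use for write timestamps before applying anything like it.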

If that's the cause here, I think that we should consider lowering the default value of base_time_seconds.
Having it too low is better than too high. Does anyone have an estimate of a common (on the
high end) MB/s write throughput for time series?

> DateTieredCompactionStrategy is always compacting 
> --------------------------------------------------
>                 Key: CASSANDRA-8371
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: mck
>            Assignee: Björn Hegerfors
>              Labels: compaction, performance
>         Attachments: java_gc_counts_rate-month.png, read-latency.png, sstables.png, vg2_iad-month.png
> Running 2.0.11 and having switched a table to [DTCS|]
> we've seen that disk IO and gc count increase, along with the number of reads happening in
> the "compaction" hump of cfhistograms.
> Data, and generally performance, looks good, but compactions are always happening, and
> pending compactions are building up.
> The schema for this is 
> {code}CREATE TABLE search (
>   loginid text,
>   searchid timeuuid,
>   description text,
>   searchkey text,
>   searchurl text,
>   PRIMARY KEY ((loginid), searchid)
> );{code}
> We're sitting on about 82G (per replica) across 6 nodes in 4 DCs.
> CQL executed against this keyspace, and traffic patterns, can be seen in slides 7+8 of
> Attached are sstables-per-read and read-latency graphs from cfhistograms, and screenshots
> of our munin graphs as we have gone from STCS, to LCS (week ~44), to DTCS (week ~46).
> These screenshots are also found in the prezi on slides 9-11.
> [~pmcfadin], [~Bj0rn], 
> Can this be a consequence of occasional deleted rows, as is described under (3) in the
> description of CASSANDRA-6602?
