cassandra-commits mailing list archives

From Björn Hegerfors (JIRA) <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-8371) DateTieredCompactionStrategy is always compacting
Date Wed, 10 Dec 2014 16:14:14 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14241301#comment-14241301 ]

Björn Hegerfors edited comment on CASSANDRA-8371 at 12/10/14 4:14 PM:
----------------------------------------------------------------------

[~jshook] I don't understand what you're saying about "ideal scheduling". There might be some
confusion here, as Marcus's blog post about DTCS draws a simplified picture of how DTCS works.
In his picture, the rightmost vertical line represents "now". And while "now" certainly moves
forward, the other vertical lines, denoting window borders, should not actually move with
it. That's where his description is wrong (I just told him about it). Rather, these window
borders are perfectly static, and the passage of time instead unveils new time windows. The
newest window (which "now" actually lies _inside_ of) is always base_time_seconds in size.
Then windows are merged with each other at certain points in time. This is an instantaneous
thing. Specifically, min_threshold windows of the same size are merged into one window at
exactly the moment when yet another window of that same size is created. Say that min_threshold=4
and base_time_seconds=60 (1 minute). Let's say that the last 4 windows are all 1-minute windows
(they certainly don't have to be, there can be anywhere between 1 and 4 same-sized windows).
At the turn of the next minute, a new 1-minute window is created, and the previous ones are
from that moment considered to be one 4-minute window (there is no moment when there are
five 1-minute windows).
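To make the window arithmetic above concrete, here is a minimal Python sketch of how the static window borders could be enumerated, newest first, for a given "now". It loosely mirrors the Target/nextTarget walk in DateTieredCompactionStrategy but is an illustration, not the actual Cassandra code; the function name and signature are my own.

```python
def dtcs_windows(now, base=60, min_threshold=4, count=10):
    """Enumerate DTCS time windows from newest to oldest as (start, size) pairs.

    Window borders are static: a window of a given size always starts at a
    multiple of that size. Walking left from "now", there are between 1 and
    min_threshold windows of each size; when a min_threshold-th window of a
    size would appear, the group is instead treated as one larger window.
    Illustrative sketch only, not Cassandra's implementation.
    """
    size = base
    div = now // base              # newest window: [div*size, (div+1)*size)
    windows = []
    for _ in range(count):
        windows.append((div * size, size))
        if div % min_threshold != 0:
            div -= 1               # another window of the same size exists
        else:
            # the min_threshold same-sized windows to the left have merged
            size *= min_threshold
            div = div // min_threshold - 1
    return windows
```

With base=60 and min_threshold=4, at now=1190 this yields four 1-minute windows followed by 4-minute windows; one minute earlier, at now=1000, only a single 1-minute window exists because the previous four have just merged, matching the "instantaneous merge" described above.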

The windows are the ideal SSTable placements for DTCS. The idea is that each window contains
exactly one SSTable, spanning the whole time window. In practice, this is also very nearly
what happens, except that the compaction triggered by windows merging is not instantaneous.
There are some quirks that let more than one SSTable live in one time window. CASSANDRA-8360
wants to address that. CASSANDRA-8361 takes it one step further.
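The "one SSTable per window" ideal can be checked with a small sketch: bucket SSTables (represented here just by their minimum timestamps) into the windows, and report any window holding more than one. Windows with multiple SSTables are roughly the compaction candidates; this is a simplified model of my own, not the strategy's real selection logic, which also consults min_threshold and SSTable metadata.

```python
def overfull_windows(windows, sstable_min_timestamps):
    """Given DTCS windows as (start, size) pairs and SSTables represented by
    their min timestamps, return the windows containing more than one SSTable,
    i.e. where the "one SSTable per window" ideal is violated.
    Illustrative sketch only.
    """
    buckets = {}
    for ts in sstable_min_timestamps:
        for start, size in windows:
            if start <= ts < start + size:
                buckets.setdefault(start, []).append(ts)
                break
    return {start: tss for start, tss in buckets.items() if len(tss) > 1}
```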

It's true that repairs can put data into old windows at later points. Read repairs don't
mix too well with DTCS for that reason, but anti-entropy repair costs so much that an extra
compaction at the end makes little difference. I think incremental repair should mix nicely
with DTCS, but I don't know much about it.

Sorry if you already knew all of this, but in that case, what is your definition of "perfect
scheduling"?



> DateTieredCompactionStrategy is always compacting 
> --------------------------------------------------
>
>                 Key: CASSANDRA-8371
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8371
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: mck
>            Assignee: Björn Hegerfors
>              Labels: compaction, performance
>         Attachments: java_gc_counts_rate-month.png, read-latency-recommenders-adview.png,
read-latency.png, sstables-recommenders-adviews.png, sstables.png, vg2_iad-month.png
>
>
> Running 2.0.11 and having switched a table to [DTCS|https://issues.apache.org/jira/browse/CASSANDRA-6602]
we've seen that disk IO and gc count increase, along with the number of reads happening in
the "compaction" hump of cfhistograms.
> Data, and generally performance, looks good, but compactions are always happening, and
pending compactions are building up.
> The schema for this is 
> {code}CREATE TABLE search (
>   loginid text,
>   searchid timeuuid,
>   description text,
>   searchkey text,
>   searchurl text,
>   PRIMARY KEY ((loginid), searchid)
> );{code}
> We're sitting on about 82G (per replica) across 6 nodes in 4 DCs.
> CQL executed against this keyspace, and traffic patterns, can be seen in slides 7+8 of
https://prezi.com/b9-aj6p2esft/
> Attached are sstables-per-read and read-latency graphs from cfhistograms, and screenshots
of our munin graphs as we have gone from STCS, to LCS (week ~44), to DTCS (week ~46).
> These screenshots are also found in the prezi on slides 9-11.
> [~pmcfadin], [~Bj0rn], 
> Can this be a consequence of occasional deleted rows, as is described under (3) in the
description of CASSANDRA-6602 ?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
