cassandra-commits mailing list archives

From "Benedict (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-6809) Compressed Commit Log
Date Wed, 21 Jan 2015 10:48:36 GMT


Benedict commented on CASSANDRA-6809:

bq. assuming that the sync period is sane (e.g. ~100ms)

The sync period is, by default, 10s, and to my knowledge that is what many users run with, so in general we will only compress each individual segment. This is still sane, since the cluster has redundancy, although a sync period of between 100ms and 500ms might be more suitable for high-traffic nodes. Still, it's probably not a big deal, since we only care about compression when under saturation, which should mean many segments. I only mention it because it is an easy extension. This extension also means the sync thread may have compressed data _waiting_ for it when it runs, reducing the latency until sync completion.
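
For reference, the relevant cassandra.yaml settings (the 10s figure is the stock default in periodic mode):

{noformat}
commitlog_sync: periodic
commitlog_sync_period_in_ms: 10000
{noformat}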

bq. Let me try to rephrase what you are saying to make sure I understand it correctly:


* single sync thread forms sections at regular time intervals and sends them to the compression executor/phase (SPMC queue),
* _sync thread waits on the futures and syncs each in order_ (see the sketch after these lists)

Or, with the extension:

* mutators periodically submit segments to the compressor
* once compressor completes an entire segment, requestExtraSync() is called (instead of in
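
To make the base design concrete, here is a minimal sketch with entirely hypothetical names (this is not the actual CommitLogService code, just the shape of the hand-off):

{code:java}
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of the single-sync-thread design: sections are handed to a
// compression pool (the SPMC hand-off), and the sync thread drains the
// futures in submission order, so data always reaches disk in order.
class SyncPipeline
{
    interface Section
    {
        byte[] compress();
        void writeAndSync(byte[] compressed);
    }

    private final ExecutorService compressors = Executors.newFixedThreadPool(2);
    private final Queue<Future<byte[]>> pending = new ArrayDeque<>();
    private final Queue<Section> inFlight = new ArrayDeque<>();

    // Sync thread, each interval: form a section and submit it for compression.
    void submit(Section section)
    {
        pending.add(compressors.submit(section::compress));
        inFlight.add(section);
    }

    // Still the sync thread: wait on each future in order and sync it.
    // Ordering falls out of the queue; there is no interleaving between
    // sync threads to reason about, because there is only one.
    void syncCompleted() throws Exception
    {
        while (!pending.isEmpty())
        {
            byte[] compressed = pending.poll().get(); // blocks until compressed
            inFlight.poll().writeAndSync(compressed);
        }
    }
}
{code}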

bq. Why is this simpler, or of comparable complexity?

The explanation has two steps instead of five. More importantly, there is no interleaving of events to reason about between sync threads, and "lastSync" is accurate (which is important, since it could otherwise artificially pause writes). This also means future improvements here are easier and safer to deliver, because we don't have to reason about how they interplay with each other. In particular, rolling lastSync over after each segment is synced is a natural improvement (to ensure write latencies don't spike under load) but is challenging to introduce with multiple sync threads. Since we don't expect this feature to be used widely (we expect multiple CL disks to be used instead, if you're bottlenecking), the simpler approach seems more sensible to me.
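
That per-segment roll-over is trivial with a single sync thread; a hypothetical sketch (field and method names invented for illustration):

{code:java}
// With one sync thread, "lastSync" can be advanced after each individual
// segment syncs, releasing any writers blocked on it as early as possible.
// With multiple sync threads this update would need extra coordination
// to remain monotonic and accurate.
class OrderedSyncer
{
    interface Segment
    {
        void sync();                 // write remaining data + fsync
        long maxWrittenTimestamp();  // newest write the segment contains
    }

    private volatile long lastSyncedAt; // consulted by writers deciding whether to block

    void syncInOrder(Iterable<Segment> segments)
    {
        for (Segment segment : segments)
        {
            segment.sync();
            lastSyncedAt = segment.maxWrittenTimestamp(); // per-segment roll-over
        }
    }
}
{code}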

bq. Wouldn't the two extra queues waste resources and increase latency?

We have zero extra queues in the typical case, and one in the uncommon use case. If we introduce enough threads that compression is faster than disk, then there will be near-zero synchronization costs; of course, if that is not the case, and we are still bottlenecking on compression, then we aren't really losing much (a few microseconds every few hundred milliseconds, at 250MB/s compression speed), so it doesn't seem likely to be significant.
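
Rough numbers, assuming the stock 32MB segment size and, say, ~5µs per queue hand-off:

{noformat}
32 MB / (250 MB/s)  ~= 128 ms of compression per segment
5 us  / 128 ms      ~= 0.004% overhead
{noformat}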

With multiple sync threads, on the other hand, we're no longer honouring the sync interval; we are syncing more frequently, which may reduce disk throughput. The exact timing of syncs relative to one another may also vary, likely falling into lock-step under saturation, so that there may be short periods of many competing syncs, potentially yielding pathological disk behaviour and introducing competition for the synchronized blocks inside the segments (in effect an MPMC queue), eliminating those few micros of benefit.

(FTR, the MPMC, SPMC and MPSC aspects are likely not important here. The only concern is thread signalling, but that is the wrong order of magnitude to matter when we are bottlenecked on disk or on compression of large chunks.)

> Compressed Commit Log
> ---------------------
>                 Key: CASSANDRA-6809
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benedict
>            Assignee: Branimir Lambov
>            Priority: Minor
>              Labels: performance
>             Fix For: 3.0
>         Attachments:, logtest.txt
> It seems an unnecessary oversight that we don't compress the commit log. Doing so should
> improve throughput, but some care will need to be taken to ensure we use as much of a segment
> as possible. I propose decoupling the writing of the records from the segments. Basically
> write into a (queue of) DirectByteBuffer, and have the sync thread compress, say, ~64K chunks
> every X MB written to the CL (where X is ordinarily CLS size), and then pack as many of the
> compressed chunks into a CLS as possible.
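
A minimal sketch of that chunking scheme, with hypothetical names (LZ4 via lz4-java is an assumption here, chosen only because Cassandra already ships it):

{code:java}
import java.nio.ByteBuffer;
import net.jpountz.lz4.LZ4Compressor;
import net.jpountz.lz4.LZ4Factory;

// Sketch: drain ~64K chunks from the staging (Direct)ByteBuffer, compress
// each, and pack as many compressed chunks as fit into the current commit
// log segment before moving on to the next one.
class ChunkedCompressor
{
    private static final int CHUNK_SIZE = 64 * 1024;
    private final LZ4Compressor compressor = LZ4Factory.fastestInstance().fastCompressor();

    interface Segment
    {
        int remaining();                       // bytes left in this CLS
        void append(byte[] chunk, int length); // write a compressed chunk
    }

    void flush(ByteBuffer staging, Segment segment)
    {
        byte[] src = new byte[CHUNK_SIZE];
        byte[] dst = new byte[compressor.maxCompressedLength(CHUNK_SIZE)];
        while (staging.hasRemaining())
        {
            int len = Math.min(CHUNK_SIZE, staging.remaining());
            staging.get(src, 0, len); // copy the chunk out of the direct buffer
            int compressedLen = compressor.compress(src, 0, len, dst, 0, dst.length);
            if (segment.remaining() < compressedLen)
                segment = nextSegment(); // current CLS is as full as it can get
            segment.append(dst, compressedLen);
        }
    }

    Segment nextSegment() { throw new UnsupportedOperationException("sketch only"); }
}
{code}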
