cassandra-commits mailing list archives

From "Ariel Weisberg (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-6809) Compressed Commit Log
Date Fri, 13 Mar 2015 19:25:39 GMT


Ariel Weisberg commented on CASSANDRA-6809:

I don't think it works as a hard limit. Filesystems can hiccup for a long time, and if you
buffer to private memory you avoid seeing those hiccups.

A high watermark isn't great either, because you commit memory that isn't needed most of the
time. Maybe I am not following what you are suggesting.

When we have ponies we will be writing to private memory, probably around 128 megabytes, to
avoid being at the mercy of the filesystem.

Once compression is asynchronous to the filesystem and parallel, the number of buffers can be
small, because compression will tear through fast enough to make the buffers available again.
So you would have memory waiting to drain to the filesystem (128 megabytes) and a small number
of buffers to aggregate log records until they are sent for compression.
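The pipeline described above can be sketched roughly as follows. This is a hypothetical illustration, not Cassandra's actual implementation: the class name, pool sizes, and use of JDK `Deflater` (rather than Cassandra's pluggable compressors) are all assumptions for the sake of the example. Writers fill a small pool of aggregation buffers; a compressor recycles each buffer as soon as its contents are compressed, and only the compressed output waits to drain to the filesystem.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.zip.Deflater;

// Hypothetical sketch: a small pool of aggregation buffers feeding an
// asynchronous compressor, so writers never block on the filesystem.
public class CommitLogBufferSketch {
    static final int BUFFER_SIZE = 64 * 1024;  // ~64K aggregation chunks
    static final int POOL_SIZE = 4;            // small pool, recycled quickly

    final BlockingQueue<ByteBuffer> free = new ArrayBlockingQueue<>(POOL_SIZE);
    final BlockingQueue<ByteBuffer> toCompress = new ArrayBlockingQueue<>(POOL_SIZE);
    final BlockingQueue<byte[]> toSync = new ArrayBlockingQueue<>(1024); // drains to the filesystem

    CommitLogBufferSketch() {
        for (int i = 0; i < POOL_SIZE; i++)
            free.add(ByteBuffer.allocateDirect(BUFFER_SIZE));
    }

    // Writers append records; a filled buffer is handed to the compressor.
    void append(byte[] record) throws InterruptedException {
        ByteBuffer buf = free.take();
        buf.put(record);  // real code would handle overflow and partial fills
        buf.flip();
        toCompress.put(buf);
    }

    // Compressor thread: compress one chunk, recycle its buffer immediately.
    void compressOne() throws InterruptedException {
        ByteBuffer buf = toCompress.take();
        byte[] raw = new byte[buf.remaining()];
        buf.get(raw);
        buf.clear();
        free.put(buf);  // buffer is available to writers again right away

        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        deflater.setInput(raw);
        deflater.finish();
        byte[] out = new byte[raw.length + 64];
        int n = deflater.deflate(out);
        deflater.end();
        byte[] compressed = new byte[n];
        System.arraycopy(out, 0, compressed, 0, n);
        toSync.put(compressed);  // this is the only queue that waits on the filesystem
    }
}
```

The key property is that a slow filesystem only backs up `toSync` (the 128 MB of private memory in the comment), while the aggregation buffers keep cycling between writers and the compressor.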

> Compressed Commit Log
> ---------------------
>                 Key: CASSANDRA-6809
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benedict
>            Assignee: Branimir Lambov
>            Priority: Minor
>              Labels: docs-impacting, performance
>             Fix For: 3.0
>         Attachments: logtest.txt
> It seems an unnecessary oversight that we don't compress the commit log. Doing so should
> improve throughput, but some care will need to be taken to ensure we use as much of a segment
> as possible. I propose decoupling the writing of the records from the segments. Basically
> write into a (queue of) DirectByteBuffer, and have the sync thread compress, say, ~64K chunks
> every X MB written to the CL (where X is ordinarily CLS size), and then pack as many of the
> compressed chunks into a CLS as possible.
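The final packing step quoted above can be illustrated with a minimal sketch. This is an assumption-laden example, not the patch's code: the class name, the length-prefix framing, and the segment size are all invented for illustration. It greedily fits whole compressed chunks into a fixed-size segment and leaves the rest queued for the next one.

```java
import java.nio.ByteBuffer;
import java.util.Queue;

// Hypothetical sketch of the packing step: take compressed chunks from a
// queue and fit as many whole chunks as possible into one fixed-size segment.
public class SegmentPacker {
    // Returns a segment filled with whole chunks; leftover chunks stay queued.
    static ByteBuffer packSegment(Queue<byte[]> compressedChunks, int segmentSize) {
        ByteBuffer segment = ByteBuffer.allocate(segmentSize);
        while (!compressedChunks.isEmpty()) {
            byte[] next = compressedChunks.peek();
            if (next.length + Integer.BYTES > segment.remaining())
                break;  // chunk won't fit; it goes into the next segment
            segment.putInt(next.length);  // length prefix so replay can find chunk boundaries
            segment.put(compressedChunks.poll());
        }
        segment.flip();
        return segment;
    }
}
```

Because chunks are never split across segments, each segment can be decompressed independently during replay, at the cost of a little slack at the end of each segment.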

This message was sent by Atlassian JIRA
