cassandra-commits mailing list archives

From "Branimir Lambov (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-6809) Compressed Commit Log
Date Fri, 13 Mar 2015 13:39:39 GMT


Branimir Lambov commented on CASSANDRA-6809:

Rebased and updated [here|],
now using ByteBuffers for compression. Some fixes to the {{DeflateCompressor}} implementation
are included.
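For illustration only, here is a minimal sketch of compressing a ByteBuffer's contents with {{java.util.zip.Deflater}}; this is not Cassandra's actual {{DeflateCompressor}}, the class and helper names are invented, and the array round-trip is a simplification (a real implementation would deflate between buffers directly):

```java
import java.nio.ByteBuffer;
import java.util.zip.Deflater;

public class DeflateBufferSketch {
    // Compress the remaining bytes of `input` and return them wrapped in a
    // new ByteBuffer.
    static ByteBuffer compress(ByteBuffer input) {
        byte[] src = new byte[input.remaining()];
        input.get(src);
        Deflater deflater = new Deflater();
        deflater.setInput(src);
        deflater.finish();
        // Slack for the deflate header/footer; enough for this example's
        // highly compressible input, not a general worst-case bound.
        byte[] dst = new byte[src.length + 64];
        int len = deflater.deflate(dst);
        deflater.end();
        return ByteBuffer.wrap(dst, 0, len);
    }

    public static void main(String[] args) {
        ByteBuffer out = compress(ByteBuffer.wrap(new byte[4096])); // zeros
        System.out.println(out.remaining() + " compressed bytes from 4096");
    }
}
```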

bq. If archiving fails it appears to delete the segment now. Is that the right thing to do?
It deletes only if archival was successful ({{deleteFile = archiveSuccess}}). The old code
did the same thing, just more confusingly ({{deleteFile = !archiveSuccess ? false : true}}).

bq. CLSM's understanding of segment size is skewed because compressed segments are in reality
smaller than the expected segment size. With real compression ratios it's going to be off by
30-50%. It would be nice if its tracking could be corrected once the size is known.

Unfortunately that's not trivial. Measurement must happen when the file is done writing, which
is a point CLSM doesn't currently have access to; moreover, it could be triggered from either
{{sync()}} or {{close()}} on the segment, and I don't want to fold the risk of not getting all
those updates right into this patch. I've changed the description of the parameter to reflect
what it currently measures.

This can and should be fixed soon after, though, and I'll open an issue as soon as this one is
committed.

bq. For the buffer pooling, I would be tempted not to wait for the collector to get to the
DBB. If the DBB is promoted due to compaction or some other allocation hog, it may not be reclaimed
for some time. In {{CompressedSegment.close}}, maybe null the field, then invoke the cleaner on
the buffer. There is a utility method for doing that so you don't have to access the interface
directly (it generates a compiler warning).
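The pattern suggested here, null the field and then invoke the cleaner on the detached buffer, can be sketched roughly as follows. The reflective {{clean}} helper stands in for Cassandra's utility method, and the class and field names are illustrative, not the actual {{CompressedSegment}} code:

```java
import java.lang.reflect.Method;
import java.nio.ByteBuffer;

public class CleanerSketch {
    // Illustrative stand-in for the pooled segment's buffer field.
    ByteBuffer buffer = ByteBuffer.allocateDirect(1 << 20);

    // Best-effort eager release of a direct buffer's native memory, so it is
    // not held hostage to when the collector gets around to the DBB.
    // Reflection avoids compiling against sun.nio.ch.DirectBuffer; if the JDK
    // refuses access, we silently fall back to normal GC reclamation.
    static void clean(ByteBuffer buf) {
        if (buf == null || !buf.isDirect())
            return;
        try {
            Method cleanerMethod = buf.getClass().getMethod("cleaner");
            cleanerMethod.setAccessible(true);
            Object cleaner = cleanerMethod.invoke(buf);
            if (cleaner != null)
                cleaner.getClass().getMethod("clean").invoke(cleaner);
        } catch (Exception e) {
            // leave it to the garbage collector
        }
    }

    // The close() pattern from the review: null the field first,
    // then invoke the cleaner on the buffer.
    void close() {
        ByteBuffer b = buffer;
        buffer = null;
        clean(b);
    }

    public static void main(String[] args) {
        CleanerSketch segment = new CleanerSketch();
        segment.close();
        System.out.println("buffer released: " + (segment.buffer == null));
    }
}
```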


bq. Also make {{MAX_BUFFERPOOL_SIZE}} configurable via a property. I have been prefixing internal
C* properties with "cassandra.". I suspect that at several hundred megabytes a second we will
have more than three 32-megabyte buffers in flight. I have a personal fear of shipping constants
that aren't quite right, and putting them all in properties can save waiting for code changes.

Good catch: the number of segments needed can depend on the sync period, so this does need
to be exposed. Done, as a non-published option in cassandra.yaml for now.
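The reviewer's suggestion, a "cassandra."-prefixed system property with a compiled-in default, might look like the sketch below. The property name is hypothetical; as noted above, the patch actually exposes the setting through cassandra.yaml instead:

```java
public class BufferPoolConfigSketch {
    // Hypothetical property name following the "cassandra." prefix convention;
    // the posted patch uses a non-published cassandra.yaml option instead.
    static final int MAX_BUFFERPOOL_SIZE = Integer.getInteger(
            "cassandra.max_commitlog_bufferpool_size",
            3 * 32 * 1024 * 1024); // compiled-in default: three 32 MB buffers

    public static void main(String[] args) {
        // Overridable at startup with
        // -Dcassandra.max_commitlog_bufferpool_size=<bytes>
        System.out.println("buffer pool cap: " + MAX_BUFFERPOOL_SIZE + " bytes");
    }
}
```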

bq. I tested on Linux. If I drop the page cache on the new code it doesn't generate reads.
I tested the old code and it generated a few hundred megabytes of reads.

Thank you.

> Compressed Commit Log
> ---------------------
>                 Key: CASSANDRA-6809
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benedict
>            Assignee: Branimir Lambov
>            Priority: Minor
>              Labels: docs-impacting, performance
>             Fix For: 3.0
>         Attachments:, logtest.txt
> It seems an unnecessary oversight that we don't compress the commit log. Doing so should
> improve throughput, but some care will need to be taken to ensure we use as much of a segment
> as possible. I propose decoupling the writing of the records from the segments. Basically
> write into a (queue of) DirectByteBuffer, and have the sync thread compress, say, ~64K chunks
> every X MB written to the CL (where X is ordinarily CLS size), and then pack as many of the
> compressed chunks into a CLS as possible.
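The scheme proposed in the description, compress ~64K chunks and pack as many as fit into a segment, can be sketched as below. Chunk and segment sizes are illustrative only, and the class and method names are invented for the example:

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;
import java.nio.ByteBuffer;

// Minimal sketch: slice a staged write buffer into ~64K chunks, compress
// each, and pack compressed chunks into a segment until it is full.
public class ChunkedSegmentSketch {
    static final int CHUNK_SIZE = 64 * 1024;

    // Deflate one chunk of the staged writes into a fresh byte array.
    static byte[] compressChunk(byte[] staged, int off, int len) {
        Deflater deflater = new Deflater();
        deflater.setInput(staged, off, len);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] tmp = new byte[8192];
        while (!deflater.finished())
            out.write(tmp, 0, deflater.deflate(tmp));
        deflater.end();
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] staged = new byte[256 * 1024];               // pending CL writes (zeros)
        ByteBuffer segment = ByteBuffer.allocate(32 * 1024); // one (tiny) segment
        int packed = 0;
        for (int off = 0; off < staged.length; off += CHUNK_SIZE) {
            byte[] compressed =
                    compressChunk(staged, off, Math.min(CHUNK_SIZE, staged.length - off));
            if (compressed.length > segment.remaining())
                break; // segment full: a real implementation starts a new one
            segment.put(compressed);
            packed++;
        }
        System.out.println(packed + " chunks packed, " + segment.position() + " bytes used");
    }
}
```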

This message was sent by Atlassian JIRA
