cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-6809) Compressed Commit Log
Date Fri, 23 Jan 2015 13:31:36 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14289242#comment-14289242
] 

Benedict commented on CASSANDRA-6809:
-------------------------------------

bq. Thank you, I did not realise you are interested in parallelism between segments only.

Well, I considered that a natural extension, i.e. a follow-up ticket. One I still consider
reasonably straightforward to add: a mutator thread can partition the commit range once it has
processed ~1MB, and simply append the Callable to a shared queue. The sync thread can then
drain this queue when it decides to initiate a sync.
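
To make the hand-off concrete, here is a minimal sketch of the idea, not code from the patch. The names CommitRangeQueue, Range, written and drain are all hypothetical, the Callable merely returns its range where real work such as compression would go, and mutators are serialised with a plain lock for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch; none of these names come from the Cassandra code base.
final class CommitRangeQueue {
    static final int PARTITION_BYTES = 1 << 20; // ~1MB, as suggested in the comment

    /** A contiguous range of the commit buffer, [start, end). */
    static final class Range {
        final long start, end;
        Range(long start, long end) { this.start = start; this.end = end; }
    }

    // Shared queue: mutators append, the sync thread drains.
    private final ConcurrentLinkedQueue<Callable<Range>> pending = new ConcurrentLinkedQueue<>();
    private long rangeStart = 0; // start of the range not yet handed off
    private long position = 0;   // total bytes written so far

    /** Called by a mutator after it has written `bytes`; a lock keeps the sketch simple. */
    synchronized void written(int bytes) {
        position += bytes;
        if (position - rangeStart >= PARTITION_BYTES) {
            final Range r = new Range(rangeStart, position);
            rangeStart = position;
            pending.add(() -> r); // placeholder task; real work would happen in call()
        }
    }

    /** Called by the sync thread when it decides to initiate a sync. */
    List<Range> drain() {
        List<Range> done = new ArrayList<>();
        Callable<Range> task;
        while ((task = pending.poll()) != null) {
            try {
                done.add(task.call());
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }
        return done;
    }
}
```

In this sketch a partial range smaller than PARTITION_BYTES stays with the mutators until it fills up; a real sync would presumably also have to flush that tail.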

bq. I can see that this should work well enough with large sync periods, including the 10s
default.

I'm reasonably confident this will work as well or better for all sync periods. In particular
it better guarantees honouring the sync periods, and is less likely to encourage random write
behaviour. Of course, the main benefit is its simplicity.

> Compressed Commit Log
> ---------------------
>
>                 Key: CASSANDRA-6809
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6809
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benedict
>            Assignee: Branimir Lambov
>            Priority: Minor
>              Labels: performance
>             Fix For: 3.0
>
>         Attachments: ComitLogStress.java, logtest.txt
>
>
> It seems an unnecessary oversight that we don't compress the commit log. Doing so should
improve throughput, but some care will need to be taken to ensure we use as much of a segment
as possible. I propose decoupling the writing of the records from the segments. Basically
write into a (queue of) DirectByteBuffer, and have the sync thread compress, say, ~64K chunks
every X MB written to the CL (where X is ordinarily CLS size), and then pack as many of the
compressed chunks into a CLS as possible.
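
The proposal above could be sketched roughly as follows. This is an illustration under stated assumptions, not the eventual implementation: ChunkPacker, compress and pack are hypothetical names, java.util.zip.Deflater stands in for whatever compressor would actually be chosen, and a plain byte array replaces the queue of DirectByteBuffers.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.Deflater;

// Hypothetical sketch; names and the choice of Deflater are assumptions.
final class ChunkPacker {
    static final int CHUNK_BYTES = 64 * 1024; // ~64K chunks, per the description

    /** Compresses src[off, off + len) into a new array. */
    static byte[] compress(byte[] src, int off, int len) {
        Deflater deflater = new Deflater();
        deflater.setInput(src, off, len);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /**
     * Splits data into chunks, compresses each, and packs as many compressed
     * chunks as fit into each segment of at most segmentBytes. A single chunk
     * larger than segmentBytes is placed alone in its own segment.
     */
    static List<List<byte[]>> pack(byte[] data, int segmentBytes) {
        List<List<byte[]>> segments = new ArrayList<>();
        List<byte[]> current = new ArrayList<>();
        int used = 0;
        for (int off = 0; off < data.length; off += CHUNK_BYTES) {
            byte[] chunk = compress(data, off, Math.min(CHUNK_BYTES, data.length - off));
            if (used + chunk.length > segmentBytes && !current.isEmpty()) {
                segments.add(current); // segment is full, start a new one
                current = new ArrayList<>();
                used = 0;
            }
            current.add(chunk);
            used += chunk.length;
        }
        if (!current.isEmpty()) segments.add(current);
        return segments;
    }
}
```

Packing by compressed size rather than written size is what lets a segment hold more than one segment's worth of uncompressed records.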



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
