Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@cassandra.apache.org
Date: Fri, 13 Mar 2015 19:25:39 +0000 (UTC)
From: "Ariel Weisberg (JIRA)" <jira@apache.org>
To: commits@cassandra.apache.org
Message-ID: <JIRA.12699251.1394135641000.88977.1426274739549@Atlassian.JIRA>
In-Reply-To: <JIRA.12699251.1394135641000@Atlassian.JIRA>
References: <JIRA.12699251.1394135641000@Atlassian.JIRA>
 <JIRA.12699251.1394135641575@arcas>
Subject: [jira] [Commented] (CASSANDRA-6809) Compressed Commit Log
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/CASSANDRA-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14360969#comment-14360969 ] 

Ariel Weisberg commented on CASSANDRA-6809:
-------------------------------------------

I don't think it works as a hard limit. Filesystems can hiccup for a long time, and if you buffer to private memory you avoid seeing the hiccups.

A high watermark isn't great either because you commit memory that isn't needed most of the time. Maybe I am not following what you are suggesting.

When we have ponies we will be writing to private memory, probably around 128 megabytes, to avoid being at the mercy of the filesystem.

Once compression is parallel and asynchronous to the filesystem, the number of buffers can be small, because compression will tear through them fast enough to make them available again. So you would have memory waiting to drain to the filesystem (128 megabytes) and a small number of buffers that aggregate log records until they are sent for compression.
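
To make that layout concrete, here is a rough sketch, not the actual patch. Everything in it is an assumption for illustration: the class name CompressedLogPipeline, its methods, and the use of java.util.zip.Deflater as a stand-in for whatever compressor ends up being used. It shows a small pool of aggregation buffers that are recycled as soon as compression finishes, plus a byte budget (for example 128 megabytes) on compressed data waiting to drain to the filesystem.

```java
import java.io.ByteArrayOutputStream;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.zip.Deflater;

/** Hypothetical sketch: a small aggregation-buffer pool feeding parallel compression. */
final class CompressedLogPipeline {
    private final BlockingQueue<byte[]> freeBuffers;   // small pool of aggregation buffers
    private final BlockingQueue<byte[]> drainQueue = new LinkedBlockingQueue<>(); // compressed, waiting for the filesystem
    private final Semaphore drainBudget;               // caps bytes held in private memory
    private final ExecutorService compressors;

    CompressedLogPipeline(int bufferCount, int bufferSize, int drainBudgetBytes, int threads) {
        freeBuffers = new ArrayBlockingQueue<>(bufferCount);
        for (int i = 0; i < bufferCount; i++) freeBuffers.add(new byte[bufferSize]);
        drainBudget = new Semaphore(drainBudgetBytes);
        compressors = Executors.newFixedThreadPool(threads);
    }

    /** Blocks only when every aggregation buffer is still being compressed. */
    byte[] acquireBuffer() throws InterruptedException {
        return freeBuffers.take();
    }

    /** Hands a filled buffer to a compressor thread; the buffer is recycled once compressed. */
    void submit(byte[] buffer, int length) {
        compressors.execute(() -> {
            byte[] compressed = deflate(buffer, length);
            freeBuffers.add(buffer);                               // available again right after compression
            drainBudget.acquireUninterruptibly(compressed.length); // stalls here if the filesystem is far behind
            drainQueue.add(compressed);
        });
    }

    /** Called by the (hypothetical) writer thread; a real one would release the budget after the write completes. */
    byte[] takeForWrite(long timeout, TimeUnit unit) throws InterruptedException {
        byte[] chunk = drainQueue.poll(timeout, unit);
        if (chunk != null) drainBudget.release(chunk.length);
        return chunk;
    }

    void shutdown() {
        compressors.shutdown();
    }

    /** Compresses the first {@code length} bytes of {@code input} with Deflater (stand-in compressor). */
    static byte[] deflate(byte[] input, int length) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        try {
            deflater.setInput(input, 0, length);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] tmp = new byte[4096];
            while (!deflater.finished()) {
                out.write(tmp, 0, deflater.deflate(tmp));
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }
}
```

The point of the split is that the aggregation buffers go back to the pool before the compressed output waits on the drain budget, so a slow filesystem only stalls writers once the whole budget is used up.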

> Compressed Commit Log
> ---------------------
>
>                 Key: CASSANDRA-6809
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6809
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benedict
>            Assignee: Branimir Lambov
>            Priority: Minor
>              Labels: docs-impacting, performance
>             Fix For: 3.0
>
>         Attachments: ComitLogStress.java, logtest.txt
>
>
> It seems an unnecessary oversight that we don't compress the commit log. Doing so should improve throughput, but some care will need to be taken to ensure we use as much of a segment as possible. I propose decoupling the writing of the records from the segments. Basically write into a (queue of) DirectByteBuffer, and have the sync thread compress, say, ~64K chunks every X MB written to the CL (where X is ordinarily CLS size), and then pack as many of the compressed chunks into a CLS as possible.
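
For illustration only, the chunk-and-pack idea in the description could look roughly like the sketch below. The names ChunkPacker, compressChunks and pack, the 4-byte length prefix, and the use of java.util.zip.Deflater are all assumptions made for the example, not anything taken from the ticket or the eventual patch.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.Deflater;

/** Hypothetical sketch: compress ~64K chunks, then pack them into fixed-size segments. */
final class ChunkPacker {
    static final int CHUNK_SIZE = 64 * 1024;

    /** Splits the buffered records into chunks of at most 64K and compresses each one. */
    static List<byte[]> compressChunks(ByteBuffer records) {
        List<byte[]> chunks = new ArrayList<>();
        byte[] raw = new byte[CHUNK_SIZE];
        while (records.hasRemaining()) {
            int n = Math.min(raw.length, records.remaining());
            records.get(raw, 0, n);
            chunks.add(deflate(raw, n));
        }
        return chunks;
    }

    /** Packs chunks in order into segments; each chunk is preceded by a 4-byte length (assumed framing). */
    static List<ByteBuffer> pack(List<byte[]> chunks, int segmentSize) {
        List<ByteBuffer> segments = new ArrayList<>();
        ByteBuffer current = ByteBuffer.allocateDirect(segmentSize);
        for (byte[] chunk : chunks) {
            int needed = Integer.BYTES + chunk.length;
            if (needed > segmentSize) throw new IllegalArgumentException("chunk larger than segment");
            if (needed > current.remaining()) {   // segment is full: seal it and start a new one
                current.flip();
                segments.add(current);
                current = ByteBuffer.allocateDirect(segmentSize);
            }
            current.putInt(chunk.length).put(chunk);
        }
        current.flip();
        if (current.hasRemaining()) segments.add(current);
        return segments;
    }

    /** Compresses the first {@code length} bytes of {@code input} with Deflater (stand-in compressor). */
    static byte[] deflate(byte[] input, int length) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        try {
            deflater.setInput(input, 0, length);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] tmp = new byte[4096];
            while (!deflater.finished()) {
                out.write(tmp, 0, deflater.deflate(tmp));
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }
}
```

Chunks are packed strictly in order here, on the assumption that replay needs the original record order; a packer that reorders chunks to fill gaps would use more of each segment but would need extra bookkeeping.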

