cassandra-commits mailing list archives

From "Ariel Weisberg (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8729) Commitlog causes read before write when overwriting
Date Wed, 04 Feb 2015 20:51:35 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305922#comment-14305922 ]

Ariel Weisberg commented on CASSANDRA-8729:
-------------------------------------------

Jira ate my response to this twice so far so I will be super brief.

The linked ticket is a different use case (cacheable random reads?) and not bulk append.

I put together a quick benchmark. Code: http://pastebin.com/TFstk2uA
Tested on Windows 8.1 with a Samsung 840 EVO 250 GB SSD.
{noformat}
Testing with sync at end
Channel took 5575
Preallocated Channel took 7445
Mapped took 8517
Preallocated Mapped took 7859
Testing with periodic syncing
Channel took 6795
Preallocated Channel took 8728
Mapped took 9991
Preallocated Mapped took 10123
{noformat}

In these results there is no scenario where memory-mapped IO is faster than channel writes at bulk appending.
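
For readers without access to the pastebin link, here is a minimal sketch of the kind of comparison described above. It is not the benchmark that produced the numbers: the class name `AppendBench`, the method names, the sizes, and the millisecond timing are all assumptions for illustration, and it covers only the "sync at end" case without the preallocated variants.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical names throughout; this is not the pastebin benchmark.
// Both methods assume totalBytes is a positive multiple of chunk.
final class AppendBench {

    /** Appends via FileChannel.write, syncing once at the end. Returns elapsed ms. */
    static long appendWithChannel(Path file, long totalBytes, int chunk) throws IOException {
        ByteBuffer buf = ByteBuffer.allocateDirect(chunk);
        long start = System.nanoTime();
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            for (long written = 0; written < totalBytes; written += chunk) {
                buf.clear(); // reset position and limit; the zeroed contents are reused
                while (buf.hasRemaining()) {
                    ch.write(buf);
                }
            }
            ch.force(false); // sync data (not metadata) at the end
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    /** Appends by copying into a memory-mapped region, syncing once at the end. Returns elapsed ms. */
    static long appendWithMapping(Path file, long totalBytes, int chunk) throws IOException {
        byte[] data = new byte[chunk];
        long start = System.nanoTime();
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            // A single mapping is limited to Integer.MAX_VALUE bytes.
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, totalBytes);
            for (long written = 0; written < totalBytes; written += chunk) {
                map.put(data);
            }
            map.force(); // flush dirty pages to the device
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws IOException {
        long total = 64L << 20; // 64 MiB, an arbitrary size for illustration
        int chunk = 64 << 10;   // 64 KiB per append
        Path a = Files.createTempFile("channel", ".log");
        Path b = Files.createTempFile("mapped", ".log");
        a.toFile().deleteOnExit();
        b.toFile().deleteOnExit();
        System.out.println("Channel took " + appendWithChannel(a, total, chunk) + " ms");
        System.out.println("Mapped took " + appendWithMapping(b, total, chunk) + " ms");
    }
}
```

Timings from a sketch like this depend heavily on the OS, file system, and device, so treat any single run as indicative only.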

> Commitlog causes read before write when overwriting
> ---------------------------------------------------
>
>                 Key: CASSANDRA-8729
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8729
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Ariel Weisberg
>
> The memory mapped commit log implementation writes directly to the page cache. If a page
> is not in the cache the kernel will read it in even though we are going to overwrite.
> The way to avoid this is to write to private memory, and then pad the write with 0s at
> the end so it is page (4k) aligned before writing to a file.
> The commit log would benefit from being refactored into something that looks more like
> a pipeline with incoming requests receiving private memory to write in, completed buffers
> being submitted to a parallelized compression/checksum step, followed by submission to another
> thread for writing to a file that preserves the order.
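
The padding step in the description above can be sketched as follows. This is only an illustration under stated assumptions: the class `PagePadding`, its methods, and the fixed 4096-byte page size are hypothetical and not taken from the Cassandra code base.

```java
import java.nio.ByteBuffer;

// Hypothetical helper; assumes a 4 KiB page as mentioned in the ticket.
final class PagePadding {
    static final int PAGE_SIZE = 4096;

    /** Rounds length up to the next multiple of PAGE_SIZE. */
    static int paddedLength(int length) {
        return (length + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
    }

    /**
     * Copies the remaining bytes of payload into a private buffer whose size
     * is page aligned. ByteBuffer.allocate zero-fills, so the tail is 0s.
     */
    static ByteBuffer padToPage(ByteBuffer payload) {
        ByteBuffer out = ByteBuffer.allocate(paddedLength(payload.remaining()));
        out.put(payload.duplicate()); // duplicate() leaves the caller's position untouched
        out.clear();                  // position = 0, limit = capacity; data is kept
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer padded = padToPage(ByteBuffer.wrap(new byte[] {1, 2, 3}));
        System.out.println(padded.remaining()); // whole buffer is ready to hand to a channel write
    }
}
```

The padded buffer can then be passed to a single channel write, so the file only ever receives whole pages.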



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
