cassandra-commits mailing list archives

From "Branimir Lambov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10520) Compressed writer and reader should support non-compressed data.
Date Fri, 24 Feb 2017 12:13:44 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15882551#comment-15882551 ]

Branimir Lambov commented on CASSANDRA-10520:
---------------------------------------------

Patch:
|[code|https://github.com/blambov/cassandra/tree/10520-addendum]|[utest|http://cassci.datastax.com/job/blambov-10520-addendum-testall/]|[dtest|http://cassci.datastax.com/job/blambov-10520-addendum-dtest/]|

Addresses the problem by making the default not to use uncompressed chunks, and by making
sure the flag is not included in the compression-params-as-map when it matches the default.
Unfortunately this means that anyone who wants to use this optimization will have to set the
flag manually (and, by doing so, assert that there will be no need for communication with
pre-V4 nodes), but currently this appears to be the only option.
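
To make the manual opt-in concrete, here is a minimal client-side sketch using the Java
driver. It assumes the flag surfaces as an option in the table's compression parameter map;
the option name {{min_compress_ratio}} and its value are assumptions for illustration, not
something fixed by this patch:

{code:java}
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class EnableUncompressedChunks
{
    public static void main(String[] args)
    {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect())
        {
            // Opting in explicitly; per the comment above, doing so asserts
            // that no pre-V4 nodes will ever need to read or stream the
            // resulting SSTables. The option name is an assumed placeholder.
            session.execute("ALTER TABLE ks.tbl WITH compression = " +
                            "{'class': 'LZ4Compressor', 'chunk_length_in_kb': '64', " +
                            "'min_compress_ratio': '1.1'}");
        }
    }
}
{code}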

The dtest patch above is no longer necessary with this change (it had not yet been committed).

> Compressed writer and reader should support non-compressed data.
> ----------------------------------------------------------------
>
>                 Key: CASSANDRA-10520
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10520
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local Write-Read Paths
>            Reporter: Branimir Lambov
>            Assignee: Branimir Lambov
>              Labels: messaging-service-bump-required
>             Fix For: 4.x
>
>         Attachments: ReadWriteTestCompression.java
>
>
> Compressing incompressible data, as done, for instance, to write SSTables during stress-tests, results in chunks larger than 64k, which are a problem for the buffer pooling mechanisms employed by the {{CompressedRandomAccessReader}}. This results in non-negligible performance issues due to excessive memory allocation.
> To solve this problem and avoid decompression delays in the cases where it does not provide benefits, I think we should allow compressed files to store uncompressed chunks as an alternative to compressed data. Such a chunk could be written after compression returns a buffer larger than, for example, 90% of the input, and would not result in additional delays in writing. On reads it could be recognized by size (using a single global threshold constant in the compression metadata) and data could be transferred directly into the decompressed buffer, skipping the decompression step and ensuring a 64k buffer for compressed data always suffices.
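
For illustration, a minimal sketch of the scheme described above, independent of Cassandra's
actual reader and writer internals (the names, the interface and the 90% threshold are
illustrative only):

{code:java}
import java.nio.ByteBuffer;

/**
 * Sketch of the proposed scheme: a chunk is stored uncompressed whenever
 * compression saves less than ~10%, and is recognized as uncompressed on
 * read purely by its stored size. All names here are assumptions, not
 * Cassandra's actual internals.
 */
public final class MaybeCompressedChunk
{
    // Single global threshold, kept once in the compression metadata:
    // store uncompressed if the compressed size exceeds 90% of the input.
    static final double MAX_COMPRESSED_RATIO = 0.9;

    interface Compressor
    {
        ByteBuffer compress(ByteBuffer input);
        ByteBuffer decompress(ByteBuffer input, int uncompressedLength);
    }

    // Write side: choose the representation to store for one chunk.
    // Keeping the original bytes when compression does not pay off means
    // the stored chunk can never exceed the fixed chunk length (e.g. 64k),
    // and writing incurs no extra delay.
    static ByteBuffer encodeChunk(ByteBuffer input, Compressor compressor)
    {
        ByteBuffer compressed = compressor.compress(input.duplicate());
        if (compressed.remaining() > input.remaining() * MAX_COMPRESSED_RATIO)
            return input;
        return compressed;
    }

    // Read side: a stored chunk larger than the threshold must be
    // uncompressed; its bytes can go straight into the target buffer,
    // skipping decompression entirely.
    static ByteBuffer decodeChunk(ByteBuffer stored, int uncompressedLength, Compressor compressor)
    {
        if (stored.remaining() > (int) (uncompressedLength * MAX_COMPRESSED_RATIO))
            return stored;
        return compressor.decompress(stored, uncompressedLength);
    }
}
{code}

Because an uncompressed chunk is exactly as large as its input, a single size threshold stored
once in the compression metadata is enough to tell the two representations apart, with no
per-chunk flag needed.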



