cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-10520) Compressed writer and reader should support non-compressed data.
Date Mon, 27 Feb 2017 10:09:46 GMT


Sylvain Lebresne commented on CASSANDRA-10520:

bq. should reopen CASSANDRA-11128

We don't re-open issues that have made it into a release, but it's worth opening a followup.

bq.  this means that for upgrades from 3.0/3.x to 4.0 users must ensure this/11128 is fixed.

Yes, and I wonder if avoiding this isn't a good enough reason not to enable this by default,
_at least on existing tables_. More generally, I wonder if we shouldn't default to being
more conservative for existing tables on upgrade. That is, I'd suggest that we default
existing tables to this being disabled (no change from now), but enable it on new tables (basically,
default to false if the table doesn't have the option, but force it to true on new tables
if not provided).
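
A minimal sketch of the defaulting rule suggested above, in Python for brevity. The function and parameter names are hypothetical and are not Cassandra APIs. It assumes the stored option is either an explicit boolean or absent (None).

```python
from typing import Optional


def resolve_uncompressed_chunks_option(stored_value: Optional[bool],
                                       is_new_table: bool) -> bool:
    """Return the effective setting for the (hypothetical) table option.

    stored_value: the value recorded for the table, or None if the
                  option was never provided.
    is_new_table: True when the table is being created, False when an
                  existing table is loaded after an upgrade.
    """
    # An explicit choice by the user always wins.
    if stored_value is not None:
        return stored_value
    # Option absent: stay conservative on existing tables (disabled),
    # but force it on for newly created tables.
    return is_new_table
```

With this rule an upgraded table keeps its current behaviour until someone opts in, while tables created after the upgrade get the feature unless it is explicitly turned off.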

> Compressed writer and reader should support non-compressed data.
> ----------------------------------------------------------------
>                 Key: CASSANDRA-10520
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local Write-Read Paths
>            Reporter: Branimir Lambov
>            Assignee: Branimir Lambov
>              Labels: messaging-service-bump-required
>             Fix For: 4.x
>         Attachments:
> Compressing uncompressible data, as done, for instance, to write SSTables during stress-tests,
results in chunks larger than 64k which are a problem for the buffer pooling mechanisms employed
by the {{CompressedRandomAccessReader}}. This results in non-negligible performance issues
due to excessive memory allocation.
> To solve this problem and avoid decompression delays in the cases where it does not provide
benefits, I think we should allow compressed files to store uncompressed chunks as an alternative
to compressed data. Such a chunk could be written after compression returns a buffer larger
than, for example, 90% of the input, and would not result in additional delays in writing.
On reads it could be recognized by size (using a single global threshold constant in the compression
metadata) and data could be directly transferred into the decompressed buffer, skipping the
decompression step and ensuring a 64k buffer for compressed data always suffices.
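
The scheme described above can be sketched roughly as follows. This is an illustration only, not the patch for this issue. It uses Python's zlib in place of Cassandra's compressors, the names CHUNK_SIZE, MAX_COMPRESSED_LENGTH, encode_chunk and decode_chunk are hypothetical, and it handles only full 64k chunks (a shorter final chunk would need extra care so that it cannot be mistaken for compressed data). The 90% figure is the example threshold from the description.

```python
import zlib

CHUNK_SIZE = 64 * 1024
# Hypothetical global threshold, as it might be recorded once in the
# compression metadata: 90% of the chunk size.
MAX_COMPRESSED_LENGTH = CHUNK_SIZE * 9 // 10


def encode_chunk(chunk: bytes) -> bytes:
    """Return the bytes to store on disk for one full chunk."""
    if len(chunk) != CHUNK_SIZE:
        raise ValueError("this sketch only handles full-size chunks")
    compressed = zlib.compress(chunk)
    # Compression did not pay off: store the raw chunk instead.
    if len(compressed) >= MAX_COMPRESSED_LENGTH:
        return chunk
    return compressed


def decode_chunk(stored: bytes) -> bytes:
    """Recover the original chunk, using only the stored size to decide."""
    # Anything at or above the threshold was stored uncompressed, so it
    # can be copied straight through without a decompression step.
    if len(stored) >= MAX_COMPRESSED_LENGTH:
        return stored
    return zlib.decompress(stored)
```

Because a stored chunk is either below the threshold (compressed) or exactly CHUNK_SIZE (raw), no stored chunk in this sketch exceeds 64k, which is the property the description relies on for buffer pooling.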
