cassandra-commits mailing list archives

From "Ariel Weisberg (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-13241) Lower default chunk_length_in_kb from 64kb to 4kb
Date Thu, 18 Oct 2018 23:27:00 GMT


Ariel Weisberg commented on CASSANDRA-13241:

For those who were asking about the performance impact of block size on compression I wrote
a microbenchmark.

     [java] Benchmark                                               Mode  Cnt          Score          Error  Units
     [java] CompactIntegerSequenceBench.benchCompressLZ4Fast16k    thrpt   15  331190055.685 ±  8079758.044  ops/s
     [java] CompactIntegerSequenceBench.benchCompressLZ4Fast32k    thrpt   15  353024925.655 ±  7980400.003  ops/s
     [java] CompactIntegerSequenceBench.benchCompressLZ4Fast64k    thrpt   15  365664477.654 ± 10083336.038  ops/s
     [java] CompactIntegerSequenceBench.benchCompressLZ4Fast8k     thrpt   15  305518114.172 ± 11043705.883  ops/s
     [java] CompactIntegerSequenceBench.benchDecompressLZ4Fast16k  thrpt   15  688369529.911 ± 25620873.933  ops/s
     [java] CompactIntegerSequenceBench.benchDecompressLZ4Fast32k  thrpt   15  703635848.895 ±  5296941.704  ops/s
     [java] CompactIntegerSequenceBench.benchDecompressLZ4Fast64k  thrpt   15  695537044.676 ± 17400763.731  ops/s
     [java] CompactIntegerSequenceBench.benchDecompressLZ4Fast8k   thrpt   15  727725713.128 ±  4252436.331  ops/s

To summarize, compression is 8.5% slower and decompression is 1% faster. This measures only
the cost of compression and decompression themselves, not the much larger benefit we would
get from less often decompressing data we don't need.
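To put a rough number on that second effect, here is a minimal sketch of read amplification, i.e. how many bytes get decompressed for each byte a read actually wants. It assumes the wanted range sits entirely inside one chunk. The 200-byte row size and the function name are hypothetical, chosen only for illustration, and are not taken from the benchmark.

```python
def read_amplification(chunk_size_bytes: int, wanted_bytes: int) -> float:
    """Bytes decompressed per byte wanted.

    Assumption: the wanted range lies entirely inside a single chunk,
    so exactly one whole chunk has to be decompressed to serve it.
    """
    if wanted_bytes <= 0 or wanted_bytes > chunk_size_bytes:
        raise ValueError("wanted_bytes must be in 1..chunk_size_bytes")
    return chunk_size_bytes / wanted_bytes


ROW_BYTES = 200  # hypothetical row size, not a measured value

for kb in (4, 8, 16, 32, 64):
    factor = read_amplification(kb * 1024, ROW_BYTES)
    print(f"{kb:>2} KB chunk: {factor:.1f}x bytes decompressed per byte wanted")
```

Under this simple model the amplification scales linearly with chunk size, so going from 64 KB to 16 KB chunks cuts the decompressed bytes per small read by a factor of four.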

I didn't test decompression of Snappy and LZ4 high, but I did test compression.

     [java] CompactIntegerSequenceBench.benchCompressSnappy16k   thrpt    2  196574766.116
     [java] CompactIntegerSequenceBench.benchCompressSnappy32k   thrpt    2  198538643.844
     [java] CompactIntegerSequenceBench.benchCompressSnappy64k   thrpt    2  194600497.613
     [java] CompactIntegerSequenceBench.benchCompressSnappy8k    thrpt    2  186040175.059

LZ4 high compressor:
     [java] CompactIntegerSequenceBench.bench16k  thrpt    2  20822947.578          ops/s
     [java] CompactIntegerSequenceBench.bench32k  thrpt    2  12037342.253          ops/s
     [java] CompactIntegerSequenceBench.bench64k  thrpt    2   6782534.469          ops/s
     [java] CompactIntegerSequenceBench.bench8k   thrpt    2  32254619.594          ops/s

LZ4 high is the one case where block size mattered a lot. It's a bit suspicious, really,
that the ratio of performance to block size is so close to 1:1, but I couldn't spot
a bug in the benchmark.
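One way to check how close to 1:1 the scaling is: divide the throughput at each block size by the throughput at the next size up. A factor of 2.0 per doubling would be exactly proportional. This is a small sketch using the LZ4 high scores from the table above; the helper name is made up for illustration.

```python
# ops/s from the LZ4 high table above, keyed by block size in KB.
LZ4_HIGH_OPS = {
    8: 32254619.594,
    16: 20822947.578,
    32: 12037342.253,
    64: 6782534.469,
}


def doubling_slowdowns(ops_by_kb: dict) -> dict:
    """Map (small_kb, large_kb) -> throughput(small) / throughput(large)
    for each pair of adjacent block sizes."""
    sizes = sorted(ops_by_kb)
    return {
        (small, large): ops_by_kb[small] / ops_by_kb[large]
        for small, large in zip(sizes, sizes[1:])
    }


for (small, large), factor in doubling_slowdowns(LZ4_HIGH_OPS).items():
    print(f"{small}k -> {large}k: {factor:.2f}x fewer ops/s")
```

With these scores each doubling costs roughly 1.5x to 1.8x in ops/s, which is near proportional but not exactly so.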

Compression ratios with LZ4 fast for the text of Alice in Wonderland were:

Chunk size 8192, ratio 0.709473
Chunk size 16384, ratio 0.667236
Chunk size 32768, ratio 0.634735
Chunk size 65536, ratio 0.607208

By way of comparison I also ran deflate with maximum compression:

Chunk size 8192, ratio 0.426434
Chunk size 16384, ratio 0.402423
Chunk size 32768, ratio 0.381627
Chunk size 65536, ratio 0.364865
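For anyone who wants to reproduce this kind of measurement, the following is a minimal sketch of a per-chunk ratio calculation, not the code used for the numbers above. It assumes ratio means compressed size divided by uncompressed size, with every chunk compressed independently. It uses Python's `zlib` at level 9, which adds a small zlib header and trailer to each chunk, so results will not match the figures above exactly. The sample text is a stand-in, not the full book.

```python
import zlib


def chunked_ratio(data: bytes, chunk_size: int, level: int = 9) -> float:
    """Compressed size / uncompressed size when each chunk of
    `chunk_size` bytes is compressed on its own."""
    if not data:
        raise ValueError("data must not be empty")
    compressed = sum(
        len(zlib.compress(data[i:i + chunk_size], level))
        for i in range(0, len(data), chunk_size)
    )
    return compressed / len(data)


# Stand-in input (an assumption): one sentence repeated, which compresses
# far better than real prose. Substitute real text to get meaningful ratios.
sample = b"Alice was beginning to get very tired of sitting by her sister. " * 4000

for size in (8192, 16384, 32768, 65536):
    print(f"Chunk size {size}, ratio {chunked_ratio(sample, size):.6f}")
```

Smaller chunks give the compressor less history to match against and pay the per-chunk overhead more often, which is consistent with the ratios above getting worse as the chunk size shrinks.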

> Lower default chunk_length_in_kb from 64kb to 4kb
> -------------------------------------------------
>                 Key: CASSANDRA-13241
>                 URL:
>             Project: Cassandra
>          Issue Type: Wish
>          Components: Core
>            Reporter: Benjamin Roth
>            Assignee: Ariel Weisberg
>            Priority: Major
>         Attachments:,,
> Having a too low chunk size may result in some wasted disk space. A too high chunk size
> may lead to massive overreads and may have a critical impact on overall system performance.
> In my case, the default chunk size led to peak read IOs of up to 1GB/s and avg reads
> of 200MB/s. After lowering the chunk size (aligned with read-ahead, of course), the avg read IO
> went below 20MB/s, more like 10-15MB/s.
> The risk of (physical) overreads increases as the ratio (page cache size) / (total
> data size) gets lower.
> High chunk sizes are mostly appropriate for bigger payloads per request, but if the model
> consists mostly of small rows or small result sets, the read overhead with a 64kb chunk size
> is insanely high. This applies, for example, to (small) skinny rows.
> Please also see here:
> To give you some insights what a difference it can make (460GB data, 128GB RAM):
> - Latency of a quite large CF:
> - Disk throughput:
> - This shows, that the request distribution remained the same, so no "dynamic snitch

This message was sent by Atlassian JIRA
