cassandra-commits mailing list archives

From "Benjamin Roth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-13241) Lower default chunk_length_in_kb from 64kb to 4kb
Date Tue, 28 Feb 2017 21:40:46 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-13241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15888911#comment-15888911 ]

Benjamin Roth commented on CASSANDRA-13241:
-------------------------------------------

How about this:

You create 2 chunk lookup tables. One with absolute pointers (long, 8 bytes).
A second one with relative pointers or chunk sizes - 2 bytes are enough for chunks up to 64kb.
You store an absolute pointer for every $x chunks (1000 in this example).
So you can get the absolute offset by looking up the absolute pointer at $idx = ($pos - ($pos % $x)) / $x.
Then you iterate through the size lookup from ($pos - ($pos % $x)) to $pos - 1, summing the sizes.
A fallback can be provided for chunks >64kb: either relative pointers are avoided completely
or they are increased to 3 bytes.
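
A minimal sketch in Java of such a lookup - class and member names are made up for
illustration, this is not actual Cassandra code, and bounds checks are omitted:

class ChunkIndex
{
    static final int STRIDE = 1000;        // one absolute pointer per $x = 1000 chunks

    private final long[] absoluteOffsets;  // 8 bytes per entry, one per STRIDE chunks
    private final short[] chunkSizes;      // 2 bytes per chunk, enough for sizes up to 64kb

    ChunkIndex(long[] absoluteOffsets, short[] chunkSizes)
    {
        this.absoluteOffsets = absoluteOffsets;
        this.chunkSizes = chunkSizes;
    }

    // Absolute file offset of chunk $pos: start at the nearest preceding absolute
    // pointer, then add up the relative sizes from there to $pos - 1.
    long offsetOf(int pos)
    {
        long offset = absoluteOffsets[pos / STRIDE];
        for (int i = pos - (pos % STRIDE); i < pos; i++)
            offset += Short.toUnsignedInt(chunkSizes[i]); // read as unsigned 16 bit
        return offset;
    }
}

The >64kb fallback from above would either widen chunkSizes to 3 bytes or drop the
relative table entirely.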

There you go.

Payload of 1 TB = 1024 * 1024 * 1024kb

CS 64 (NOW):
============
chunks = 1024 * 1024 * 1024kb / 64kb = 16777216 (16M)
compression = 1.99
compressed_size = 1024 * 1024 * 1024kb / 1.99 = 539568756kb
kernel_pages = 539568756kb / 4kb = 134892189
absolute_pointer_size = 8 * chunks = 134217728 (128MB)
kernel_page_size = 134892189 * 8 = 1079137512 (1029 MB)
total_size = 1157MB

CS 4 with relative positions
============================
chunks = 1024 * 1024 * 1024kb / 4kb = 268435456 (256M)
compression = 1.75
compressed_size = 1024 * 1024 * 1024kb / 1.75 = 613566757kb
kernel_pages = 613566757kb / 4kb = 153391689
absolute_pointer_size = 8 * chunks / 1000 = 2147484 (2 MB)
relative_pointer_size = 2 * chunks = 536870912 (512 MB)
kernel_page_size = 153391689 * 8 = 1227133512 (1170MB)
total_size = 1684MB

increase = 45%
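
To double-check the arithmetic, a throwaway Java snippet reproducing the figures above
(the 1.99 / 1.75 ratios and 8 bytes per kernel page are the assumptions already stated):

public class IndexSizeCalc
{
    public static void main(String[] args)
    {
        long payloadKb = 1024L * 1024 * 1024;            // 1 TB in kb
        long mb = 1024 * 1024;

        // CS 64 (NOW): absolute pointers only
        long pages64 = (long) (payloadKb / 1.99) / 4;    // 4kb kernel pages
        long cs64 = 8 * (payloadKb / 64)                 // absolute pointers, 8 bytes each
                  + 8 * pages64;                         // kernel structs
        System.out.printf("CS 64: %d MB%n", cs64 / mb);  // -> 1157 MB

        // CS 4: sparse absolute pointers + 2-byte relative sizes
        long chunks4 = payloadKb / 4;
        long pages4 = (long) (payloadKb / 1.75) / 4;
        long cs4 = 8 * chunks4 / 1000                    // one absolute pointer per 1000 chunks
                 + 2 * chunks4                           // relative sizes
                 + 8 * pages4;                           // kernel structs
        System.out.printf("CS 4:  %d MB%n", cs4 / mb);   // -> 1684 MB
    }
}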

=> This reduces the memory overhead of lowering the chunk size from 64kb to 4kb from the
initially mentioned 800% to 45%, once you also take the kernel structs into account. Those
are of a relevant size themselves - even larger than the initially discussed "128M" for
64kb chunks.

Pro:
A lot less memory required

Con:
Some CPU overhead. But is that really relevant compared to the cost of decompressing a 4kb or even a 64kb chunk?

P.S.: The kernel memory calculation is based on the 8 bytes per page that [~aweisberg] has
researched. Compression ratios are taken from the Percona blog.

> Lower default chunk_length_in_kb from 64kb to 4kb
> -------------------------------------------------
>
>                 Key: CASSANDRA-13241
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13241
>             Project: Cassandra
>          Issue Type: Wish
>          Components: Core
>            Reporter: Benjamin Roth
>
> Too low a chunk size may result in some wasted disk space. Too high a chunk size may
> lead to massive overreads and can have a critical impact on overall system performance.
> In my case, the default chunk size led to peak read IO of up to 1GB/s and average reads
> of 200MB/s. After lowering the chunk size (aligned with read-ahead, of course), the average
> read IO went below 20MB/s, typically 10-15MB/s.
> The risk of (physical) overreads increases with a lower (page cache size) / (total data
> size) ratio.
> High chunk sizes are mostly appropriate for bigger payloads per request, but if the model
> consists mostly of small rows or small result sets, the read overhead with a 64kb chunk
> size is insanely high. This applies, for example, to (small) skinny rows.
> Please also see here:
> https://groups.google.com/forum/#!topic/scylladb-dev/j_qXSP-6-gY
> To give you some insight into what a difference it can make (460GB data, 128GB RAM):
> - Latency of a quite large CF: https://cl.ly/1r3e0W0S393L
> - Disk throughput: https://cl.ly/2a0Z250S1M3c
> - This shows that the request distribution remained the same, so no "dynamic snitch
> magic": https://cl.ly/3E0t1T1z2c0J



