cassandra-commits mailing list archives

From "Jim Witschey (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-10995) Consider disabling sstable compression by default in 3.x
Date Mon, 11 Jan 2016 22:05:40 GMT


Jim Witschey commented on CASSANDRA-10995:

One problem we currently have with benchmarking on-disk data size, particularly with respect
to compression, is that we don't have tools that generate representative, compressible data.
It's easy to generate random data ({{UUID}}s, random strings from {{cassandra-stress}}), but
random data compresses poorly, so it tells us little about compression on real workloads.
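
To illustrate the gap, here is a minimal Python sketch comparing how well random UUID text compresses against low-entropy rows. It is not part of any Cassandra tooling: {{zlib}} stands in for the sstable compressor, and the helper names ({{compression_ratio}}, {{random_rows}}, {{skewed_rows}}) and the small vocabulary are hypothetical, chosen only for illustration.

```python
import random
import uuid
import zlib


def compression_ratio(payload: bytes) -> float:
    """Return compressed size / original size (lower means more compressible)."""
    return len(zlib.compress(payload)) / len(payload)


def random_rows(n: int) -> bytes:
    # Random UUIDs, one per line: high-entropy data that is easy to generate.
    return "\n".join(str(uuid.uuid4()) for _ in range(n)).encode()


def skewed_rows(n: int, seed: int = 0) -> bytes:
    # Hypothetical stand-in for "representative" data: each row is built
    # from a small, repetitive vocabulary, so it has far less entropy.
    rng = random.Random(seed)
    vocab = ["GET", "POST", "/index.html", "/api/v1/users",
             "200", "404", "us-east", "eu-west"]
    rows = (" ".join(rng.choice(vocab) for _ in range(4)) for _ in range(n))
    return "\n".join(rows).encode()


if __name__ == "__main__":
    print(f"random UUIDs: {compression_ratio(random_rows(10_000)):.2f}")
    print(f"skewed rows:  {compression_ratio(skewed_rows(10_000)):.2f}")
```

The skewed rows should compress far better than the UUIDs. How close either comes to production data is exactly the open question, so the numbers are only a sanity check on a generator, not a benchmark result.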

[~iamaleksey] How important is it that we use such a dataset? You'd know better than I, but
I don't imagine compressibility would affect resource utilization other than disk much.

> Consider disabling sstable compression by default in 3.x
> --------------------------------------------------------
>                 Key: CASSANDRA-10995
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Aleksey Yeschenko
>            Assignee: Jim Witschey
> With the new sstable format introduced in CASSANDRA-8099, it's very likely that enabling
> sstable compression is no longer the right default.
> [~slebresne]'s [blog post|] on the new storage engine has some comparison numbers for
> 2.2/3.0, with and without compression, that show that in many cases compression no longer
> has a significant effect on sstable sizes - all while still consuming extra resources for
> both writes (compression) and reads (decompression).
> We should run a comprehensive set of benchmarks to determine whether or not compression
> should be switched to 'off' now in 3.x.

This message was sent by Atlassian JIRA
