cassandra-commits mailing list archives

From "Chris Burroughs (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-47) SSTable compression
Date Sat, 09 Jul 2011 13:41:17 GMT


Chris Burroughs commented on CASSANDRA-47:

bq. Using 64kb buffer 1.7GB file could be compressed into 110MB (data added using ./bin/stress
-n 1000000 -S 1024 -V, where -V option generates average size values and different cardinality
from 50 (default) to 250).

This seems like an unrealistically good compression ratio (roughly 15:1).  If I gzip a real-world SSTable
that has redundant data and should be ripe for compression, I only see 641M --> 217M (roughly 3:1).  What's
the gzip compression ratio for the SSTables that workload generates?
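
One rough way to get a comparable gzip number for any SSTable data file is sketched below. This is not part of the CASSANDRA-47 patch; the function name gzip_ratio and its parameters are made up for illustration, and the 64 KB read size is only chosen to echo the buffer size mentioned in the quote.

```python
import sys
import zlib


def gzip_ratio(path, chunk_size=64 * 1024, level=6):
    """Stream a file through a gzip-format compressor.

    Returns (original_bytes, compressed_bytes, ratio). Hypothetical helper,
    not taken from the Cassandra code base.
    """
    # wbits=31 asks zlib to emit a gzip container (header and trailer included).
    compressor = zlib.compressobj(level, zlib.DEFLATED, 31)
    original = 0
    compressed = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            original += len(chunk)
            compressed += len(compressor.compress(chunk))
    # flush() returns whatever the compressor is still buffering.
    compressed += len(compressor.flush())
    return original, compressed, original / compressed


if __name__ == "__main__":
    # Usage: python gzip_ratio.py <file> [<file> ...]
    for name in sys.argv[1:]:
        orig, comp, ratio = gzip_ratio(name)
        print(f"{name}: {orig} -> {comp} bytes ({ratio:.2f}:1)")
```

Note that this compresses the whole file as a single stream. Compressing each 64 KB block independently, as a block-based scheme would, generally gives a somewhat lower ratio, so the number is best read as an upper bound for comparison.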

Stu, could you post your custom YCSB workload from CASSANDRA-674 for comparison?

> SSTable compression
> -------------------
>                 Key: CASSANDRA-47
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Pavel Yaskevich
>              Labels: compression
>             Fix For: 1.0
>         Attachments: CASSANDRA-47.patch, snappy-java-1.0.3-rc4.jar
> We should be able to do SSTable compression which would trade CPU for I/O (almost always
> a good trade).


