cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-47) SSTable compression
Date Tue, 19 Jul 2011 17:46:59 GMT


Sylvain Lebresne commented on CASSANDRA-47:

The thing about Input/Output classes was mentioned previously at CASSANDRA-1470. I'm -1 on making
the "seek() method throw an exception if the CDF has been opened in "rw" mode" because this is
not a clean interface; I would rather make separate classes, as that would be a more
reasonable and clean design. Anyway, even right now the common ancestor of both is RandomAccessFile
(or even FileDataInput). So I'm -1 on merging CDF and BRAF before BRAF has been refactored.

I don't understand that argument. BRAF and CDF do the same thing, they only differ in that
CDF has a decompression/compression step while moving data in/out of the buffer and has a
slight translation between which part of the file to buffer. The rest of the code is the exact
same: all the buffer manipulation, when to sync, when to rebuffer, etc., is the same. And
it's not the simplest code ever, not a place where having code duplication sounds like a good
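
To make the argument above concrete, here is a minimal sketch of how the buffering logic could
live in one place, with the compression step as the only thing that differs. All class and
method names (BufferedChunkWriter, PlainWriter, DeflatingWriter, flushChunk) are hypothetical
and are not the actual BRAF or CDF code; java.util.zip.Deflater stands in for whatever
compressor the patch uses, and the per-chunk offset bookkeeping is left out.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

// Hypothetical base class: owns the buffer and decides when to flush.
abstract class BufferedChunkWriter {
    private final byte[] buffer;
    private int position = 0;
    // In-memory sink used here instead of a real file, to keep the sketch self-contained.
    protected final ByteArrayOutputStream sink = new ByteArrayOutputStream();

    BufferedChunkWriter(int bufferSize) {
        this.buffer = new byte[bufferSize];
    }

    // Shared logic: copy into the buffer and flush whenever it fills up.
    void write(byte[] data) {
        int offset = 0;
        while (offset < data.length) {
            int n = Math.min(buffer.length - position, data.length - offset);
            System.arraycopy(data, offset, buffer, position, n);
            position += n;
            offset += n;
            if (position == buffer.length) {
                flush();
            }
        }
    }

    void flush() {
        if (position == 0) {
            return;
        }
        flushChunk(buffer, position);
        position = 0;
    }

    // The only step that differs between the plain and the compressed variant.
    protected abstract void flushChunk(byte[] chunk, int length);

    byte[] toByteArray() {
        flush();
        return sink.toByteArray();
    }
}

// Plain variant: the buffer goes to the sink unchanged.
class PlainWriter extends BufferedChunkWriter {
    PlainWriter(int bufferSize) {
        super(bufferSize);
    }

    @Override
    protected void flushChunk(byte[] chunk, int length) {
        sink.write(chunk, 0, length);
    }
}

// Compressed variant: each buffer is compressed as an independent chunk.
class DeflatingWriter extends BufferedChunkWriter {
    DeflatingWriter(int bufferSize) {
        super(bufferSize);
    }

    @Override
    protected void flushChunk(byte[] chunk, int length) {
        Deflater deflater = new Deflater();
        deflater.setInput(chunk, 0, length);
        deflater.finish();
        byte[] out = new byte[256];
        while (!deflater.finished()) {
            int n = deflater.deflate(out);
            sink.write(out, 0, n);
        }
        deflater.end();
    }
}
```

In this sketch the subclasses contain no buffer manipulation at all, which is the kind of
sharing being argued for.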

I'm a bit concerned about adding one more file to handle a single SSTable; the main goal of my
design here was to make CDF independent from other components of the system to avoid any additional

I don't see why adding a new component adds any complexity. I actually find it rather cleaner,
as that component would likely nicely correspond to an in-memory object holding all the metadata
related to compression.
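
As a rough illustration of such an in-memory object, the sketch below keeps the chunk length
and the file offset of each compressed chunk, and translates an uncompressed position into the
offset of the chunk that holds it. The class name CompressionInfo, its methods, and the
fixed-size-chunk layout are assumptions made for this example, not what the patch does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder for the compression metadata of one SSTable.
final class CompressionInfo {
    private final int chunkLength;                 // uncompressed bytes per chunk
    private final List<Long> chunkOffsets = new ArrayList<>();

    CompressionInfo(int chunkLength) {
        if (chunkLength <= 0) {
            throw new IllegalArgumentException("chunkLength must be positive");
        }
        this.chunkLength = chunkLength;
    }

    // Record where the next compressed chunk starts in the data file.
    void addChunk(long compressedOffset) {
        chunkOffsets.add(compressedOffset);
    }

    int chunkCount() {
        return chunkOffsets.size();
    }

    // Translate a position in the uncompressed data into the file offset
    // of the compressed chunk that contains it.
    long compressedOffsetFor(long uncompressedPosition) {
        if (uncompressedPosition < 0) {
            throw new IllegalArgumentException("negative position");
        }
        long index = uncompressedPosition / chunkLength;
        if (index >= chunkOffsets.size()) {
            throw new IndexOutOfBoundsException("position beyond last chunk");
        }
        return chunkOffsets.get((int) index);
    }
}
```

Whether this object is loaded from a separate file or from a section of the data file, the
reader only needs the offsets in memory before it can seek.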

Maybe it's better to stream file offsets to a temporary file while the SSTable is being written
and after that store the index section at the end of the file

If what you mean is what I understand, that sounds way more complicated than having a separate

We can use a magic number the same way as gzip does

That wouldn't be more reliable than the control bytes.

> SSTable compression
> -------------------
>                 Key: CASSANDRA-47
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Pavel Yaskevich
>              Labels: compression
>             Fix For: 1.0
>         Attachments: CASSANDRA-47-v2.patch, CASSANDRA-47.patch, snappy-java-1.0.3-rc4.jar
> We should be able to do SSTable compression which would trade CPU for I/O (almost always
> a good trade).

This message is automatically generated by JIRA.