cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-47) SSTable compression
Date Tue, 19 Jul 2011 17:46:59 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067860#comment-13067860 ]

Sylvain Lebresne commented on CASSANDRA-47:
-------------------------------------------

{quote}
The thing about Input/Output classes was mentioned previously at CASSANDRA-1470. I -1 doing
"seek() method throw an exception if the CDF has been opened in "rw" mode" because this is
not a clean interface but I rather prefer to make separate classes as that will be a more
reasonable and clean design. Anyway, even right now common ancestor of both is RandomAccessFile
(or even FileDataInput). So I -1 doing merge of CDF and BRAF before we have a BRAF refactored.
{quote}

I don't understand that argument. BRAF and CDF do the same thing; they differ only in that
CDF has a compression/decompression step when moving data in and out of the buffer, and a
slight translation to decide which part of the file to buffer. The rest of the code is exactly
the same: all the buffer manipulation, when to sync, when to rebuffer, etc. And
it's not the simplest code ever, so it is not a place where code duplication sounds like a good
idea.
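
To illustrate the point, here is a minimal, read-only sketch of what sharing the buffer logic could look like. It is not taken from either patch: the class names, the in-memory byte[] standing in for the file, and the use of java.util.zip instead of Snappy are all assumptions made to keep the example self-contained. Only the step that fills the buffer differs between the two classes.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical sketch: all buffer handling lives here; only loadChunk() varies.
class BufferedReaderSketch {
    protected final byte[] file;   // stands in for the on-disk file
    protected final int chunkSize; // uncompressed bytes per buffer
    private byte[] buffer = new byte[0];
    private long bufferOffset = 0; // logical position of buffer[0]
    private long position = 0;     // logical (uncompressed) read position

    BufferedReaderSketch(byte[] file, int chunkSize) {
        this.file = file;
        this.chunkSize = chunkSize;
    }

    void seek(long newPosition) { position = newPosition; }

    // Returns the next byte as 0..255, or -1 at end of data.
    int read() {
        if (position < bufferOffset || position >= bufferOffset + buffer.length) {
            int chunkIndex = (int) (position / chunkSize);
            bufferOffset = (long) chunkIndex * chunkSize;
            buffer = loadChunk(chunkIndex); // the only step that differs
            if (position - bufferOffset >= buffer.length) return -1;
        }
        return buffer[(int) (position++ - bufferOffset)] & 0xFF;
    }

    // Plain variant: chunk i starts at file offset i * chunkSize.
    protected byte[] loadChunk(int chunkIndex) {
        int start = chunkIndex * chunkSize;
        if (start >= file.length) return new byte[0];
        return Arrays.copyOfRange(file, start, Math.min(file.length, start + chunkSize));
    }
}

// Compressed variant: translate the chunk index to a file offset, then decompress.
class CompressedReaderSketch extends BufferedReaderSketch {
    private final int[] chunkOffsets; // start of each compressed chunk; last entry = file length

    CompressedReaderSketch(byte[] file, int chunkSize, int[] chunkOffsets) {
        super(file, chunkSize);
        this.chunkOffsets = chunkOffsets;
    }

    @Override
    protected byte[] loadChunk(int chunkIndex) {
        if (chunkIndex >= chunkOffsets.length - 1) return new byte[0];
        int start = chunkOffsets[chunkIndex];
        int end = chunkOffsets[chunkIndex + 1];
        Inflater inflater = new Inflater();
        inflater.setInput(file, start, end - start);
        byte[] out = new byte[chunkSize];
        try {
            int n = 0;
            while (!inflater.finished() && n < out.length) {
                int r = inflater.inflate(out, n, out.length - n);
                if (r == 0 && (inflater.needsInput() || inflater.needsDictionary())) break;
                n += r;
            }
            return Arrays.copyOf(out, n);
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt chunk " + chunkIndex, e);
        } finally {
            inflater.end();
        }
    }

    // Demo helper: compress raw data chunk by chunk, recording each chunk's offset.
    static CompressedReaderSketch compress(byte[] raw, int chunkSize) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int chunks = (raw.length + chunkSize - 1) / chunkSize;
        int[] offsets = new int[chunks + 1];
        byte[] tmp = new byte[chunkSize * 2 + 64];
        for (int i = 0; i < chunks; i++) {
            offsets[i] = out.size();
            int start = i * chunkSize;
            Deflater deflater = new Deflater();
            deflater.setInput(raw, start, Math.min(chunkSize, raw.length - start));
            deflater.finish();
            while (!deflater.finished()) {
                out.write(tmp, 0, deflater.deflate(tmp));
            }
            deflater.end();
        }
        offsets[chunks] = out.size();
        return new CompressedReaderSketch(out.toByteArray(), chunkSize, offsets);
    }
}
```

With this shape, seek(), read() and the decision of when to rebuffer exist once, and the compressed class adds only the offset translation and the decompression step.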

{quote}
 I'm a bit conserved about adding one more file to handle a single SSTable, main goal of my
design here was to make CDF independent from other components of the system to avoid any additional
complexity
{quote}

I don't see why adding a new component adds any complexity. I actually find it rather cleaner,
as that component would likely nicely correspond to an in-memory object holding all the metadata
related to compression.
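
As a rough sketch of that correspondence: the class below is hypothetical, and its fields and on-disk layout are assumptions for illustration, not the format of any patch. The in-memory object holds the chunk length and the chunk offsets, and the separate component is simply its serialized form.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical in-memory view of a separate compression-metadata component.
final class CompressionMetadataSketch {
    final int chunkLength;     // uncompressed bytes per chunk
    final long dataLength;     // total uncompressed length
    final long[] chunkOffsets; // offset of each compressed chunk in the data file

    CompressionMetadataSketch(int chunkLength, long dataLength, long[] chunkOffsets) {
        this.chunkLength = chunkLength;
        this.dataLength = dataLength;
        this.chunkOffsets = chunkOffsets.clone();
    }

    // Serialized form: what would be written to the separate component file.
    byte[] toBytes() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(chunkLength);
            out.writeLong(dataLength);
            out.writeInt(chunkOffsets.length);
            for (long offset : chunkOffsets) out.writeLong(offset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Rebuilds the in-memory object from the component's bytes.
    static CompressionMetadataSketch fromBytes(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int chunkLength = in.readInt();
            long dataLength = in.readLong();
            long[] offsets = new long[in.readInt()];
            for (int i = 0; i < offsets.length; i++) offsets[i] = in.readLong();
            return new CompressionMetadataSketch(chunkLength, dataLength, offsets);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Maps an uncompressed position to the file offset of the chunk containing it.
    long chunkOffsetFor(long uncompressedPosition) {
        if (uncompressedPosition < 0 || uncompressedPosition >= dataLength) {
            throw new IllegalArgumentException("position out of range: " + uncompressedPosition);
        }
        return chunkOffsets[(int) (uncompressedPosition / chunkLength)];
    }
}
```

A reader would load this object once when opening the SSTable and consult chunkOffsetFor() on every rebuffer, without needing anything embedded in the data file itself.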

{quote}
maybe it's better to stream file offsets to the temporary file while SSTable being written
and after that store index section at the end of the file
{quote}

If what you mean is what I understand, that sounds way more complicated than having a separate
component.

{quote}
We can use a magic number the same way as gzip does http://en.wikipedia.org/wiki/Gzip#File_format.
{quote}

That wouldn't be more reliable than the control bytes.

> SSTable compression
> -------------------
>
>                 Key: CASSANDRA-47
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-47
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Pavel Yaskevich
>              Labels: compression
>             Fix For: 1.0
>
>         Attachments: CASSANDRA-47-v2.patch, CASSANDRA-47.patch, snappy-java-1.0.3-rc4.jar
>
>
> We should be able to do SSTable compression which would trade CPU for I/O (almost always a good trade).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
