cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-47) SSTable compression
Date Fri, 29 Jul 2011 14:38:12 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-47?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-47:
--------------------------------------

    Attachment: CASSANDRA-47-v5.patch

I agree with Jonathan's suggestion of "let's make the compression option even more like compaction
strategies". However, it may be simpler to commit this with the boolean flag and change it
in another ticket, when we make the compaction algorithm "pluggable". I'm willing to take
responsibility for making sure that second ticket happens within a few days, so that no one
really "sees" the intermediate state where compression is a boolean flag.

Attaching a v5 that squashes v4 and v4-fixes and makes the following changes/additions:
* Bump thrift version
* Fix a few places in CFMetaData where compression wasn't handled
* Fix resetAndTruncate(), both in CSW and SW. In SW, the "reset in current buffer" optimization
was broken. In CSW, bufferOffset and chunkCount weren't reset correctly, truncation
wasn't done at all, and the chunk index was neither reset nor truncated. Also adds a unit test
for resetAndTruncate for both SW and CSW, both with and without the "reset in current buffer"
optimization.
* Slightly clean up CompressionMetadata: we now only keep the dataLengthOffset and write both
this and the chunk count at the same time at the end.
* Fix a problem with the handling of non-compressed sstables in a CF where compression is activated.
The previous code expected the sstables to be compressed as soon as the compression flag
was set on the CF. Instead, the compression option is now only used during writes, to determine
whether the resulting sstable is compressed. When opening an sstable, however, the presence of
CompressionInfo is what is used to detect whether it is compressed.
* Add a new build target 'ant test-compression' that runs the unit tests with compression
turned on for all CFs. This is arguably a bit lame, and we should probably do better, but in
the meantime I think it's quick and useful.
* Add support for compression in stress (the -I flag, meant to stand for "Inflate", since c, C
and i were all taken)
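
To illustrate the write/read split described in the non-compressed sstable item above, here is a
minimal sketch. It is not code from the patch: the class name, the method names and the
"-CompressionInfo.db" file suffix are assumptions made for illustration only. The point is that
the CF option only matters when writing, while opening relies on what is actually on disk.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the CASSANDRA-47 patch.
final class CompressionDetectionSketch {
    // Assumed naming for the companion metadata file; the message only says
    // that "the presence of CompressionInfo" is what gets checked.
    static final String COMPRESSION_INFO_SUFFIX = "-CompressionInfo.db";

    // Write path: the CF-level option alone decides whether a NEW sstable is compressed.
    static boolean shouldCompressOnWrite(boolean cfCompressionEnabled) {
        return cfCompressionEnabled;
    }

    // Read path: ignore the CF option and look for the companion file instead,
    // so sstables written before compression was enabled still open correctly.
    static boolean isCompressedOnDisk(Path directory, String sstableBaseName) {
        return Files.exists(directory.resolve(sstableBaseName + COMPRESSION_INFO_SUFFIX));
    }
}
```

With this split, turning the flag on for a CF that already has uncompressed sstables does not
change how those existing files are opened.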


The changes above are fairly simple or cosmetic (except for resetAndTruncate maybe, but there
is a unit test for it now), and I think the patch is in a pretty good state, so this has my +1
(great work Pavel!). I'll still leave a bit of time for someone else to check those last changes
before committing.
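
For readers unfamiliar with the resetAndTruncate() contract mentioned above, the following is a
simplified sketch of the two cases the new unit test is said to cover: rewinding inside the
current buffer, and truncating data that was already flushed. It is an in-memory toy, not the
actual SW or CSW code; the class, its fields and its methods are all hypothetical.

```java
import java.util.Arrays;

// Hypothetical in-memory stand-in for a buffered sequential writer.
final class BufferedWriterSketch {
    private byte[] flushed = new byte[0]; // stands in for the bytes already on disk
    private final byte[] buffer;          // bytes written but not yet flushed
    private int bufferUsed = 0;

    BufferedWriterSketch(int bufferSize) {
        buffer = new byte[bufferSize];
    }

    // Logical position: flushed bytes plus buffered bytes.
    long position() {
        return flushed.length + bufferUsed;
    }

    void write(byte b) {
        if (bufferUsed == buffer.length) {
            flush();
        }
        buffer[bufferUsed++] = b;
    }

    void flush() {
        byte[] grown = Arrays.copyOf(flushed, flushed.length + bufferUsed);
        System.arraycopy(buffer, 0, grown, flushed.length, bufferUsed);
        flushed = grown;
        bufferUsed = 0;
    }

    // Rewind to a previously recorded position and discard everything after it.
    void resetAndTruncate(long mark) {
        if (mark < 0 || mark > position()) {
            throw new IllegalArgumentException("mark out of range: " + mark);
        }
        if (mark >= flushed.length) {
            // "Reset in current buffer": the mark is in unflushed data, so only rewind the buffer.
            bufferUsed = (int) (mark - flushed.length);
            return;
        }
        // Otherwise drop the buffer and truncate the flushed data back to the mark.
        bufferUsed = 0;
        flushed = Arrays.copyOf(flushed, (int) mark);
    }

    // Flushes and returns a copy of everything written so far.
    byte[] contents() {
        flush();
        return flushed.clone();
    }
}
```

A typical use is to record position() before a write that may have to be abandoned, then call
resetAndTruncate() with that value to discard it. A compressed writer would additionally have
to rewind its chunk bookkeeping, which this sketch does not model.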


> SSTable compression
> -------------------
>
>                 Key: CASSANDRA-47
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-47
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Pavel Yaskevich
>              Labels: compression
>             Fix For: 1.0
>
>         Attachments: CASSANDRA-47-v2.patch, CASSANDRA-47-v3-rebased.patch, CASSANDRA-47-v3.patch,
> CASSANDRA-47-v4-fixes.patch, CASSANDRA-47-v4.patch, CASSANDRA-47-v5.patch, CASSANDRA-47.patch,
> snappy-java-1.0.3-rc4.jar
>
>
> We should be able to do SSTable compression which would trade CPU for I/O (almost always
> a good trade).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
