cassandra-commits mailing list archives

From "Stu Hood (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-674) New SSTable Format
Date Mon, 08 Aug 2011 05:29:31 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080757#comment-13080757 ]

Stu Hood commented on CASSANDRA-674:
------------------------------------

I reran the test mentioned in [#comment-13054228] with replicate-on-write disabled, which
makes for a much fairer comparison (trunk/47 require 2 seeks for a column miss, and
3 for a hit). This version of trunk also includes CASSANDRA-47 snappy compression.

|| build || disk volume (bytes) || bytes per column || runtime (s) || throughput (ops/s) || avg read ms || 99th % read ms ||
| trunk - uncompressed | 16,713,328,798 | 66.8 | 6154 | 40620 | 2.54 | 6 |
| trunk - gz 6 * | 2,747,319,000 | 10.98 |-|-|-|-|
| trunk - [snappy|https://issues.apache.org/jira/browse/CASSANDRA-47] | 4,356,461,652 | 17.4 | 7906 | 31618 | 4.64 | 15 |
| 674+2319 | 2,675,888,207 | 10.7 | 7703 | 32454 | 3.04 | 10 |
\* _trunk - gz 6_ is the size of the trunk result's data directory after compressing it at
GZIP level 6; it was not benchmarked, so it has no runtime or latency figures.
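
For readers who want the sizes relative to the uncompressed trunk run, the following is a minimal sketch of that arithmetic. The byte counts are copied from the table above; the names `DISK_BYTES`, `size_ratios`, and `RATIOS` are made up for this illustration and are not part of the benchmark tooling.

```python
# Disk volumes in bytes, copied from the table above.
DISK_BYTES = {
    "trunk - uncompressed": 16_713_328_798,
    "trunk - gz 6": 2_747_319_000,
    "trunk - snappy": 4_356_461_652,
    "674+2319": 2_675_888_207,
}


def size_ratios(volumes, baseline="trunk - uncompressed"):
    """Return each build's disk volume as a fraction of the baseline build."""
    base = volumes[baseline]
    return {build: size / base for build, size in volumes.items()}


RATIOS = size_ratios(DISK_BYTES)

if __name__ == "__main__":
    for build, ratio in RATIOS.items():
        print(f"{build}: {ratio:.1%} of uncompressed")
```

By this measure 674+2319 and the offline gzip pass both land at roughly 16% of the uncompressed volume, while snappy lands at roughly 26%.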

In this workload, we're reading from the tail of the row, which means that CASSANDRA-47 needs
to decode two blocks per read (one for the row index at the head of the row, and one for the
columns at the tail).
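
The two-block cost can be illustrated with a toy model. This is only a sketch under stated assumptions: it treats the data file as fixed-size compressed blocks and places the row index at the start of the row. The function `blocks_touched`, its parameters, and the 64 KiB block size are hypothetical and are not taken from the CASSANDRA-47 patch.

```python
def blocks_touched(row_start, index_len, col_offset, col_len, block_size):
    """Count the distinct blocks that must be decoded for one column read.

    Hypothetical model: the row index occupies the first index_len bytes of
    the row, and the wanted columns sit col_offset bytes into the row.
    All offsets are in uncompressed bytes.
    """

    def block_range(start, length):
        # Indices of every block overlapped by [start, start + length).
        first = start // block_size
        last = (start + length - 1) // block_size
        return set(range(first, last + 1))

    index_blocks = block_range(row_start, index_len)
    column_blocks = block_range(row_start + col_offset, col_len)
    return len(index_blocks | column_blocks)


BLOCK = 64 * 1024  # assumed block size, for illustration only

# Columns near the head of the row share a block with the row index.
head_read = blocks_touched(0, 1024, 2048, 64, BLOCK)

# Columns at the tail of a wide row land in a different block.
tail_read = blocks_touched(0, 1024, 10 * BLOCK, 64, BLOCK)
```

In this model a read near the head of the row decodes one block, while a read from the tail decodes two, which matches the behaviour described for this workload.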

> New SSTable Format
> ------------------
>
>                 Key: CASSANDRA-674
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-674
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>             Fix For: 1.0
>
>         Attachments: 674-v1.diff, 674-v2.tgz, 674-v3.tgz, 674-ycsb.log, trunk-ycsb.log
>
>
> Various tickets exist due to limitations in the SSTable file format, including #16, #47
> and #328. Attached is a proposed design/implementation of a new file format for SSTables that
> addresses a few of these limitations.
> This v2 implementation is not ready for serious use: see comments for remaining issues.
> It is roughly the format described here: http://wiki.apache.org/cassandra/FileFormatDesignDoc


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
