Date: Mon, 8 Aug 2011 05:29:31 +0000 (UTC)
From: "Stu Hood (JIRA)"
To: commits@cassandra.apache.org
Message-ID: <593793735.15716.1312781371084.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (CASSANDRA-674) New SSTable Format

    [ https://issues.apache.org/jira/browse/CASSANDRA-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080757#comment-13080757 ]

Stu Hood commented on CASSANDRA-674:
------------------------------------

I reran the test mentioned in [#comment-13054228] with replicate-on-write disabled, which makes for a much more fair comparison (trunk/47 require 2 seeks to miss for a column, and 3 to hit). This version of trunk also includes CASSANDRA-47 snappy compression.

|| build || disk volume (bytes) || bytes per column || runtime (s) || throughput (ops/s) || avg read ms || 99th % read ms ||
| trunk - uncompressed | 16,713,328,798 | 66.8 | 6154 | 40620 | 2.54 | 6 |
| trunk - gz 6 * | 2,747,319,000 | 10.98 |-|-|-|-|
| trunk - [snappy|https://issues.apache.org/jira/browse/CASSANDRA-47] | 4,356,461,652 | 17.4 | 7906 | 31618 | 4.64 | 15 |
| 674+2319 | 2,675,888,207 | 10.7 | 7703 | 32454 | 3.04 | 10 |

\* _trunk - gz 6_ is the size of compressing the data directory of the trunk result at GZIP level 6

In this workload, we're reading from the tail of the row, which means that CASSANDRA-47 needs to decode two blocks per read (one for the row index at the head of the row, and one for the columns at the tail).

> New SSTable Format
> ------------------
>
>                 Key: CASSANDRA-674
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-674
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>             Fix For: 1.0
>
>         Attachments: 674-v1.diff, 674-v2.tgz, 674-v3.tgz, 674-ycsb.log, trunk-ycsb.log
>
>
> Various tickets exist due to limitations in the SSTable file format, including #16, #47 and #328. Attached is a proposed design/implementation of a new file format for SSTables that addresses a few of these limitations.
> This v2 implementation is not ready for serious use: see comments for remaining issues. It is roughly the format described here: http://wiki.apache.org/cassandra/FileFormatDesignDoc
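
For anyone wanting to compare the builds in the results table relative to uncompressed trunk, here is a minimal sketch. The numbers are copied from the table in the comment; the names RESULTS, BASELINE and relative_to_baseline are made up for illustration and are not part of the benchmark tooling.

```python
# Figures copied from the results table in the comment above.
# "trunk - gz 6" has no runtime figures, so its throughput is None.
RESULTS = {
    "trunk - uncompressed": {"bytes": 16_713_328_798, "ops_per_s": 40620},
    "trunk - gz 6": {"bytes": 2_747_319_000, "ops_per_s": None},
    "trunk - snappy": {"bytes": 4_356_461_652, "ops_per_s": 31618},
    "674+2319": {"bytes": 2_675_888_207, "ops_per_s": 32454},
}

BASELINE = "trunk - uncompressed"  # hypothetical name for the reference row


def relative_to_baseline(results, baseline=BASELINE):
    """Return {build: (size_ratio, throughput_ratio)} against the baseline.

    throughput_ratio is None when the table has no throughput for a build.
    """
    base = results[baseline]
    out = {}
    for build, row in results.items():
        size_ratio = row["bytes"] / base["bytes"]
        if row["ops_per_s"] is None:
            tput_ratio = None
        else:
            tput_ratio = row["ops_per_s"] / base["ops_per_s"]
        out[build] = (size_ratio, tput_ratio)
    return out


if __name__ == "__main__":
    for build, (size, tput) in relative_to_baseline(RESULTS).items():
        tput_text = "n/a" if tput is None else f"{tput:.1%}"
        print(f"{build:22s} size {size:.1%} of baseline, throughput {tput_text}")
```

On these figures, 674+2319 comes out at roughly 16% of the uncompressed volume with about 80% of the uncompressed throughput, slightly smaller on disk than the gz 6 reference.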