Message-ID: <938894262.541171269837747126.JavaMail.jira@brutus.apache.org>
Date: Mon, 29 Mar 2010 04:42:27 +0000 (UTC)
From: "Stu Hood (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] Updated: (CASSANDRA-847) Make the reading half of compactions memory-efficient

     [ https://issues.apache.org/jira/browse/CASSANDRA-847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood updated CASSANDRA-847:
-------------------------------

    Attachment:
        compaction-bench-trunk.txt
        compaction-bench-847.txt
        compaction-bench.patch

Starting with step 0. from jbellis's suggested order, I implemented a compaction benchmark, and tested trunk and a rebased version of this patch. As expected, this patch is significantly faster for wide rows and slightly slower for narrow rows, but I'm sure that the performance can be improved by cleaning up the mergesort in SliceBuffer.merge().

The reason I bring this up now is that I began work on 767, and realized how painful it was going to be to make format changes before we had settled on an interface to replace FileDataInput/SSTableScanner. We need to remove/improve those interfaces before we go about making sweeping changes like the String->byte[] refactor.

I think that this patch begins the 674 refactor in the correct place, so rather than starting over and losing more time, I would love to be able to clean up this patch and remove any structures that you guys think are excessive. If I can squash this down into much clearer patches and remove more of the speculative code, what are the chances of getting it committed?

> Make the reading half of compactions memory-efficient
> -----------------------------------------------------
>
>                 Key: CASSANDRA-847
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-847
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Stu Hood
>            Priority: Critical
>             Fix For: 0.7
>
>         Attachments: 0001-Add-structures-that-were-important-to-the-SSTableSca.patch,
>                      0002-Implement-most-of-the-new-SSTableScanner-interface.patch,
>                      0003-Rename-RowIndexedReader-specific-test.patch,
>                      0004-Improve-Scanner-tests-and-separate-SuperCF-handling-.patch,
>                      0005-Add-Scanner-interface-and-a-Filtered-implementation-.patch,
>                      0006-Add-support-for-compaction-of-super-CFs-and-some-tes.patch,
>                      0007-Remove-ColumnKey-bloom-filter-maintenance.patch,
>                      0008-Make-Scanner-extend-Iterator-again.patch,
>                      0009-Make-CompactionIterator-a-ReducingIterator-subclass-.patch,
>                      0010-Alternative-to-ReducingIterator-that-can-return-mult.patch,
>                      compaction-bench-847.txt,
>                      compaction-bench-trunk.txt,
>                      compaction-bench.patch
>
>
> This issue is the next on the road to finally fixing CASSANDRA-16. To make compactions memory efficient, we have to be able to perform the compaction process on the smallest possible chunks that might intersect and contend with one another, meaning that we need a better abstraction for reading from SSTables.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.