From: "Jonathan Ellis (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Date: Wed, 27 Jul 2011 01:45:11 +0000 (UTC)
Message-ID: <1251308141.10352.1311731111495.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1240572105.65378.1303156805945.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Updated] (CASSANDRA-2498) Improve read performance in update-intensive workload

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-2498:
--------------------------------------
    Attachment: 2498-v2.txt

Daniel, this is great! I've rebased and attached a v2 that makes some minor changes. The biggest is to avoid the tombstone collection by avoiding full collateColumns until the end. (This should be more efficient as well.)

Is the same problem present with the memtable iterators?

> Improve read performance in update-intensive workload
> -----------------------------------------------------
>
>                 Key: CASSANDRA-2498
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2498
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Sylvain Lebresne
>            Priority: Minor
>              Labels: ponies
>             Fix For: 1.0
>
>         Attachments: 2498-v2.txt, supersede-name-filter-collations.patch
>
>
> Read performance in an update-heavy environment relies heavily on compaction to maintain good throughput. (This is not the case for workloads where rows are only inserted once, because the bloom filter keeps us from having to check sstables unnecessarily.)
> Very early versions of Cassandra attempted to mitigate this by checking sstables in descending generation order (mostly equivalent to descending mtime): once all the requested columns were found, it would not check any older sstables.
> This was incorrect, because data timestamps do not necessarily correspond to sstable timestamps, both because compaction has the side effect of "refreshing" data to a newer sstable, and because hinted handoff may send us data older than what we already have.
> Instead, we could create a per-sstable piece of metadata containing the most recent (client-specified) timestamp for any column in the sstable.
> We could then sort sstables by this timestamp instead, and perform a similar optimization (if the remaining sstable client-timestamps are older than the oldest column found in the desired result set so far, we don't need to look further). Since under almost every workload, client timestamps of data in a given sstable will tend to be similar, we expect this to cut the number of sstables down proportionally to how frequently each column in the row is updated. (If each column is updated with each write, we only have to check a single sstable.)
> This may also be useful information when deciding which SSTables to compact.
> (Note that this optimization is only appropriate for named-column queries, not slice queries, since we don't know what non-overlapping columns may exist in older sstables.)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
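
To make the proposal in the issue description concrete, here is a minimal Python sketch of a named-column read that sorts sstables by their maximum client timestamp and stops early. It is not Cassandra code and not what either attached patch does: `SSTable`, `read_named_columns`, and the `(timestamp, value)` column tuples are hypothetical names invented for this illustration. The sketch assumes the early exit applies only once every requested column has been found, and it ignores tombstones and memtables.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple

# Hypothetical column representation: (client timestamp, value).
Column = Tuple[int, str]


@dataclass
class SSTable:
    """Toy stand-in for an sstable holding the columns of a single row."""
    columns: Dict[str, Column]

    @property
    def max_timestamp(self) -> int:
        # The proposed per-sstable metadata: the most recent client timestamp
        # of any column. Computed on the fly here; assumes a non-empty table.
        return max(ts for ts, _ in self.columns.values())


def read_named_columns(
    sstables: Iterable[SSTable], names: Iterable[str]
) -> Tuple[Dict[str, Column], int]:
    """Return the newest version of each requested column and the number of
    sstables actually consulted."""
    wanted = set(names)
    result: Dict[str, Column] = {}
    checked = 0
    # Visit sstables from newest to oldest maximum client timestamp.
    for sstable in sorted(sstables, key=lambda s: s.max_timestamp, reverse=True):
        if wanted and wanted.issubset(result):
            oldest_found = min(ts for ts, _ in result.values())
            # This sstable, and every later one in the sorted order, holds
            # only data older than everything already found: stop here.
            if sstable.max_timestamp < oldest_found:
                break
        checked += 1
        for name in wanted:
            column = sstable.columns.get(name)
            if column is not None and (
                name not in result or column[0] > result[name][0]
            ):
                result[name] = column
    return result, checked


if __name__ == "__main__":
    old = SSTable({"a": (5, "a1"), "b": (4, "b1")})
    new = SSTable({"a": (20, "a2"), "b": (19, "b2")})
    mid = SSTable({"a": (12, "a3")})
    print(read_named_columns([old, new, mid], ["a", "b"]))
```

In the usage example, every requested column is found in the sstable with the highest maximum timestamp, so the other two are skipped. Because the cutoff compares client timestamps rather than sstable generation, an sstable that merely looks newer (for example after compaction "refreshed" old data into it) does not hide a more recent column stored elsewhere.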