Date: Thu, 14 Jul 2011 19:26:00 +0000 (UTC)
From: "Yang Yang (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Message-ID: <1211557427.14784.1310671560821.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <591486462.8092.1309507588533.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Updated] (CASSANDRA-2843) better performance on long row read

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Yang updated CASSANDRA-2843:
---------------------------------
    Comment: was deleted

(was: rebased against HEAD of trunk (4629648899e637e8e03938935f126689cce5ad48))

> better performance on long row read
> -----------------------------------
>
>                 Key: CASSANDRA-2843
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Yang Yang
>         Attachments: 2843.patch, 2843_c.patch, fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch
>
>
> currently if a row contains > 1000 columns, reads become considerably slow: my test of a row with 3000 columns (standard, regular), each with 8 bytes in name and 40 bytes in value, takes about 16ms.
> this is all running in memory; no disk read is involved.
> through debugging we can find that most of this time is spent in
> [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
> [Wall Time]  org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, ColumnFamily)
> [Wall Time]  org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, ColumnFamily)
> [Wall Time]  org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, int, ColumnFamily)
> [Wall Time]  org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily, Iterator, int)
> [Wall Time]  org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer, Iterator, int)
> [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
> ColumnFamily.addColumn() is slow because it inserts into an internal ConcurrentSkipListMap that maps column names to values.
> this structure is slow for two reasons: it needs to do synchronization, and it needs to maintain a more complex map structure.
> but if we look at the whole read path, thrift already defines the read output to be List<ColumnOrSuperColumn>, so it does not make sense to use a luxury map data structure in the interim and finally convert it to a list. on the synchronization side, since the returned CF is never going to be shared with or modified by other threads, we know the access is always single-threaded, so no synchronization is needed.
> but these 2 features are indeed needed for ColumnFamily in other cases, particularly write. so we can provide a different ColumnFamily to CFS.getTopLevelColumns(), so that getTopLevelColumns() no longer always creates the standard ColumnFamily, but takes a provided returnCF, which is much cheaper.
> the provided patch is for demonstration only for now; I will work on it further once we agree on the general direction.
> CFS, ColumnFamily, and Table are changed; a new FastColumnFamily is provided. the main work is to let the FastColumnFamily use an array for internal storage. at first I used binary search to insert new columns in addColumn(), but later I found that even this is not necessary, since all calling scenarios of ColumnFamily.addColumn() share the invariant that the inserted columns come in sorted order (I still need to sort out descending versus ascending order, but ascending works). so the current logic is simply to compare the new column against the last column in the array: if the names are not equal, append; if they are equal, reconcile.
> slight temporary hacks are made on getTopLevelColumns(), so we have 2 flavors of the method, one accepting a returnCF. but we could definitely think about a better way to provide this returnCF.
> this patch compiles fine; no tests are provided yet. but I tested it in my application, and the performance improvement is dramatic: about a 50% reduction in read time in the 3000-column case.
> thanks
> Yang
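
The append-or-reconcile logic described above can be illustrated with a minimal Java sketch. It is not the code from the attached patches: the class name SortedAppendColumns, the simplified Column type, and the "higher timestamp wins" reconcile rule are all assumptions made for illustration, and only ascending order is handled, as in the description.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for the FastColumnFamily idea; not a Cassandra class. */
final class SortedAppendColumns {

    /** Simplified column (assumed shape): name, value, write timestamp. */
    static final class Column {
        final String name;
        final String value;
        final long timestamp;

        Column(String name, String value, long timestamp) {
            this.name = name;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // Plain array-backed list: no locking and no skip-list nodes.
    private final List<Column> columns = new ArrayList<>();

    /**
     * Relies on the invariant that callers add columns in ascending
     * name order, so only the last stored column is ever compared.
     */
    void addColumn(Column incoming) {
        if (columns.isEmpty()) {
            columns.add(incoming);
            return;
        }
        int lastIndex = columns.size() - 1;
        Column last = columns.get(lastIndex);
        int cmp = incoming.name.compareTo(last.name);
        if (cmp > 0) {
            columns.add(incoming); // new name: append
        } else if (cmp == 0) {
            columns.set(lastIndex, reconcile(last, incoming)); // same name: reconcile
        } else {
            // Fail loudly if the ordering invariant is broken.
            throw new IllegalArgumentException("out-of-order column: " + incoming.name);
        }
    }

    /** Assumed reconcile rule for this sketch: the higher timestamp wins. */
    private static Column reconcile(Column a, Column b) {
        return b.timestamp > a.timestamp ? b : a;
    }

    /** Already a list, so no map-to-list conversion is needed on the read path. */
    List<Column> asList() {
        return Collections.unmodifiableList(columns);
    }

    public static void main(String[] args) {
        SortedAppendColumns cf = new SortedAppendColumns();
        cf.addColumn(new Column("a", "old", 1L));
        cf.addColumn(new Column("a", "new", 2L));
        cf.addColumn(new Column("b", "x", 1L));
        for (Column c : cf.asList()) {
            System.out.println(c.name + "=" + c.value);
        }
    }
}
```

Each addColumn() call here costs one comparison and at most one append, versus a synchronized O(log n) insert into a skip list. The sketch throws on out-of-order input rather than falling back to binary search; whether a real implementation should fall back instead is a separate design question.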