Return-Path:
X-Original-To: apmail-cassandra-commits-archive@www.apache.org
Delivered-To: apmail-cassandra-commits-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 8F645185C4 for ; Thu, 24 Sep 2015 00:17:04 +0000 (UTC)
Received: (qmail 2738 invoked by uid 500); 24 Sep 2015 00:17:04 -0000
Delivered-To: apmail-cassandra-commits-archive@cassandra.apache.org
Received: (qmail 2707 invoked by uid 500); 24 Sep 2015 00:17:04 -0000
Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@cassandra.apache.org
Delivered-To: mailing list commits@cassandra.apache.org
Received: (qmail 2693 invoked by uid 99); 24 Sep 2015 00:17:04 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 24 Sep 2015 00:17:04 +0000
Date: Thu, 24 Sep 2015 00:17:04 +0000 (UTC)
From: "Sylvain Lebresne (JIRA)"
To: commits@cassandra.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Commented] (CASSANDRA-10378) Make skipping more efficient
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

[ https://issues.apache.org/jira/browse/CASSANDRA-10378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14905561#comment-14905561 ]

Sylvain Lebresne commented on CASSANDRA-10378:
----------------------------------------------

I pushed a quick patch implementing the idea above [here|https://github.com/pcmanus/cassandra/commits/10378]. The result on point queries can be seen on [this graph|http://cstar.datastax.com/graph?stats=399e6124-616e-11e5-b8f9-42010af0688f&metric=op_rate&operation=3_user&smoothing=1&show_aggregates=true&xmin=0&xmax=152.68&ymin=0&ymax=110790.9]: basically, we get much closer to 2.2 on those queries.
> Make skipping more efficient
> ----------------------------
>
> Key: CASSANDRA-10378
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10378
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Benedict
> Assignee: Sylvain Lebresne
> Fix For: 3.0.0 rc2
>
>
> Following on from the impact of CASSANDRA-10322, we can improve the efficiency of our calls to skipping methods. CASSANDRA-10326 is showing our performance to be in roughly the same ballpark except for seeks into the middle of a large partition, which suggests (possibly) that the higher density of data we're storing may simply be resulting in a more significant CPU burden, as we have more data to skip over (and since CASSANDRA-10322 improves performance here really dramatically, further improvements are likely to be of similar benefit).
> I propose doing our best to flatten the skipping of macro data items into as few skip invocations as necessary. One way of doing this would be to introduce a special {{skipUnsignedVInts(int)}} method that can efficiently skip a number of unsigned vints. Almost the entire body of a cell and row consists of vints now, and each data component has its own special {{skipX}} method that invokes {{readUnsignedVint}}. This would permit more efficient despatch.
> We could also potentially avoid the construction of a new {{Columns}} instance for each row skip, since all we need is an iterator over the columns, and share the temporary space used for storing them, which should further reduce the GC burden for skipping many rows.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
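
For illustration, the {{skipUnsignedVInts(int)}} idea from the issue description could look roughly like the sketch below. This is not the patch linked in the comment. It assumes a vint encoding in which the number of leading 1 bits in the first byte gives the number of extra bytes that follow, and it operates on a plain {{ByteBuffer}} rather than Cassandra's own input classes. The class name {{VIntSkipSketch}} and the buffer-based signature are hypothetical.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: skips several unsigned vints in one call instead of
// decoding each one through a separate read/skip method.
public final class VIntSkipSketch {

    // ASSUMPTION: the count of leading 1 bits in the first byte is the
    // number of extra bytes in the vint (0 extra bytes if the top bit is 0).
    public static int extraBytes(byte firstByte) {
        if (firstByte >= 0) {
            return 0; // top bit clear: single-byte vint
        }
        // ~firstByte is widened to int, so subtract the 24 sign-extension bits.
        return Integer.numberOfLeadingZeros(~firstByte) - 24;
    }

    // Advances the buffer position past `count` unsigned vints. Only the
    // first byte of each vint is inspected; the payload bytes are never read.
    public static void skipUnsignedVInts(ByteBuffer in, int count) {
        int pos = in.position();
        for (int i = 0; i < count; i++) {
            pos += 1 + extraBytes(in.get(pos)); // absolute get, no decoding
        }
        in.position(pos); // single position update for the whole run
    }

    public static void main(String[] args) {
        // One 1-byte vint, one 2-byte vint (first byte 0x80), then a 1-byte vint.
        ByteBuffer buf = ByteBuffer.wrap(new byte[] {0x05, (byte) 0x80, (byte) 0x90, 0x07});
        skipUnsignedVInts(buf, 2);
        System.out.println(buf.position()); // prints 3
    }
}
```

The point of the sketch is that a run of adjacent vints is skipped with one method call and one position update, which is the kind of flattening the description proposes. Whether this matches the actual encoding and stream types in the code base should be checked against the source.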