Date: Tue, 17 Apr 2012 01:48:16 +0000 (UTC)
From: "Stu Hood (Commented) (JIRA)"
To: commits@cassandra.apache.org
Message-ID: <2024022485.31351.1334627296778.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <747530390.15621.1299959099579.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (CASSANDRA-2319) Promote row index


    [ https://issues.apache.org/jira/browse/CASSANDRA-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255246#comment-13255246 ]

Stu Hood commented on CASSANDRA-2319:
-------------------------------------

bq. I.e. the raison d'être of both index_interval and column_index_size_in_kb is not because we have the notion of rows in the on-disk format.

If I'm understanding what Ellis is suggesting, it is that the entire sstable index could become sparse: that would mean that column_index_size_in_kb could be renamed to index_size_in_kb. index_interval would not change.

> Promote row index
> -----------------
>
>                 Key: CASSANDRA-2319
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2319
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Sylvain Lebresne
>              Labels: index, timeseries
>             Fix For: 1.2
>
>         Attachments: 2319-v1.tgz, 2319-v2.tgz, promotion.pdf, version-f.txt, version-g-lzf.txt, version-g.txt
>
>
> The row index contains entries for configurably sized blocks of a wide row. For a row of appreciable size, the row index ends up directing the third seek (1. index, 2. row index, 3. content) to nearby the first column of a scan.
> Since the row index is always used for wide rows, and since it contains information that tells us whether or not the 3rd seek is necessary (the column range or name we are trying to slice may not exist in a given sstable), promoting the row index into the sstable index would allow us to drop the maximum number of seeks for wide rows back to 2, and, more importantly, would allow sstables to be eliminated using only the index.
> An example usecase that benefits greatly from this change is time series data in wide rows, where data is appended to the beginning or end of the row. Our existing compaction strategy gets lucky and clusters the oldest data in the oldest sstables: for queries to recently appended data, we would be able to eliminate wide rows using only the sstable index, rather than needing to seek into the data file to determine that it isn't interesting. For narrow rows, this change would have no effect, as they will not reach the threshold for indexing anyway.
> A first cut design for this change would look very similar to the file format design proposed on #674: http://wiki.apache.org/cassandra/FileFormatDesignDoc: row keys clustered, column names clustered, and offsets clustered and delta encoded.
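
For readers following the thread, the sstable-elimination idea in the issue description can be illustrated with a rough sketch. This is not Cassandra's on-disk format or API: `ColumnBlock`, `PromotedIndexEntry`, `first_block_for_slice` and the `ts` helper are hypothetical names made up for illustration. It assumes column names compare as raw bytes and that the blocks of a row are sorted and non-overlapping. With the per-block column ranges held in the sstable index entry, a slice can be rejected without touching the data file.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ColumnBlock:
    """One row-index entry: the column-name range covered by a block (hypothetical)."""
    first_name: bytes
    last_name: bytes
    offset: int  # assumed position of the block in the data file


@dataclass
class PromotedIndexEntry:
    """Sstable index entry that carries the row index with it (hypothetical)."""
    row_key: bytes
    blocks: List[ColumnBlock]  # assumed sorted by name and non-overlapping

    def first_block_for_slice(self, start: bytes, finish: bytes) -> Optional[ColumnBlock]:
        """Return the first block that may hold columns in [start, finish].

        None means this sstable cannot contain the slice, so no seek into
        the data file is needed for it.
        """
        last_names = [b.last_name for b in self.blocks]
        # First block whose last column name is >= start.
        i = bisect_left(last_names, start)
        if i == len(self.blocks):
            return None  # slice begins after every column in this row
        if self.blocks[i].first_name > finish:
            return None  # slice falls in a gap before this block
        return self.blocks[i]


def ts(n: int) -> bytes:
    """Encode a timestamp as a big-endian column name so byte order matches numeric order."""
    return n.to_bytes(8, "big")


# An older sstable holding only old time-series columns for the row.
old_sstable = PromotedIndexEntry(
    b"sensor-1",
    [ColumnBlock(ts(1000), ts(1499), 0), ColumnBlock(ts(1500), ts(1999), 65536)],
)

# A query for recently appended data is answered from the index alone.
recent = old_sstable.first_block_for_slice(ts(5000), ts(5100))
print("skip sstable" if recent is None else f"seek to {recent.offset}")
```

In the time-series case from the description, the older sstable is rejected for the recent slice using only its index entry, while a slice that overlaps a block yields the offset to seek to directly.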