Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@cassandra.apache.org
Date: Tue, 17 Apr 2012 01:48:16 +0000 (UTC)
From: "Stu Hood (Commented) (JIRA)" <jira@apache.org>
To: commits@cassandra.apache.org
Message-ID: 
 <2024022485.31351.1334627296778.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: 
 <747530390.15621.1299959099579.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (CASSANDRA-2319) Promote row index
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable


    [ https://issues.apache.org/jira/browse/CASSANDRA-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13255246#comment-13255246 ]

Stu Hood commented on CASSANDRA-2319:
-------------------------------------

bq. I.e. the raison d'être of both index_interval and column_index_size_in_kb is not because we have the notion of rows in the on-disk format.
If I'm understanding what Ellis is suggesting, it is that the entire sstable index could become sparse: that would mean that column_index_size_in_kb could be renamed to index_size_in_kb. index_interval would not change.

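To make the sparse-index idea concrete, here is a minimal sketch of a size-driven sparse index: one entry is emitted each time roughly index_size_in_kb of data has accumulated since the previous entry. The function name, the (name, size) entry shape, and the returned (name, offset) pairs are assumptions made for illustration; this is not Cassandra's actual index writer.

```python
def build_sparse_index(entries, index_size_in_kb):
    """Return (first_name, offset) for each block of about index_size_in_kb.

    entries: iterable of (name, serialized_size_in_bytes), in on-disk order.
    Hypothetical helper for illustration only.
    """
    threshold = index_size_in_kb * 1024
    index = []
    offset = 0          # byte offset of the current entry in the data file
    block_start = None  # offset at which the current block began
    for name, size in entries:
        # Start a new block on the first entry, or once the current block
        # has reached the configured size.
        if block_start is None or offset - block_start >= threshold:
            index.append((name, offset))
            block_start = offset
        offset += size
    return index
```

With four 1 KB entries and index_size_in_kb=2, only the first and third entries get an index entry, so the index grows with data size rather than with the number of rows or columns.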
> Promote row index
> -----------------
>
>                 Key: CASSANDRA-2319
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2319
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Sylvain Lebresne
>              Labels: index, timeseries
>             Fix For: 1.2
>
>         Attachments: 2319-v1.tgz, 2319-v2.tgz, promotion.pdf, version-f.txt, version-g-lzf.txt, version-g.txt
>
>
> The row index contains entries for configurably sized blocks of a wide row. For a row of appreciable size, the row index ends up directing the third seek (1. index, 2. row index, 3. content) to a position near the first column of a scan.
> Since the row index is always used for wide rows, and since it contains information that tells us whether or not the third seek is necessary (the column range or name we are trying to slice may not exist in a given sstable), promoting the row index into the sstable index would allow us to drop the maximum number of seeks for wide rows back to 2 and, more importantly, would allow sstables to be eliminated using only the index.
> An example use case that benefits greatly from this change is time series data in wide rows, where data is appended to the beginning or end of the row. Our existing compaction strategy gets lucky and clusters the oldest data in the oldest sstables: for queries to recently appended data, we would be able to eliminate wide rows using only the sstable index, rather than needing to seek into the data file to determine that they aren't interesting. For narrow rows, this change would have no effect, as they will not reach the threshold for indexing anyway.
> A first-cut design for this change would look very similar to the file format design proposed on #674 (http://wiki.apache.org/cassandra/FileFormatDesignDoc): row keys clustered, column names clustered, and offsets clustered and delta encoded.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira