cassandra-commits mailing list archives

From "Stu Hood (Commented) (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-2319) Promote row index
Date Sun, 15 Apr 2012 17:31:19 GMT


Stu Hood commented on CASSANDRA-2319:

bq. What if we dropped the "main" index and just kept the "sample" index of every 1/128 columns?
This is what the original 674-v1 patch did, and it worked out fairly well. In the long run,
it would be a win in terms of config and code complexity, but it will probably be about the
same performance-wise.

From 674-v1 (where ColumnKey is a full path to a column: (key, name1, name2, name3, ...)):
+/**
+ * An entry in the SSTable index file which points to a block in the data file.
+ * Each entry contains the full path of the column at the beginning of the block
+ * in the SSTable, and two file positions: the offset of the serialized version
+ * of this object in the index, and the offset of the block in the data file.
+ *
+ * To find a key in the data file, we first look at IndexEntries in memory, and find
+ * the last entry less than the key we want. We then seek to the position of that
+ * entry in the index file, read forward until we find the last entry less than the
+ * key, and then seek to the position of that entry's block in the data file.
+ */
+public class IndexEntry extends ColumnKey
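
The lookup described in that javadoc can be sketched roughly as below. This is only an illustration, not code from the 674-v1 patch: the class and method names are hypothetical, the index file is modelled as a sorted in-memory array, and "last entry less than the key" is read here as "less than or equal to", so that a key that starts a block resolves to that block.

```java
/** Hypothetical sketch of a sampled (two-level) index lookup; not from the 674-v1 patch. */
final class SampledIndexLookup {

    /** Positions in the full index of every interval-th entry (the in-memory sample). */
    static int[] buildSample(int indexSize, int interval) {
        int n = (indexSize + interval - 1) / interval;
        int[] sample = new int[n];
        for (int i = 0; i < n; i++) {
            sample[i] = i * interval;
        }
        return sample;
    }

    /**
     * Returns the data-file offset of the block that may contain key,
     * or -1 if key sorts before the first index entry.
     * indexKeys stands in for the on-disk index; a real sample would keep
     * its own copies of the sampled keys in memory.
     */
    static long findBlockOffset(String[] indexKeys, long[] blockOffsets,
                                int[] sample, String key) {
        // Step 1: binary search the sample for the last sampled entry <= key.
        int lo = 0, hi = sample.length - 1, start = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (indexKeys[sample[mid]].compareTo(key) <= 0) {
                start = sample[mid];
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        if (start < 0) {
            return -1;
        }
        // Step 2: "seek" to that index position and read forward to the
        // last entry <= key (at most interval - 1 entries further on).
        int pos = start;
        while (pos + 1 < indexKeys.length && indexKeys[pos + 1].compareTo(key) <= 0) {
            pos++;
        }
        // Step 3: the entry found gives the block position in the data file.
        return blockOffsets[pos];
    }

    public static void main(String[] args) {
        String[] keys = {"a", "c", "e", "g", "i", "k"};
        long[] offsets = {0L, 100L, 200L, 300L, 400L, 500L};
        int[] sample = buildSample(keys.length, 2);
        System.out.println(findBlockOffset(keys, offsets, sample, "h"));
    }
}
```

With a sample interval of 128, the forward read in step 2 touches at most 127 index entries after the seek, which is the trade-off behind keeping only the sample in memory.
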
> Promote row index
> -----------------
>                 Key: CASSANDRA-2319
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Sylvain Lebresne
>              Labels: index, timeseries
>             Fix For: 1.2
>         Attachments: 2319-v1.tgz, 2319-v2.tgz, promotion.pdf, version-f.txt, version-g-lzf.txt,
> The row index contains entries for configurably sized blocks of a wide row. For a row
> of appreciable size, the row index ends up directing the third seek (1. index, 2. row index,
> 3. content) to a position near the first column of a scan.
> Since the row index is always used for wide rows, and since it contains information that
> tells us whether or not the third seek is necessary (the column range or name we are trying
> to slice may not exist in a given sstable), promoting the row index into the sstable index
> would allow us to drop the maximum number of seeks for wide rows back to 2, and, more importantly,
> would allow sstables to be eliminated using only the index.
> An example use case that benefits greatly from this change is time series data in wide
> rows, where data is appended to the beginning or end of the row. Our existing compaction strategy
> gets lucky and clusters the oldest data in the oldest sstables: for queries to recently appended
> data, we would be able to eliminate wide rows using only the sstable index, rather than needing
> to seek into the data file to determine that they aren't interesting. For narrow rows, this change
> would have no effect, as they will not reach the threshold for indexing anyway.
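
To illustrate the elimination idea for the time series case, here is a minimal sketch. It assumes that the promoted row index lets us read, per sstable, the first and last column name stored for a row; the RowRange type, its fields, and the use of long timestamps as column names are all hypothetical and not taken from any patch on this ticket.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: skip sstables whose column range cannot match a slice. */
final class IndexOnlyElimination {

    /** Assumed per-sstable summary for one wide row, as read from the sstable index. */
    static final class RowRange {
        final String sstable;
        final long firstColumn;
        final long lastColumn;

        RowRange(String sstable, long firstColumn, long lastColumn) {
            this.sstable = sstable;
            this.firstColumn = firstColumn;
            this.lastColumn = lastColumn;
        }
    }

    /** True if the inclusive slice [sliceStart, sliceEnd] intersects the row's column range. */
    static boolean overlaps(RowRange r, long sliceStart, long sliceEnd) {
        return sliceStart <= r.lastColumn && sliceEnd >= r.firstColumn;
    }

    /** Names of the sstables that still need a seek into the data file. */
    static List<String> sstablesToRead(List<RowRange> ranges, long sliceStart, long sliceEnd) {
        List<String> result = new ArrayList<String>();
        for (RowRange r : ranges) {
            if (overlaps(r, sliceStart, sliceEnd)) {
                result.add(r.sstable);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<RowRange> ranges = new ArrayList<RowRange>();
        ranges.add(new RowRange("sstable-1", 0L, 99L));     // oldest data
        ranges.add(new RowRange("sstable-2", 100L, 199L));
        ranges.add(new RowRange("sstable-3", 200L, 299L));  // most recent data
        // A query for recently appended columns only needs the newest sstable.
        System.out.println(sstablesToRead(ranges, 250L, 260L));
    }
}
```
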
> A first-cut design for this change would look very similar to the file format design
> proposed on #674: row keys clustered, column names clustered, and offsets clustered
> and delta encoded.
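
As a rough illustration of the "offsets clustered and delta encoded" part: when block offsets are stored together and in increasing order, each one can be written as the difference from its predecessor, which keeps the stored values small. The sketch below assumes plain long arrays and hypothetical names; it does not reproduce the actual layout proposed on #674.

```java
/** Hypothetical sketch of delta encoding for a clustered run of increasing file offsets. */
final class OffsetDeltaCodec {

    /** Replaces each offset with its difference from the previous one (the first is kept as-is). */
    static long[] encode(long[] offsets) {
        long[] deltas = new long[offsets.length];
        long previous = 0L;
        for (int i = 0; i < offsets.length; i++) {
            deltas[i] = offsets[i] - previous;
            previous = offsets[i];
        }
        return deltas;
    }

    /** Rebuilds the absolute offsets with a running sum over the deltas. */
    static long[] decode(long[] deltas) {
        long[] offsets = new long[deltas.length];
        long running = 0L;
        for (int i = 0; i < deltas.length; i++) {
            running += deltas[i];
            offsets[i] = running;
        }
        return offsets;
    }

    public static void main(String[] args) {
        long[] offsets = {1000L, 1064L, 1130L, 1200L};
        long[] deltas = encode(offsets);
        // The deltas are 1000, 64, 66, 70: small values after the first entry.
        System.out.println(java.util.Arrays.toString(deltas));
        System.out.println(java.util.Arrays.toString(decode(deltas)));
    }
}
```

The small deltas only pay off if they are then written with a variable-length integer encoding, which is outside the scope of this sketch.
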

This message is automatically generated by JIRA.