cassandra-commits mailing list archives

From "Stu Hood (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CASSANDRA-2319) Promote row index
Date Sun, 20 Mar 2011 02:26:29 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-2319?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13008867#comment-13008867
] 

Stu Hood commented on CASSANDRA-2319:
-------------------------------------

A solution for the key cache is to allow for fuzzy cache entry matches via a sorted cache
structure (if it existed, something like ConcurrentLinkedSkipListMap would be ideal).

The key cache as it exists gives us exact matches for keys, but when the resolution of the
cache increases to columns, the chance of hitting the same column twice (while reasonably
high) is not high enough. Ideally we'd be able to fuzzily hit a nearby/next-highest cache
entry that represents a range or block of columns.

An example: the cache contains an entry for columns in the range {("user1","entry0100"),
("user1","entry0200")}. If a query comes in for a slice starting from ("user1","entry0150"),
we would perform a fuzzy/floor lookup in the cache and hit our entry. A lookup that doesn't
fall into the range covered by a cache entry would be a miss: it would result in reading
from the index, and the smallest range bounding the lookup would be added to the cache.
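
The floor lookup in the example above could be sketched roughly as follows. This is only an
illustration, not code from Cassandra: RangeCache, Key, and Block are hypothetical names, the
offset field is an assumed payload, and java.util.concurrent.ConcurrentSkipListMap stands in
for the sorted structure (it provides ordering but no eviction, so it is not the
"ConcurrentLinkedSkipListMap" wished for above).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

/** Hypothetical sketch of a sorted cache whose entries each cover a block of columns. */
final class RangeCache {
    /** Composite (row key, column name) position, ordered by row and then by column. */
    static final class Key implements Comparable<Key> {
        final String row;
        final String column;

        Key(String row, String column) {
            this.row = row;
            this.column = column;
        }

        @Override
        public int compareTo(Key other) {
            int c = row.compareTo(other.row);
            return c != 0 ? c : column.compareTo(other.column);
        }
    }

    /** A cached block; the map key holds its first column, this holds the rest. */
    static final class Block {
        final String lastColumn; // inclusive upper bound of the block
        final long offset;       // assumed payload: where the block starts in the data file

        Block(String lastColumn, long offset) {
            this.lastColumn = lastColumn;
            this.offset = offset;
        }
    }

    // Entries are keyed by the first (row, column) position they cover.
    private final ConcurrentSkipListMap<Key, Block> blocks = new ConcurrentSkipListMap<>();

    /** Caches a block covering columns firstColumn..lastColumn (inclusive) of one row. */
    void put(String row, String firstColumn, String lastColumn, long offset) {
        blocks.put(new Key(row, firstColumn), new Block(lastColumn, offset));
    }

    /** Returns the block covering (row, column), or null on a cache miss. */
    Block lookup(String row, String column) {
        // Floor lookup: the entry with the greatest start position <= the requested one.
        Map.Entry<Key, Block> floor = blocks.floorEntry(new Key(row, column));
        if (floor == null) {
            return null;
        }
        boolean sameRow = floor.getKey().row.equals(row);
        boolean withinBlock = column.compareTo(floor.getValue().lastColumn) <= 0;
        return (sameRow && withinBlock) ? floor.getValue() : null;
    }
}
```

With an entry cached for ("user1","entry0100") through ("user1","entry0200"), a lookup for
("user1","entry0150") hits that entry, while ("user1","entry0250") or any column of "user2"
misses and would fall back to the index.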

> Promote row index
> -----------------
>
>                 Key: CASSANDRA-2319
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2319
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>              Labels: index, timeseries
>             Fix For: 0.8
>
>
> The row index contains entries for configurably sized blocks of a wide row. For a row
of appreciable size, the row index ends up directing the third seek (1. index, 2. row index,
3. content) to a position near the first column of a scan.
> Since the row index is always used for wide rows, and since it contains information that
tells us whether or not the 3rd seek is necessary (the column range or name we are trying
to slice may not exist in a given sstable), promoting the row index into the sstable index
would allow us to drop the maximum number of seeks for wide rows back to 2, and, more importantly,
would allow sstables to be eliminated using only the index.
> An example use case that benefits greatly from this change is time series data in wide
rows, where data is appended to the beginning or end of the row. Our existing compaction strategy
gets lucky and clusters the oldest data in the oldest sstables: for queries to recently appended
data, we would be able to eliminate wide rows using only the sstable index, rather than needing
to seek into the data file to determine that it isn't interesting. For narrow rows, this change
would have no effect, as they will not reach the threshold for indexing anyway.
> A first cut design for this change would look very similar to the file format design
proposed on #674: http://wiki.apache.org/cassandra/FileFormatDesignDoc: row keys clustered,
column names clustered, and offsets clustered and delta encoded.
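
As a rough illustration of what "offsets clustered and delta encoded" could mean in the
description above: the block offsets for a row are stored together, each as the difference
from the previous one, so the stored values stay small. This is a minimal sketch under that
assumption, not the format from the linked design doc; OffsetDeltas is a hypothetical name
and the input is assumed to be an ascending array of byte offsets.

```java
/** Hypothetical sketch of delta encoding for a sorted run of block offsets. */
final class OffsetDeltas {
    /** Replaces each offset with its difference from the previous one (the first is kept). */
    static long[] encode(long[] offsets) {
        long[] deltas = new long[offsets.length];
        long previous = 0;
        for (int i = 0; i < offsets.length; i++) {
            deltas[i] = offsets[i] - previous;
            previous = offsets[i];
        }
        return deltas;
    }

    /** Rebuilds the absolute offsets by accumulating the deltas. */
    static long[] decode(long[] deltas) {
        long[] offsets = new long[deltas.length];
        long running = 0;
        for (int i = 0; i < deltas.length; i++) {
            running += deltas[i];
            offsets[i] = running;
        }
        return offsets;
    }
}
```

Small deltas only save space when paired with a variable-length integer encoding, which
this sketch leaves out.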

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
