cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-5021) Full partition/cell index integration
Date Mon, 10 Dec 2012 16:25:21 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-5021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528041#comment-13528041 ]

Sylvain Lebresne commented on CASSANDRA-5021:
---------------------------------------------

For what it's worth, I advise that we start with CASSANDRA-4478 before getting to this. Whatever
the exact implementation of this ticket is, it will break the current assumption that the
in-memory index summary is composed of fixed intervals of keys. So the difficulties (related
to key estimates) discussed in CASSANDRA-4478 will be the same here, and doing CASSANDRA-4478
first won't be a waste of time imo.

That aside, I think this ticket involves:
* Merging RowIndexEntry and IndexHelper. I see two options. Either the resulting entry looks
like:
   {{(start key, start cell name) -> (end key, end cell name)}}
 which has the benefit that for skinny rows, we can have more than one row per index entry.
But I suspect this would complicate the implementation quite a bit, because it would mean
the index wouldn't have all row keys, and I think this would require more changes (for row
iteration, scrubs, ...).  Otherwise, we can keep the rule that an entry can't hold more than
one key and have an index entry be:
   {{key + (start cell name -> end cell name)}}
* Merging ColumnIndex.Builder and SSTableWriter.IndexWriter to write the new merged index entries.
* SSTableReader.getPosition() will need to be changed to take the start cell name as argument,
not just the key, and return a new style index entry.
* The consumers of SSTableReader.getPosition() will need to be updated accordingly.
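
To make the second option concrete, here is a minimal sketch of a merged index where each entry is {{key + (start cell name -> end cell name)}} and a lookup takes both the key and a start cell name. Everything in it is an assumption made for illustration: the class and method names are hypothetical, keys and cell names are plain strings compared lexically, and the index is a sorted in-memory list. It is not how RowIndexEntry or SSTableReader.getPosition() are actually written.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Cassandra code: one entry per (key, cell block).
class MergedIndexSketch {
    private static final class Entry {
        final String key;        // partition key (assumed to be a plain string here)
        final String startCell;  // first cell name covered by the block
        final String endCell;    // last cell name covered by the block
        final long position;     // offset of the block in the data file

        Entry(String key, String startCell, String endCell, long position) {
            this.key = key;
            this.startCell = startCell;
            this.endCell = endCell;
            this.position = position;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    // Entries must be added in (key, startCell) order, as a writer would emit them.
    void add(String key, String startCell, String endCell, long position) {
        entries.add(new Entry(key, startCell, endCell, position));
    }

    // Single binary search on (key, cell): finds the last entry whose
    // (key, startCell) is <= (key, cell). Returns -1 if the key is absent.
    long getPosition(String key, String cell) {
        int lo = 0, hi = entries.size() - 1, found = -1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            Entry e = entries.get(mid);
            int c = e.key.compareTo(key);
            if (c == 0) c = e.startCell.compareTo(cell);
            if (c <= 0) { found = mid; lo = mid + 1; } else { hi = mid - 1; }
        }
        if (found >= 0 && entries.get(found).key.equals(key)) {
            return entries.get(found).position;
        }
        // The cell sorts before the first block of the key: start at that first block.
        int next = found + 1;
        if (next < entries.size() && entries.get(next).key.equals(key)) {
            return entries.get(next).position;
        }
        return -1;
    }

    public static void main(String[] args) {
        MergedIndexSketch index = new MergedIndexSketch();
        index.add("k1", "a", "f", 0L);
        index.add("k1", "g", "m", 100L);
        index.add("k2", "a", "z", 200L);
        System.out.println(index.getPosition("k1", "h"));
        System.out.println(index.getPosition("k3", "a"));
    }
}
```

With the first option, the comparison would have to handle entries spanning several keys, which is where the extra complexity for row iteration and scrub would come from.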

I do have to note that it's unclear to me what becomes of the key cache if we do this. Indeed,
an index entry position will be defined by a key and the start cell name of the index block.
But that has almost no cacheability: there is little chance that the start of a request is
exactly the start of an index block. I don't think that's a detail, and I'd suggest considering
this before jumping into implementing this ticket.
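
The cacheability concern can be illustrated with a small sketch. It assumes a hypothetical exact-match cache keyed by (key, index block start cell). The class, the string-based cache key and the values are all made up for illustration and do not reflect the real key cache implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Cassandra code: an exact-match cache keyed by
// partition key plus the start cell name of an index block.
class KeyCacheSketch {
    private final Map<String, Long> cache = new HashMap<>();

    // Composite cache key; the NUL separator is an arbitrary choice for this sketch.
    private static String cacheKey(String key, String cell) {
        return key + '\u0000' + cell;
    }

    void put(String key, String blockStart, long position) {
        cache.put(cacheKey(key, blockStart), position);
    }

    // Returns the cached position, or null on a miss.
    Long get(String key, String requestedStart) {
        return cache.get(cacheKey(key, requestedStart));
    }

    public static void main(String[] args) {
        KeyCacheSketch keyCache = new KeyCacheSketch();
        keyCache.put("k1", "g", 100L);
        // Hit only when the request start equals the block start exactly.
        System.out.println(keyCache.get("k1", "g"));
        // Miss, even though "h" would fall inside the block starting at "g".
        System.out.println(keyCache.get("k1", "h"));
    }
}
```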

                
> Full partition/cell index integration
> -------------------------------------
>
>                 Key: CASSANDRA-5021
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5021
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jonathan Ellis
>             Fix For: 2.0
>
>
> CASSANDRA-2319 pulled the row's (partition's) index of cells into the -Index component,
> but it's still treated separately.  That is: on a read, we first bsearch the partition key
> samples, then we read the Index component to find the exact partition key, then we deserialize
> the cell samples and bsearch those.
> "deserialize the cell samples" grows linearly with partition size and can seriously impact
> query time as it grows past millions of cells to 10s and 100s of millions.
> If we merged the cell index with the partition's, we could do a single bsearch/read step
> that would scale with log(N).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
