cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (CASSANDRA-6830) Changes to SSTable Index file
Date Mon, 30 Jun 2014 16:02:24 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-6830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benedict resolved CASSANDRA-6830.
---------------------------------

    Resolution: Duplicate

> Changes to SSTable Index file
> -----------------------------
>
>                 Key: CASSANDRA-6830
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6830
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Benedict
>            Priority: Minor
>              Labels: performance
>             Fix For: 3.0
>
>
> Building on the ideas introduced in CASSANDRA-6709, and _possibly_ obsoleting them before
> they are introduced:
> Once we have CASSANDRA-6810, we could make the following change to the (current) index
> file: instead of producing a sorted decoratedkey file, we could generate a near-optimal
> hash table of murmurhash-of-key -> position in the data/(6810-)index file. This index might
> permit multiple locations for each hash, in which case all locations would need to be checked,
> but a hash table could be built that minimises this (whilst also keeping the on-disk
> representation as compact as possible).
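
To illustrate the hash index described above, here is a minimal Python sketch. It is not taken from the ticket or from Cassandra: `key_hash` uses BLAKE2b as a stand-in for murmurhash, and `HashIndex`, `lookup`, and the in-memory `data` dict are hypothetical names chosen for the example. Each bucket models one on-disk page, and a lookup checks every candidate position whose hash matches.

```python
import hashlib


def key_hash(key: bytes) -> int:
    # Assumption: stand-in for murmurhash; any well-mixed 64-bit hash suits the sketch.
    return int.from_bytes(hashlib.blake2b(key, digest_size=8).digest(), "big")


class HashIndex:
    """Maps hash-of-key -> candidate positions. One bucket models one on-disk page."""

    def __init__(self, entries, num_pages):
        # entries: iterable of (key, position) pairs
        self.num_pages = num_pages
        self.pages = {}
        for key, position in entries:
            h = key_hash(key)
            self.pages.setdefault(h % num_pages, []).append((h, position))

    def candidates(self, key):
        # Only a single page (bucket) is inspected per lookup.
        h = key_hash(key)
        page = self.pages.get(h % self.num_pages, [])
        return [pos for (entry_hash, pos) in page if entry_hash == h]


def lookup(index, data, key):
    # data: hypothetical stand-in for the data file, mapping position -> (key, value)
    for pos in index.candidates(key):
        stored_key, value = data[pos]
        if stored_key == key:  # hashes may collide, so confirm against the real key
            return value
    return None


data = {0: (b"alpha", "row-a"), 128: (b"beta", "row-b")}
index = HashIndex([(k, pos) for pos, (k, _) in data.items()], num_pages=4)
print(lookup(index, data, b"beta"))
```

A real on-disk layout would also need to bound the number of entries per page; the sketch leaves that out.
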
> This might then completely obviate the need for a separate key cache, as we would simply rely
> on whatever buffer cache we have to map in/out the pages we need for our query in any index.
> We should be able to guarantee we only ever need to look at one page for any query. Once we
> bring page caching in-process, the size of the pages we actually choose to cache could be
> configurable, which would bring behaviour close to that of the current key cache, except
> more compact, and also effectively auto-sizing to optimally reduce reads (by using
> more buffer cache space if it is helpful, and yielding it to other reads otherwise).
> The obvious disadvantage is that partition key range queries become a little more expensive,
> but (the?/)an index summary should reduce the problem here, so that the binary search for a
> start point can be targeted to a few (6810-)index pages, or a single one.
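
A rough Python sketch of how an index summary could narrow the search for a range start point follows. It assumes, purely for illustration, that the summary holds the first key of each index page in sorted order; `build_summary` and `start_page` are hypothetical names, not Cassandra APIs.

```python
import bisect


def build_summary(sorted_keys, keys_per_page):
    # Assumption: the summary samples the first key of each page, in key order.
    return [sorted_keys[i] for i in range(0, len(sorted_keys), keys_per_page)]


def start_page(summary, range_start):
    # Last page whose first key is <= range_start; page 0 if range_start precedes all keys.
    return max(bisect.bisect_right(summary, range_start) - 1, 0)


keys = [b"a", b"c", b"e", b"g", b"i", b"k"]
summary = build_summary(keys, keys_per_page=2)
print(start_page(summary, b"f"))
```

Only the page returned by `start_page` needs to be searched for the exact start position, after which the range scan proceeds from there.
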



--
This message was sent by Atlassian JIRA
(v6.2#6252)
