cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CASSANDRA-6810) SSTable and Index Layout Improvements/Modifications
Date Thu, 06 Mar 2014 20:10:42 GMT
Benedict created CASSANDRA-6810:
-----------------------------------

             Summary: SSTable and Index Layout Improvements/Modifications
                 Key: CASSANDRA-6810
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6810
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
            Reporter: Benedict
             Fix For: 3.0


Right now SSTables are somewhat inefficient in their storage of composite keys. I propose
resolving this by merging (some of) the index functionality with the storage of keys: introduce
a composite btree/trie structure (e.g. a string b-tree) to represent the key, and have this
structure index into the cell position in the file. The structure can then serve as both an
efficient index and the key data itself.
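
As a rough illustration of the idea (not the proposed on-disk format), here is a minimal
in-memory sketch in Python. It assumes a plain trie keyed on whole composite-key components
rather than a string b-tree, and the names CompositeKeyTrie, TrieNode, insert, lookup and
keys are hypothetical. Shared key prefixes are stored once, each complete key maps to the
byte offset of its cell, and the full keys can be read back from the structure alone.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, Optional, Tuple

Key = Tuple[str, ...]


@dataclass
class TrieNode:
    # Children keyed by the next component of the composite key.
    children: Dict[str, "TrieNode"] = field(default_factory=dict)
    # Byte offset of the cell in the data file; None if no key ends here.
    offset: Optional[int] = None


class CompositeKeyTrie:
    """Hypothetical sketch: a trie that is both the key data and the index."""

    def __init__(self) -> None:
        self.root = TrieNode()
        self.node_count = 0  # number of key components actually stored

    def insert(self, key: Key, offset: int) -> None:
        node = self.root
        for component in key:
            if component not in node.children:
                node.children[component] = TrieNode()
                self.node_count += 1
            node = node.children[component]
        node.offset = offset

    def lookup(self, key: Key) -> Optional[int]:
        """Return the cell offset for key, or None if the key is absent."""
        node = self.root
        for component in key:
            child = node.children.get(component)
            if child is None:
                return None
            node = child
        return node.offset

    def keys(self) -> Iterator[Tuple[Key, int]]:
        """Rebuild every full key (in sorted order) from the trie alone."""
        def walk(node: TrieNode, prefix: Key) -> Iterator[Tuple[Key, int]]:
            if node.offset is not None:
                yield prefix, node.offset
            for component in sorted(node.children):
                yield from walk(node.children[component], prefix + (component,))
        yield from walk(self.root, ())


# Three composite keys (9 components in total) with made-up cell offsets.
trie = CompositeKeyTrie()
trie.insert(("user42", "2014-03-06", "temp"), 0)
trie.insert(("user42", "2014-03-06", "humidity"), 57)
trie.insert(("user42", "2014-03-07", "temp"), 112)
```

In this example the three keys share the "user42" prefix (and two share the date), so the
trie holds 6 components instead of 9, while lookup still resolves each key to its offset.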

We could then offer the option (possibly decided automatically at flush) of storing this
structure either packed into the same file, directly prepending the data, or in a separate key
file (with small pages). With an uncompressed page cache, storing it separately should give good
performance for wide rows, since CQL row index lookups can rely on the page cache, whereas
storing it inline will allow very efficient lookups of small rows, where index lookups aren't
particularly helpful. Removing this extra data from the index file will also allow
CASSANDRA-6709 to massively scale up the efficiency of the key cache, whilst reducing the total
disk footprint of sstables and (most likely) offering better indexing capability in similar
space.
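
To make the "decided for you at flush" option concrete, here is one possible shape for such
a decision, sketched in Python. Everything in it is an assumption for illustration: the
KeyLayout enum, the choose_key_layout function, the use of mean cells per partition as the
signal, and the threshold value are all hypothetical and not part of this proposal.

```python
from enum import Enum
from typing import Sequence


class KeyLayout(Enum):
    INLINE = "inline"      # key structure packed directly before the data
    SEPARATE = "separate"  # key structure in its own small-page key file


# Hypothetical cut-off; a real value would have to come from benchmarking.
WIDE_ROW_CELL_THRESHOLD = 64


def choose_key_layout(
    cells_per_partition: Sequence[int],
    threshold: int = WIDE_ROW_CELL_THRESHOLD,
) -> KeyLayout:
    """Pick a layout from the partition sizes observed in the memtable.

    Wide rows favour the separate key file (row index lookups served from
    the page cache); small rows favour the inline layout.
    """
    if not cells_per_partition:
        return KeyLayout.INLINE
    mean_cells = sum(cells_per_partition) / len(cells_per_partition)
    if mean_cells >= threshold:
        return KeyLayout.SEPARATE
    return KeyLayout.INLINE
```

With these assumed numbers, a flush of partitions holding 3, 5 and 2 cells would choose the
inline layout, while one holding 500 and 1200 cells would choose the separate key file.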



--
This message was sent by Atlassian JIRA
(v6.2#6252)
