cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (CASSANDRA-4011) range-based log(n) elimination of sstables in read path
Date Fri, 25 Jan 2013 23:51:13 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-4011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-4011.
---------------------------------------

    Resolution: Not A Problem
    
> range-based log(n) elimination of sstables in read path
> -------------------------------------------------------
>
>                 Key: CASSANDRA-4011
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4011
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Peter Schuller
>
> If the read path were able to eliminate sstables based on token ranges, we would avoid
> {{O(n)}} bloom filter checks ({{n}} being the number of sstables).
> Contributing motivation:
> * For maximally efficient bulk-import, you tend to want a lot of small sstables to avoid
>   having to build up huge ones during the bulk creation process.
> * To avoid having to keep duplicate data when switching a data set (in a periodic bulk
>   replace import process), keeping sstables partitioned on token range (similarly to leveled
>   compaction) allows in-place replacement of sstables one sstable at a time.
> Those two in combination would mean that you can run a bulk-import-based, total-dataset-replacement
> cluster with zero compaction and with zero disk space overhead from having to reserve
> headroom for compaction.
> In addition:
> * For e.g. leveled compaction, where we have range-based partitioning anyway, {{log(n)}}
>   is preferable to {{O(n)}}, especially if it would allow us to have more than 10 "partitions"
>   per level. I'm not sure yet whether there are other reasons to have "only" 10, but if we can
>   make them smaller by eliminating the {{O(n)}} behavior in the read path, individual compactions
>   can be even smaller with leveled compaction, and you would scale even more easily with large
>   data sets while avoiding build-up in L0.
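
The range-based elimination described above can be illustrated with a minimal sketch. It is not Cassandra's code: the names `SSTableRange`, `RangeIndex`, and `candidate` are hypothetical, tokens are modeled as plain integers, and the sstables are assumed to cover non-overlapping token ranges (as the description suggests for range-partitioned sstables). Under those assumptions, a binary search over the sorted range starts finds the single sstable that could hold a token in log(n) steps, so only that sstable's bloom filter would need to be consulted.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class SSTableRange:
    """Hypothetical stand-in for an sstable and the token range it covers."""
    name: str
    first_token: int  # inclusive lower bound
    last_token: int   # inclusive upper bound


class RangeIndex:
    """Illustrative index over sstables with non-overlapping token ranges."""

    def __init__(self, sstables: Iterable[SSTableRange]) -> None:
        # Sort once by range start so lookups can use binary search.
        self._tables: List[SSTableRange] = sorted(
            sstables, key=lambda s: s.first_token
        )
        # The sketch only works if ranges do not overlap; reject input that does.
        for prev, cur in zip(self._tables, self._tables[1:]):
            if cur.first_token <= prev.last_token:
                raise ValueError(f"{prev.name} and {cur.name} overlap")
        self._starts = [s.first_token for s in self._tables]

    def candidate(self, token: int) -> Optional[SSTableRange]:
        """Return the only sstable whose range contains token, or None."""
        # Index of the last range whose start is <= token.
        i = bisect_right(self._starts, token) - 1
        if i < 0:
            return None
        table = self._tables[i]
        return table if token <= table.last_token else None


index = RangeIndex([
    SSTableRange("a", 0, 99),
    SSTableRange("b", 100, 199),
    SSTableRange("c", 300, 399),
])
hit = index.candidate(150)   # falls inside the range of "b"
miss = index.candidate(250)  # falls in the gap between "b" and "c"
```

With overlapping ranges (for example L0 in leveled compaction) a single sorted array is no longer sufficient, and a different structure would be needed; the sketch deliberately rejects that case.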

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
