cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8915) Improve MergeIterator performance
Date Thu, 02 Apr 2015 21:51:54 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14393517#comment-14393517
] 

Benedict commented on CASSANDRA-8915:
-------------------------------------

Suggestion: convert each Candidate into a linked-list of colliding iterators, so that consume()
becomes guaranteed O(1). In the event of many equal items, this would incur only linear costs,
and only on push down, rather than logarithmic costs on both push down and advance. This would
particularly help the partition level (sstable/memtable) merge, as we are likely to encounter
the same DecoratedKey many times.

> Improve MergeIterator performance
> ---------------------------------
>
>                 Key: CASSANDRA-8915
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8915
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Branimir Lambov
>            Assignee: Branimir Lambov
>            Priority: Minor
>
> The implementation of {{MergeIterator}} uses a priority queue and applies a pair of {{poll}}+{{add}}
operations for every item in the resulting sequence. This is quite inefficient as {{poll}}
necessarily applies at least {{log N}} comparisons (up to {{2log N}}), and {{add}} often requires
another {{log N}}, for example in the case where the inputs largely don't overlap (where {{N}}
is the number of iterators being merged).
> This can easily be replaced with a simple custom structure that can perform replacement
of the top of the queue in a single step, which will very often complete after a couple of
comparisons and in the worst case scenarios will match the complexity of the current implementation.
> This should significantly improve merge performance for iterators with limited overlap
(e.g. levelled compaction).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message