cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Stu Hood (JIRA)" <j...@apache.org>
Subject [jira] Updated: (CASSANDRA-2062) Use more efficient merge algorithm
Date Thu, 27 Jan 2011 09:16:34 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-2062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Stu Hood updated CASSANDRA-2062:
--------------------------------

    Attachment: 0005-Remove-temporary-instrumentation-and-CollatingIterator.txt
                0004-Port-LazilyCompactedRow-ReducingKeyIterator-RangeSlice.txt

* 0001 - Adds MergeIterator, which uses a priority queue to perform a {{O(2*log(M)*N)}} merge
in the same space. Also guarantees to only call hasNext once per item.
* 0002 - Adds some debug output for checking the number of comparisons performed during compaction
* 0003 - Replaces CollatingIterator for Compaction
* 0004 - Replaces CollatingIterator elsewhere
* 0005 - Removes debug output and CollatingIterator from FBUtilities

The number of comparisons needed to compact {{M=sstables=100}}, {{N=rows=80000}} drops from
~8 million ({{M*N}}) to ~1 million ({{2*log(M)*N}}), as expected.
{noformat}# MergeIterator
1 for org.apache.cassandra.io.CompactionIterator@7dc21ece
org.apache.cassandra.db.LongCompactionSpeedTest: sstables=2 rowsper=1 colsper=200000: 2758
ms
399999 for org.apache.cassandra.io.CompactionIterator@20ca5bff
org.apache.cassandra.db.LongCompactionSpeedTest: sstables=2 rowsper=200000 colsper=1: 4504
ms
917546 for org.apache.cassandra.io.CompactionIterator@6360f5bf
org.apache.cassandra.db.LongCompactionSpeedTest: sstables=100 rowsper=800 colsper=5: 1021
ms


# CollatingIterator
1 for org.apache.cassandra.io.CompactionIterator@5f873eb2
org.apache.cassandra.db.LongCompactionSpeedTest: sstables=2 rowsper=1 colsper=200000: 1732
ms
399999 for org.apache.cassandra.io.CompactionIterator@357c7988
org.apache.cassandra.db.LongCompactionSpeedTest: sstables=2 rowsper=200000 colsper=1: 4499
ms
7915050 for org.apache.cassandra.io.CompactionIterator@2ae0420b
org.apache.cassandra.db.LongCompactionSpeedTest: sstables=100 rowsper=800 colsper=5: 1427
ms{noformat}

> Use more efficient merge algorithm
> ----------------------------------
>
>                 Key: CASSANDRA-2062
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2062
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Priority: Minor
>             Fix For: 0.7.2
>
>         Attachments: 0001-Improved-iterator-for-merging-sorted-iterators.txt, 0002-Quickie-instrumentation-for-comparisons.txt,
0003-Replace-Collating-with-Merge-in-CompactionIterator.txt, 0004-Port-LazilyCompactedRow-ReducingKeyIterator-RangeSlice.txt,
0005-Remove-temporary-instrumentation-and-CollatingIterator.txt
>
>
> For {{M}} iterators containing {{N}} total items, commons.collections.CollatingIterator
performs a {{O(M*N)}} merge, and calls hasNext multiple times per returned value. We can do
better.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message