cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jay Zhuang (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-9988) Introduce leaf-only iterator
Date Thu, 27 Jul 2017 04:58:02 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-9988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16102704#comment-16102704
] 

Jay Zhuang edited comment on CASSANDRA-9988 at 7/27/17 4:57 AM:
----------------------------------------------------------------

Hi [~benedict], re-read [your comments 1.5 years' ago|https://issues.apache.org/jira/browse/CASSANDRA-9988?focusedCommentId=15117219&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15117219],
now I see your point:
{quote}
2. We could probably get uniformly better behaviour by implementing exponential search for
the leaf iterator; this would bring our total number of comparisons down from n lg n to n
when searching *an approximately equal set of items*, without harming our best case complexity.
{quote}
In that case, {{i}} would be very small ({{i}} is where the target is), so if i = 1, expSearch
({{log( i ) + log( i )}}) just needs to do 2 comparisons. vs. binarySearch 5 comparisons:
log(32) = 5 in worst case. Now the question is how we should define the test: How many re-seeks
should we do? How we choose next target? If we randomly select the next target, it's unfair
to expSearch, if we always select next value (basically use search to iterator the tree) is
unfair to binSearch. It really depends on the user scenario. Do you have any suggestion?


was (Author: jay.zhuang):
Hi [~benedict], re-read [your comments 1.5 years' ago|https://issues.apache.org/jira/browse/CASSANDRA-9988?focusedCommentId=15117219&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15117219],
now I see your point:
{quote}
2. We could probably get uniformly better behaviour by implementing exponential search for
the leaf iterator; this would bring our total number of comparisons down from n lg n to n
when searching *an approximately equal set of items*, without harming our best case complexity.
{quote}
In that case, {{i}} would be very small ({{i}} is where the target is), so expSearch ({{log(
i ) + log( i )}}) just needs to do 2 comparisons. vs. binarySearch 5 comparisons: log(32)
= 5 in worst case. Now the question is how we should define the test: How many re-seeks should
we do? How we choose next target? If we randomly select the next target, it's unfair to expSearch,
if we always select next value (basically use search to iterator the tree) is unfair to binSearch.
It really depends on the user scenario. Do you have any suggestion?

> Introduce leaf-only iterator
> ----------------------------
>
>                 Key: CASSANDRA-9988
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9988
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Benedict
>            Assignee: Jay Zhuang
>            Priority: Minor
>              Labels: patch
>             Fix For: 4.0
>
>         Attachments: 9988-data.png, 9988-result2.png, 9988-result3.png, 9988-result.png,
9988-test-result3.png, 9988-test-result-expsearch.xlsx, 9988-test-result-raw.png, 9988-test-result.xlsx,
9988-trunk-new.txt, 9988-trunk-new-update.txt, trunk-9988.txt
>
>
> In many cases we have small btrees, small enough to fit in a single leaf page. In this
case it _may_ be more efficient to specialise our iterator.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org


Mime
View raw message