[ https://issues.apache.org/jira/browse/CASSANDRA6933?page=com.atlassian.jira.plugin.system.issuetabpanels:commenttabpanel&focusedCommentId=13948731#comment13948731
]
Benedict edited comment on CASSANDRA6933 at 3/27/14 1:34 AM:

Okay, so with a little wikipedia and wolfram alpha help, we have:
lg N, lg 0.5N, lg 0.25N = sum lg2 (2 ^ m) for m in 1..lg2(N), _which is the same as_ sum(1..lg2(N)),
which 0.5(lg2(N))(1+lg2(N))
Whereas lg 1 + lg 2 + lg 3 ... + lg N _is actually_ sum (lg n) for n in 1..N, which is (lg
(1*2*3*4*5...*N)) / (lg 2)
So, for N = 100 we get an extra 1205 comparisons vs an extra 25.
... and I'll leave it there. if I've still messed it up, it's close enough!
was (Author: benedict):
Okay, so with a little wikipedia and wolfram alpha help, we have:
lg N, lg 0.5N, lg 0.25N = sum lg2 (2 ^ m) for m in 1..sqrt(N), _which is the same as_ sum(1..sqrt(N)),
which is sqrt(N).(sqrt(N)+1)/2. So extra comparisons are 0.5(N+sqrt(N))
Whereas lg 1 + lg 2 + lg 3 ... + lg N _is actually_ sum (lg n) for n in 1..N, which is (lg
(1*2*3*4*5...*N)) / (lg 2)
So, for N = 100 we get an extra 1205 comparisons vs an extra 55.
> Optimise Read Comparison Costs in collectTimeOrderedData
> 
>
> Key: CASSANDRA6933
> URL: https://issues.apache.org/jira/browse/CASSANDRA6933
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: Benedict
> Assignee: Benedict
> Priority: Minor
> Labels: performance
> Fix For: 2.1
>
>
> Introduce a new SearchIterator construct, which can be obtained from a ColumnFamily,
which permits efficiently iterating a subset of the cells in ascending order. Essentially,
it saves the previously visited position and searches from there, but also tries to avoid
searching the whole remaining space if possible.

This message was sent by Atlassian JIRA
(v6.2#6252)
