couchdb-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Barry Wark <>
Subject Re: Why sequential document ids? [was: Re: What's the speed(performance) of couchdb?]
Date Fri, 27 Feb 2009 17:14:45 GMT
On Thu, Feb 26, 2009 at 3:54 PM, Chris Anderson <> wrote:
> On Thu, Feb 26, 2009 at 3:31 PM, Paul Davis <> wrote:
>> Not sure i follow what you mean by balance issues. Seeing as its a
>> btree it can't be unbalanced because if it were it wouldn't be a
>> btree. But then our definitions of balance might be different. :D
> When a B-tree is consistently inserted into, with a new highest (or
> lowest) value, you can end up with a deeper tree than when it is
> randomly inserted into. Imagine the inner-nodes all accumulating on
> the right side of the tree, so in the worst case you can end up with
> one lonely leaf node to the left of the root, and essentially a linked
> list of inner-nodes on the right side (such that cost would approach
> O(N) for inserts and lookups.

Thank you for clarifying my intent with a much better statement. This
is what I meant by "unbalanced"

> My benchmarks show that inserts do not slow down as you would expect
> them to in this worst case, so Couch is probably doing something
> non-naive to rebalance the tree even when inserts are always at one
> extreme.

Yes, Pauls response shows that this is the case.

>> The only performance difference between reading random and the
>> sequential document ids I can think of would favor the sequential side
>> when you're reading documents in roughly the same order as their ids
>> so we get FS cache hits.
> From a practical standpoint, you're probably right about this
> (assuming the tree remains shallow even with inserts consistently at
> one edge of the tree.)
> Chris
> --
> Chris Anderson

View raw message