couchdb-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Bob Dionne <dio...@dionne-associates.com>
Subject Re: Fundamentals Question on CoucheDB's append only b+tree.
Date Wed, 14 Mar 2012 08:25:14 GMT
the literature is not quite clear on this point[1].

Also, to Paul's earlier point, yes, technically couchdb's btree is a b+tree as the reductions
aren't mapped to the keys, though this impacts the ability to get as many keys as possible
into RAM with each read.

couch_btree is a pretty clever piece of erlang. The order (as defined in Knuth's sense[1])
is determined by a chunk size. In some cases you can improve performance by imposing a minimum
but by and large no one to date has demonstrated any major improvements in it. Which is fine
as that's probably not where the bottlenecks are.

I think storing the reductions with the inner nodes is a place where an optimization could
be performed. The problem being that they are essential to the incremental feature of MR.


[1] http://en.wikipedia.org/wiki/B-tree


On Mar 14, 2012, at 12:37 AM, Paul Davis wrote:

> On Tue, Mar 13, 2012 at 11:33 PM, Randall Leeds <randall.leeds@gmail.com> wrote:
>> On Mar 13, 2012 7:11 PM, "Paul Davis" <paul.joseph.davis@gmail.com> wrote:
>>> 
>>> On Tue, Mar 13, 2012 at 5:50 PM, Bob Dionne
>>> <dionne@dionne-associates.com> wrote:
>>>> That's largely correct, though couchdb's btree is neither fish nor fowl.
>>>> 
>>>> It doesn't have an order (B-trees generally do) and it does store
>> values in the inner nodes (B+trees do not)
>>>> 
>>> 
>>> The lack of order is the big thing.
>> 
>> What is meant by order in this context?
> 
> Maximum number of elements in a node. Ie, the whole "every node except
> the root must have between N/2 and N elements at all times" thing.


Mime
View raw message