incubator-couchdb-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Damien Katz <dam...@apache.org>
Subject Re: Why sequential document ids? [was: Re: What's the speed(performance) of couchdb?]
Date Thu, 26 Feb 2009 19:18:06 GMT

On Feb 26, 2009, at 1:55 PM, Jan Lehnardt wrote:

>
> On 26 Feb 2009, at 19:49, Barry Wark wrote:
>>>
>>> or ascending...
>>
>> As an asside, why is it that sequential document ids would produce a
>> significant performance boost? I suspect the answer is something
>> rather fundamental to CouchDB's design, and I'd like to try to grok
>> it.
>
> b-trees inner-nodes can get cached better if inserts basically always
> use the same path.
>

What he said. It's pretty standard btree stuff, most, if not all the  
major rdbms have similar issues with primary keys.

Also, he Ids don't need to be sequential (1,2,3,4...), just ordered  
(1,5,19,22...). And they don't need to sort higher or lower than all  
the other ids, so long as they are clustered together.  The each btree  
nodes that have to be loaded that isn't in cache is expensive. The  
more the keys have to be inserted into random places in the btree, the  
worse the caching behavior. Right now, with the crypto random UUIDs we  
generate, it's basically the worst case scenario for doc inserts.

-Damien

Mime
View raw message