directory-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Emmanuel Lecharny <>
Subject Re: [ApacheDS] Default partition design ideas (was: Re: [ApacheDS] Going to need to implement a splay tree)
Date Sat, 02 Feb 2008 08:53:36 GMT
Howard Chu wrote:
> Alex Karasulu wrote:
>>     Those long must be fetched
>>     quickly, as we will always get an entry by its ID. Using a BTree for
>>     that is time consuming, as fetching an entry will be done in 
>> O(log(N)).
>> You're absolutely right a hash would be much better. We don't need to
>> sort the ID's.
> Way back in the OpenLDAP 2.1 days we used hashes for our indexing in 
> back-bdb. But we found that B-Trees still performed better, even 
> though index lookups have nearly zero locality of reference. The 
> problem is with large DBs, the hash tables grow too large to fit 
> entirely in cache. Once the table grows past a certain size, you can 
> no longer directly reference the records; there's a lot of expensive 
> paging in/out that needs to be done. With a B-Tree, the number of 
> internal pages in the tree is still very small relative to the total 
> number of data pages, so you get a lot of cache reuse referencing 
> those pages. So we switched everything to use B-Trees in OpenLDAP 2.2...
> Hashing is faster *in theory*, but in practice it loses out.
That's a very valid point. The issue is how many IDs can we store in a 
Hash table in memory ? Assuming that you have a correct hash method, 
around 30% of the table will be empty, and the average number of lookup 
will be around 2. a DN is a pretty fat object, (I would say, around 
fifty chars), so storing a million of those guys needs 200 Mbytes of 
memory. Pretty heavy.

I would draw a line somehwere in the middle, depending on the available 
memory and the number of DN we want to manage. Obviously, at some point 
BTree will be faster, just because you can cache the BTree intermediate 
leaves, leading to less paging than with HashTable. But for a small 
number of entries (say, 100K entries), with 512 Mbytes of memory, this 
might be a good idea to use a Hash instead.

The ultimate server would mitigate those parameters to use the correct 
data structure.

Let's be pragmatic :
- using BTree for every elements is a way to handle one single kind of 
data structure instead of two : potentially less bugs
- we won't have to face the perfomance slow down Howard is mentioning if 
we keep BTrees to manage DN, even if it can be slower when handling a 
few DNs. It's more important to have a scalable server.
- One API to play with is easier then 2
- HashTable can degenerate if the hash function is not correctly chosen
- BTrees are using the memory efficiently, Hash spoil 30% of it with 
empty slots
- If the server starts to swap because the hash is not in memory, as the 
key distribution is random, performances will really fall by an order of 
magnitude (may be more)

So I would say : thanks Howard for this very interesting insight ! F*ck 
the hashtable ;)

cordialement, regards,
Emmanuel L├ęcharny

View raw message