directory-dev mailing list archives

From Howard Chu
Subject Re: RDN index, oneLevel and sublevel index merge
Date Sat, 08 Oct 2011 10:22:25 GMT
Howard Chu wrote:
> Emmanuel Lecharny wrote:
>> Hi guys,
>> Stefan started to modify the code to get rid of the oneLevel and
>> subLevel index, which are more or less useless as we already have the
>> hierarchy stored into the rdn index.
>> However, this rdn index is not good enough as is to be used as a
>> replacement for the two other indexes. Its structure prevents us from
>> easily retrieving the children of a known entry.
>> The current RDN index structure is :
>> <parentId, RDN>   ->   Entry ID
>> The key is a tuple containing the parent ID to be able to rebuild the DN.
>> The reverse index is :
>> Entry ID ->   <parentId, RDN>
>> We don't have duplicated values.
>> Now, when we have an entry ID, there is no simple way to get the list of
>> all the children for this entry.
>> We will have to add a third index to deal with such searches :
>> ParentId ->   <entryId, ....>
>> which will list all the children of a specific entry.
>> I'm going to investigate this idea in the next few days.
>> Thoughts ?
> Currently in OpenLDAP back-hdb and back-mdb, the DN index contains
> Entry ID ->  <parentID, RDN>  [,<child ID, RDN>  ...]
> So each entry's RDN is stored twice, once under its own entryID, and once
> under its parent's entryID. This allows top-down DN to ID lookups and
> bottom-up ID to DN lookups from a single index.
> (This is a bit different from the original layout described in 2003
> )
I forgot to mention... One-level lookups are obviously trivial with this
scheme. back-hdb and back-mdb differ in how they perform subordinate lookups. 
back-hdb recursively builds a list of subordinate members (and caches this 
list for future queries). back-hdb is ridiculously slow for subordinate 
lookups that haven't been cached yet.
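
To make the layout concrete, here is a rough Python sketch of an index shaped like the one described above. It is a toy in-memory model, not OpenLDAP's on-disk format: the dict-of-lists representation, ROOT_ID, and all function names are assumptions made for illustration, and the DN parsing is a naive split on commas with no escaping. The subordinates() function mimics the recursive list-building approach attributed to back-hdb, with a plain dict standing in for its cache.

```python
from typing import Dict, List, Tuple

# Toy model (assumption, not the real storage format):
# index[entry_id] = [(parent_id, rdn), (child_id, rdn), ...]
Index = Dict[int, List[Tuple[int, str]]]

ROOT_ID = 0  # assumed ID for the empty root of the tree


def new_index() -> Index:
    """Create an index holding only the root, which is its own parent."""
    return {ROOT_ID: [(ROOT_ID, "")]}


def add_entry(index: Index, entry_id: int, parent_id: int, rdn: str) -> None:
    """Store the RDN twice: under the entry's own ID and under its parent's ID."""
    index[entry_id] = [(parent_id, rdn)]
    index[parent_id].append((entry_id, rdn))


def children(index: Index, entry_id: int) -> List[int]:
    """One-level lookup: every record after the first one is a child."""
    return [child_id for child_id, _ in index[entry_id][1:]]


def id_to_dn(index: Index, entry_id: int) -> str:
    """Bottom-up lookup: follow parent links and collect RDNs."""
    rdns = []
    while entry_id != ROOT_ID:
        parent_id, rdn = index[entry_id][0]
        rdns.append(rdn)
        entry_id = parent_id
    return ",".join(rdns)


def dn_to_id(index: Index, dn: str) -> int:
    """Top-down lookup: resolve RDNs starting from the one nearest the root."""
    entry_id = ROOT_ID
    for rdn in reversed(dn.split(",")):
        for child_id, child_rdn in index[entry_id][1:]:
            if child_rdn == rdn:
                entry_id = child_id
                break
        else:
            raise KeyError(dn)
    return entry_id


def subordinates(index: Index, entry_id: int,
                 cache: Dict[int, List[int]]) -> List[int]:
    """Recursively build (and cache) the list of all descendants of an entry."""
    if entry_id not in cache:
        found: List[int] = []
        for child_id in children(index, entry_id):
            found.append(child_id)
            found.extend(subordinates(index, child_id, cache))
        cache[entry_id] = found
    return cache[entry_id]
```

With entries 1 (dc=com), 2 (dc=example under 1), 3 (ou=people under 2) and 4 (ou=groups under 2), children(idx, 2) gives the two OUs directly, while subordinates(idx, 1, cache) has to visit every node below entry 1 before it can answer, which is the part that gets expensive on an uncached subtree.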

back-mdb is a bit more straightforward; given a list of candidate entry IDs,
it simply walks up from each candidate's entryID until it hits the base of the
desired subtree (meaning the candidate is in the result set) or until it hits
the root of the tree (the candidate is not in the result set). As a further
refinement (not yet implemented), if a search yields an all-entries candidate
list (unindexed search), it can just recursively descend from the base of the
desired subtree and ignore the candidate list.
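
The walk-up check can be sketched roughly as follows. This is an illustrative Python model and not back-mdb code: parent_of and children_of are assumed in-memory maps standing in for index reads, ROOT_ID is an assumed root ID, and the function names are made up. The descend() function sketches the not-yet-implemented refinement, using an explicit stack in place of recursion.

```python
from typing import Dict, Iterable, List

ROOT_ID = 0  # assumed ID of the tree root


def in_subtree(parent_of: Dict[int, int], candidate: int, base: int) -> bool:
    """Walk up from the candidate; True if the base is reached before the root."""
    entry_id = candidate
    while True:
        if entry_id == base:
            return True
        if entry_id == ROOT_ID:
            return False
        entry_id = parent_of[entry_id]


def filter_candidates(parent_of: Dict[int, int], candidates: Iterable[int],
                      base: int) -> List[int]:
    """Keep only the candidates that lie at or below the search base."""
    return [c for c in candidates if in_subtree(parent_of, c, base)]


def descend(children_of: Dict[int, List[int]], base: int) -> List[int]:
    """Unindexed-search refinement: enumerate the subtree, ignoring candidates."""
    result: List[int] = []
    stack = [base]
    while stack:
        entry_id = stack.pop()
        result.append(entry_id)
        stack.extend(children_of.get(entry_id, []))
    return result
```

Each candidate costs at most one parent lookup per level of tree depth, and nothing has to be built or cached up front.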

It turns out that building the cached IDLs in back-hdb is much slower than 
just reading the index in back-mdb. (Not to mention being a memory pig.)

   -- Howard Chu
   CTO, Symas Corp. 
   Director, Highland Sun
   Chief Architect, OpenLDAP
