directory-dev mailing list archives

From: Alex Karasulu
Subject: Re: Subentries handling refactoring
Date: Thu, 15 Jul 2010 22:23:47 GMT
On Thu, Jul 15, 2010 at 8:33 PM, Emmanuel Lecharny wrote:

>  Hi guys,
> we have serious issues with the way we manage subentries in the server. Not
> that it's not working, but it's certainly not good enough for anything but a
> toy server.
I disagree. Adding a subentry is an administrative operation. If you add one
after loading millions of entries, as you point out, you had better be
prepared to pay the price. This is why we add subentries first and then
populate the server.

The idea is that the cost is paid not during search time but during
administrative changes and writes.
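
As a rough illustration of that trade-off, here is a minimal sketch of write-time tagging. It is not the ApacheDS code: the class, the `subentryRefs` attribute name, and the in-memory map standing in for the entry store are all hypothetical, DNs are compared as plain strings, and the subentry's filter is reduced to "everything under the base DN".

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical sketch; none of these names are ApacheDS APIs.
class WriteTimeTagging {
    // Hypothetical attribute holding the DNs of the subentries selecting an entry.
    static final String SUBENTRY_REF = "subentryRefs";

    // dn -> (attribute name -> values); stands in for the real entry store.
    final Map<String, Map<String, Set<String>>> entries = new TreeMap<>();

    void addEntry(String dn) {
        entries.put(dn, new HashMap<>());
    }

    // Administrative operation: tags every entry at or under baseDn.
    // The cost grows with the number of stored entries.
    int addSubentry(String subentryDn, String baseDn) {
        int modified = 0;
        for (Map.Entry<String, Map<String, Set<String>>> e : entries.entrySet()) {
            if (isUnder(e.getKey(), baseDn)) {
                e.getValue()
                 .computeIfAbsent(SUBENTRY_REF, k -> new TreeSet<>())
                 .add(subentryDn);
                modified++;
            }
        }
        return modified;
    }

    // Read path: the answer is already stored on the entry itself.
    Set<String> subentriesOf(String dn) {
        Map<String, Set<String>> attrs = entries.get(dn);
        if (attrs == null) {
            return Collections.emptySet();
        }
        return attrs.getOrDefault(SUBENTRY_REF, Collections.emptySet());
    }

    // Naive containment test; assumes normalized DNs without escaped commas.
    static boolean isUnder(String dn, String baseDn) {
        return dn.equals(baseDn) || dn.endsWith("," + baseDn);
    }
}
```

In this sketch `addSubentry` touches every selected entry once, while `subentriesOf` is a single map lookup no matter how many entries or subentries exist.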

> Let me first give some heads up about what's going on.
> A subentry is associated with an AdministrativePoint (AP), and defines a
> selection of entries which will be affected depending on the AP role. Those
> roles are :
> - Access Control
> - Collective Attributes
> - SubSchema (not active atm)
> - Triggers (ADS specific).
> For instance, if we have a tree with a set of entries associated with a
> location (e.g., c=France), we may define a subentry with a Collective
> Attribute role telling the server that every entry under the c=France branch
> will have a specific attribute added. We don't have then to add this
> attribute to *every* entry in this branch...
> Anyway...
> A subentry defines a selection using a filter, and a base DN for this
> filter to be active from.
> Right now, a subentry is attached to an AP as a (quite) normal entry, and
> when we add this subentry, *all* the selected entries (using the subentry
> filter and the base DN) are modified to have a new attribute added. This
> added attribute contains a DN pointing to the associated subentry, so that
> when we process this entry, we can immediately know that it's associated
> with an AP.
> So far, so good: processing an entry is fast, as we have everything we need
> when we have the entry. But the dark side is that if we have millions of
> entries, when we add an AP and a subentry, we may have to modify potentially
> *millions* of entries to add this attribute. Not good...
> How can we improve this process?
> The idea would be to search for the APs when we process an entry, but it
> has to be fast. How can we do that? Simple: using the entry's DN and a DN
> cache, we can get all the APs associated with an entry. The cost is
> proportional to the depth of the entry's DN. Once we have grabbed the APs,
> we have to evaluate the entry to know whether it's part of the selections
> defined by the APs' subentries. Done.
> Is it costly? Only marginally compared to the current algorithm, as we have
> to look up the APs, whereas the current server already has this list in an
> attribute of the entry. But we spare the big modifications when adding - or
> removing/renaming/moving - a subentry.
> What we just need is an AP cache and a way to process it.
> This is what I will work on in the next few days if nobody objects or finds
> a better algorithm.
I'm sorry but I have to veto this change. It's not acceptable to be taxing
an LDAP server's read performance when this should be done during writes or
during administrative changes.

There's a reason why I implemented it this way 5 years ago. I still think
it's sound. It's a matter of when you choose to pay for the administrative
change, which BTW should not happen that often, while reads happen all the
time. I'd rather pay a lot for infrequent operations than a little bit for
the most common operations.
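
For comparison, the read-time lookup proposed in the quoted message could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `ReadTimeLookup` class and its `apCache` map are hypothetical, DNs are split on plain commas (no escaping or normalization), and the per-entry filter evaluation that would follow the lookup is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names are ApacheDS APIs.
class ReadTimeLookup {
    // AP DN -> DNs of the subentries registered under that AP (the "AP cache").
    final Map<String, List<String>> apCache = new HashMap<>();

    // Administrative operation: one cache update, no entry is modified.
    void addSubentry(String apDn, String subentryDn) {
        apCache.computeIfAbsent(apDn, k -> new ArrayList<>()).add(subentryDn);
    }

    // Read path: walk from the entry's DN up to the root,
    // probing the cache once per DN component.
    List<String> candidateSubentries(String entryDn) {
        List<String> found = new ArrayList<>();
        String dn = entryDn;
        while (true) {
            List<String> subs = apCache.get(dn);
            if (subs != null) {
                found.addAll(subs);
            }
            int comma = dn.indexOf(',');
            if (comma < 0) {
                break;
            }
            dn = dn.substring(comma + 1); // move to the parent DN
        }
        return found;
    }
}
```

Here the administrative change is cheap, but every read of an entry pays one cache probe per DN component, plus a filter evaluation for each candidate subentry that this sketch does not show.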

Alex Karasulu