directory-dev mailing list archives

From Emmanuel Lécharny <elecha...@apache.org>
Subject Re: Subentries handling refactoring
Date Fri, 16 Jul 2010 06:26:43 GMT
  On 7/16/10 12:23 AM, Alex Karasulu wrote:
> On Thu, Jul 15, 2010 at 8:33 PM, Emmanuel Lecharny<elecharny@gmail.com>wrote:
>
>> This is what I will work on in the next few days if nobody objects or find
>> a better algorithm.
>>
>>
> I'm sorry but I have to veto this change. It's not acceptable to be taxing
> an LDAP server's read performance when this should be done during writes or
> during administrative changes.
We discussed this option too: paying the price once for an operation 
done infrequently is appealing, that's clear.

In a way, it's like creating an index on an existing attribute: you 
have to pay the price for doing so after having added millions of 
entries.

The problem is that with millions of entries, and with a costly 
operation like a modify (roughly 15 to 20 times slower than a search), 
it will take ages to process.

Now, can we do better? And does it slow down the search a lot if we do 
what I suggest?

Let's think about the search penalty first (btw, it's not only search, 
but *all* operations): we will have to find the subentries associated 
with an entry if it belongs to an AP. If we have a decent AP cache, this 
is just a question of looking in this cache for the list of subentries 
we will have to check, then evaluating these subentries. The cost of 
searching for subentries is pretty minimal, but evaluating the subtree 
specification (SS) may be costly for *all* the processed entries, even 
if they are not impacted.
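
To make the per-operation cost concrete, here is a minimal sketch of the 
lookup described above. Everything in it is an assumption made for 
illustration, not ApacheDS code: the AP cache is a plain dict keyed by 
the AP's DN, `Subentry`, `parse_dn` and `subentries_for` are hypothetical 
names, the DN parsing ignores escaped commas, and the subtree 
specification is reduced to base, minimum and maximum (no chop or 
refinement filters).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Subentry:
    """Hypothetical, simplified subentry: only base/minimum/maximum."""
    name: str
    base: Tuple[str, ...] = ()      # relative to the AP, root-first
    minimum: int = 0
    maximum: Optional[int] = None   # None means "no depth limit"


def parse_dn(dn):
    """Split 'ou=a,dc=b' into RDNs ordered root-first: ('dc=b', 'ou=a').

    Naive: does not handle escaped commas inside RDN values.
    """
    parts = [rdn.strip().lower() for rdn in dn.split(",") if rdn.strip()]
    return tuple(reversed(parts))


def selects(ap, sub, rdns):
    """Evaluate the simplified subtree specification for one entry."""
    root = ap + sub.base
    if rdns[:len(root)] != root:
        return False                # entry is not under the base
    distance = len(rdns) - len(root)
    if distance < sub.minimum:
        return False
    return sub.maximum is None or distance <= sub.maximum


def subentries_for(ap_cache, entry_dn):
    """Return the names of the subentries that select entry_dn.

    Step 1 (cheap): one cache probe per ancestor of the entry.
    Step 2 (the costly part): evaluate each candidate's specification.
    """
    rdns = parse_dn(entry_dn)
    selected = []
    for depth in range(len(rdns) + 1):
        ap = rdns[:depth]
        for sub in ap_cache.get(ap, ()):
            if selects(ap, sub, rdns):
                selected.append(sub.name)
    return selected
```

Note that an entry which is selected by nothing still goes through both 
steps whenever one of its ancestors is an AP, which is the cost for 
non-impacted entries mentioned above.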

All in all, I think it will add a 5 to 10% penalty to all the operations.

Is it acceptable? Well, as Alex says, and this is the reason why I 
posted this mail before going on with such a major change, if you 
weigh the number of operations done daily on an LDAP server against a 
change in the administrative model (i.e., the addition of a subentry), I 
would say that he is right.

Can we do better than what is currently done?

We discussed other options a bit, and one of them would be to use a 
side index, which will tell if an entry is dependent on an AP. Instead 
of modifying all the entries, we just update this index. The problem 
with this approach is that you still have a penalty to pay for each 
operation (i.e., checking in this index to see if the entry is part of 
an AP), and on top of that, updating this index will be more costly than 
updating the master table (because you have 2 files to update on disk 
instead of only one).
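
A toy sketch of the side-index idea, only to show where the costs land. 
All names here (`EntryStore`, `master`, `ap_index`, the `writes` counter) 
are hypothetical, and in-memory dicts stand in for the on-disk master 
table and index file.

```python
class EntryStore:
    """Toy model: a master table plus a side index (both hypothetical)."""

    def __init__(self):
        self.master = {}      # entry DN -> attributes
        self.ap_index = {}    # entry DN -> set of subentry names
        self.writes = {"master": 0, "ap_index": 0}

    def add_entry(self, dn, attrs):
        self.master[dn] = dict(attrs)
        self.writes["master"] += 1

    def add_subentry(self, name, selected_dns):
        # Only the side index is updated; the entries themselves are
        # left untouched in the master table.
        for dn in selected_dns:
            self.ap_index.setdefault(dn, set()).add(name)
            self.writes["ap_index"] += 1

    def read(self, dn):
        # Every read still pays for one extra lookup in the side index,
        # even when the entry depends on no AP at all.
        entry = dict(self.master[dn])
        entry["subentries"] = sorted(self.ap_index.get(dn, ()))
        return entry
```

Adding a subentry leaves `writes["master"]` unchanged, but `read` 
always consults `ap_index`, so the per-operation penalty does not go 
away; it only moves to a second structure that must be kept on disk.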

I currently don't see any other alternative.

So, all in all, I like this veto :)

We still have things to work on:
- the subentry interceptor still has to be fixed, as we don't correctly 
manage the moveAndRename operation
- the ACI interceptor directly accesses a method in the subentry 
interceptor, because the subentry cache is managed by this interceptor. 
I think this cache should be moved to the directoryService instance
- we can improve the performance of the modify operation :)

One side effect of *not* doing such a modification is that I will save 
a few days, which I can spend working on some other bugs we have instead 
of implementing this.

Side note: this again proves that a community is way better than 
individuals. Sharing ideas openly is the best way to avoid mistakes. 
Thanks for the feedback, Alex!

-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com

