directory-dev mailing list archives

From Alex Karasulu <akaras...@gmail.com>
Subject Re: Subentries handling refactoring
Date Fri, 16 Jul 2010 12:30:40 GMT
I was a bit reluctant to cast a veto, but with a community that thinks
through and discusses the problem, this is obviously nothing to worry
about. Thanks for thoroughly considering my position. I wish I had a
better solution to offer.

Unfortunately this is a really tough problem to solve. I had thought
of using indices for subentry references in entries, but like you I
realized this will incur even more overhead.

Best Regards,
Alex

Sent from my iPhone

On Jul 16, 2010, at 9:26 AM, Emmanuel Lécharny <elecharny@apache.org>
wrote:

> On 7/16/10 12:23 AM, Alex Karasulu wrote:
>> On Thu, Jul 15, 2010 at 8:33 PM, Emmanuel Lecharny <elecharny@gmail.com> wrote:
>>
>>> This is what I will work on in the next few days if nobody objects or finds
>>> a better algorithm.
>>>
>>>
>> I'm sorry but I have to veto this change. It's not acceptable to be taxing
>> an LDAP server's read performance when this should be done during writes or
>> during administrative changes.
> We discussed this option too: paying the price once for an
> operation done infrequently is appealing, that's clear.
>
> In a way, it's like creating an index on an existing
> attribute: you have to pay the price for doing so after having
> added millions of entries.
>
> The problem is that with millions of entries, and with a costly  
> operation like a modify (roughly 15 to 20 times slower than a  
> search), it will take ages to process.
>
> Now, can we do better? And does it slow down the search a lot if we
> do what I suggest?
>
> Let's think about the search penalty first (btw, it's not only
> search, but *all* operations): we will have to find the subentry
> associated with an entry if it belongs to an AP. If we have a decent
> AP cache, this is just a question of looking in this cache for the
> list of subentries we will have to check, then evaluating these
> subentries. The cost of searching for subentries is pretty minimal,
> but evaluating the subtree specification (SS) may be costly for *all*
> the processed entries, even if they are not impacted.
>
> All in all, I think it will add a 5 to 10% penalty to all the
> operations.
>
> Is it acceptable? Well, as Alex says, and this is the reason why I
> posted this mail before going on with such a major change, if you
> weigh the number of operations done daily on an LDAP server against a
> change in the administrative model (i.e., the addition of a subentry),
> I would say that he is right.
>
> Can we do better than what is currently done?
>
> We discussed a few other options, and one of them would be to
> use a side index, which will tell if an entry is dependent on an AP.
> Instead of modifying all the entries, we just update this index. The
> problem with this approach is that not only do you still have a
> penalty to pay for each operation (i.e., checking this index to see
> whether the entry is part of an AP), but updating this index will
> also be more costly than updating the master table (because you have
> two files to update on disk instead of only one).
>
> I currently don't see any other alternative.
>
> So, all in all, I like this veto :)
>
> We still have things to work on:
> - the subentry interceptor still has to be fixed, as we don't
> correctly manage the moveAndRename operation
> - the ACI interceptor directly accesses a method in the
> subentry interceptor, because the subentry cache is managed by this
> interceptor. I think this cache should be moved to the
> directoryService instance
> - we can improve the performance of the modify operation :)
>
> One side effect of *not* doing such a modification is that I will
> save a few days, which I can spend working on some other bugs we
> have instead of implementing this.
>
> Side note: this again proves that a community is way better than
> individuals. Sharing ideas openly is the best way to avoid mistakes.
> Thanks for the feedback Alex!
>
> -- 
> Regards,
> Cordialement,
> Emmanuel Lécharny
> www.iktek.com
>
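To make the read-time approach from the quoted message more concrete, here is a minimal sketch of an AP cache lookup followed by subtree specification evaluation. It is only an illustration under simplifying assumptions, not ApacheDS code: the names `APCache`, `Subentry`, and `selecting_subentries` are hypothetical, DNs are split naively on commas, and the subtree specification is reduced to a base DN plus an optional refinement predicate.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Entry = Dict[str, List[str]]  # attribute name -> values (hypothetical entry model)


def rdns(dn: str) -> List[str]:
    """Split a DN into lower-cased RDNs (naive: ignores escaped commas)."""
    return [part.strip().lower() for part in dn.split(",") if part.strip()]


def is_descendant_or_self(dn: str, ancestor: str) -> bool:
    """True if `dn` equals `ancestor` or lies below it in the tree."""
    d, a = rdns(dn), rdns(ancestor)
    return len(a) <= len(d) and (len(a) == 0 or d[-len(a):] == a)


@dataclass
class Subentry:
    dn: str
    base: str  # simplified subtree specification: full DN of the subtree base
    refinement: Callable[[Entry], bool] = lambda entry: True


class APCache:
    """Maps each administrative point (AP) DN to the subentries declared under it."""

    def __init__(self) -> None:
        self._by_ap: Dict[str, List[Subentry]] = {}

    def add(self, ap_dn: str, subentry: Subentry) -> None:
        self._by_ap.setdefault(",".join(rdns(ap_dn)), []).append(subentry)

    def candidates(self, entry_dn: str) -> List[Subentry]:
        """Cheap step: collect subentries of every AP at or above the entry."""
        parts = rdns(entry_dn)
        found: List[Subentry] = []
        for i in range(len(parts) + 1):
            found.extend(self._by_ap.get(",".join(parts[i:]), []))
        return found


def selecting_subentries(cache: APCache, entry_dn: str, entry: Entry) -> List[str]:
    """Costly step: evaluate each candidate's subtree specification on the entry."""
    return [
        s.dn
        for s in cache.candidates(entry_dn)
        if is_descendant_or_self(entry_dn, s.base) and s.refinement(entry)
    ]


cache = APCache()
cache.add(
    "ou=system",
    Subentry(
        dn="cn=sub1,ou=system",
        base="ou=users,ou=system",
        refinement=lambda e: "person" in e.get("objectClass", []),
    ),
)
```

The cache lookup is a handful of dictionary probes, while the evaluation step runs for every processed entry under an AP, including entries that no subentry ends up selecting. That is the per-operation cost the veto objects to.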

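The side-index alternative mentioned in the thread can be sketched in the same spirit. This is an assumption-laden toy, not the server's storage layer: `SubentrySideIndex` and its methods are hypothetical names, entry ids are plain strings, and the index lives in memory, whereas the concern raised in the thread is precisely about the extra on-disk file.

```python
from typing import Dict, FrozenSet, Iterable, Set


class SubentrySideIndex:
    """Toy side index: entry id -> DNs of the subentries selecting that entry."""

    def __init__(self) -> None:
        self._index: Dict[str, Set[str]] = {}
        self.index_writes = 0  # counts index updates, to make the write cost visible

    def add_subentry(self, subentry_dn: str, selected_entry_ids: Iterable[str]) -> None:
        """Record a new subentry by touching only the index, not the entries."""
        for entry_id in selected_entry_ids:
            self._index.setdefault(entry_id, set()).add(subentry_dn)
            self.index_writes += 1

    def remove_subentry(self, subentry_dn: str) -> None:
        """Drop a subentry from every index record that mentions it."""
        for entry_id in list(self._index):
            self._index[entry_id].discard(subentry_dn)
            if not self._index[entry_id]:
                del self._index[entry_id]

    def subentries_for(self, entry_id: str) -> FrozenSet[str]:
        """Lookup that every operation would still have to perform."""
        return frozenset(self._index.get(entry_id, ()))


index = SubentrySideIndex()
index.add_subentry("cn=sub1,ou=system", ["entry-1", "entry-2"])
```

Even in this simplified form, both costs discussed in the thread are visible: `subentries_for` runs on every operation, and `add_subentry` still performs one write per selected entry, only against a different structure than the master table.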