directory-dev mailing list archives

From Alex Karasulu <>
Subject Re: Subentry cache : one step further
Date Mon, 03 Jan 2011 01:39:14 GMT
On Mon, Jan 3, 2011 at 2:58 AM, Emmanuel Lécharny <> wrote:

> On 1/3/11 1:38 AM, Alex Karasulu wrote:
>> On Mon, Jan 3, 2011 at 2:09 AM, Emmanuel Lécharny <> wrote:
>>> On 1/3/11 12:57 AM, Alex Karasulu wrote:
>>>> On Mon, Jan 3, 2011 at 1:27 AM, Emmanuel Lecharny <> wrote:
>>>>> Hi,
>>>>> SNIP
>>>>> (I still have in mind to add an optional computation of the entries
>>>>> when an AP or a Subentry are modified, to avoid a postponed evaluation).
>>>> Could you elaborate on this? I did not quite understand what you mean by
>>>> "an optional computation of the entries".
>>> We have three options here:
>>> - the current trunk implementation modifies the impacted entries
>>> immediately when a Subentry is added/removed/modified (using the
>>> SubtreeSpecification). It's costly, but only when we add/remove/modify a
>>> subentry.
>>> - the current branch I'm working on uses a deferred computation, i.e. the
>>> entry's relation to subentries is computed the first time the entry is
>>> accessed (either during an addition or a modification, or when read). That
>>> means the first read of an entry will imply a write on disk; the next read
>>> will be as fast as a normal read. OTOH, the first read of an entry is always
>>> costly, as we have to read the entry from the disk (unless it's in cache).
>>> - the third option, if we don't want to impact users when adding a subentry
>>> while the server is running, is to do as it's done in trunk, i.e. update all
>>> the entries when adding a subentry. But this would be an option that can be
>>> activated on the fly (by modifying the server configuration, or by sending a
>>> control with the subentry operation).
>>> I suggest we go for option #2 atm, assuming that implementing #3 is easy
>>> and won't imply a huge refactoring, as the mechanisms used to update the
>>> entries are already implemented.
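
As a rough illustration of how options #1 to #3 relate, here is a minimal Java sketch. It is not ApacheDS code: `SubentryUpdateMode`, `SubentryAddHandler`, the `override` parameter (standing in for a control sent with the operation) and the use of a `Predicate<String>` over DNs in place of a real SubtreeSpecification are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical switch between the eager (trunk) and deferred (branch) behaviours. */
enum SubentryUpdateMode { EAGER, DEFERRED }

/** Hypothetical handler for a subentry addition; not the ApacheDS interceptor API. */
final class SubentryAddHandler {
    // Server-wide default; option #3 amounts to being able to flip this at runtime.
    volatile SubentryUpdateMode mode = SubentryUpdateMode.DEFERRED;

    // Stand-in for the backend: just the DNs of the stored entries.
    final List<String> entryDns = new ArrayList<>();

    // DNs that were rewritten as part of the add operation itself.
    final List<String> rewritten = new ArrayList<>();

    // Sequence number kept on the administrative point (AP).
    long apSeqNumber = 0;

    /**
     * Handles a subentry addition. The override, when non-null, plays the role
     * of a per-operation control and takes precedence over the configured mode.
     * Returns the number of entries rewritten inside the operation.
     */
    int onSubentryAdded(Predicate<String> subtreeSpec, SubentryUpdateMode override) {
        SubentryUpdateMode effective = (override != null) ? override : mode;
        apSeqNumber++; // both modes record that the AP has changed

        if (effective == SubentryUpdateMode.DEFERRED) {
            return 0; // option #2: entries are fixed up later, on first access
        }

        // Options #1 and #3: walk the entries now and update every selected one.
        int count = 0;
        for (String dn : entryDns) {
            if (subtreeSpec.test(dn)) {
                rewritten.add(dn);
                count++;
            }
        }
        return count;
    }
}
```

With the default mode the call returns immediately with 0 rewritten entries; passing `SubentryUpdateMode.EAGER` as the override makes the same call pay the full cost up front.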
>> It's up to you, but IMO this option of delaying updates with subentry
>> changes is not really worth the complexity, and it also introduces other
>> serious issues. I wanted to express this thought, but you seemed very
>> interested in this direction so I let it be.
> In fact, the complexity is equivalent. You still have to update all the
> entries, which is pretty trivial. The only difference is that when you grab
> the entry, you have to check if it has a reference to a subentry's UUID and
> the same sequence number as its parent AP. But if we want to spare this
> extra processing, then option #3 can be triggered.
> OTOH, computing all the entries while we process a subentry
> addition/removal is quite complex, as we have no way to correctly handle a
> server shutdown occurring in the middle of such an operation. One more
> thing: as the operation will be costly, it's unlikely to be atomic.
>> Just as a quick idea of what I was thinking. Sometimes a search with the
>> right parameters pursuant to a subentry alteration affecting the selected
>> region may trigger the entire area to be computed anyway making the lazy
>> computation effectively the same thing as eager computation. But this time
>> the computation effort is felt on a search operation. This will make our
>> performance metrics tests even harder to interpret down the line as well.
> This is why I suggested in one previous mail I sent two weeks ago that if
> the admin does not want to face such impact, it's easy to do a full search
> and get the full base updated immediately.
>> Furthermore, don't we want to know if a subentry altering operation succeeds
>> when the administrator performs it? It might be nice to have the operation
>> block as well so the admin knows when the operation completes so he can let
>> users back on the system.
> There is no need to block with option #2, as the admin will know that the
> operation has succeeded as soon as he gets the response.

My point is he does not know that all the entries impacted by the admin op
have been massaged with #2. With #1 you know everything has been updated
when the operation returns even if it takes a long time.

> We then have all the needed information stored in the server to process
> any other operation leveraging the added Subentry :
> - the AP is present
> - the seqNumbers are updated in the AP
> - the subentry is present
> - the subentry caches are updated
>> Also when we get local transactions implemented in the server, subentry
>> alterations should be tied to a single atomic operation. If something fails
>> for some reason or another down the line while making the updates, don't you
>> want to know immediately and roll back?
> Something clearly straightforward with option #2, way more complicated if
> you have thousands of entries to roll back with option #1. In fact, I don't
> want to think what impact it could have with millions of entries being part
> of a subtreeSpecification...
> Keep in mind that with the deferred computation, knowing if the entry is
> associated with a subentry (or not) is just a matter of comparing the
> seqNumbers the entry has (or not) with its AP's seqNumber, and if it's
> older, then check if the entry is part of one subtree, and update its
> references to the subentry. All those operations are done in memory, and
> don't require any disk access, except to store the updated attributes. And
> even so, if we don't write back to the disk this updated entry, it doesn't
> matter. If the server stops abruptly, we will simply recompute those
> elements the next time.
> All in all, what bothers me the most with option #1 is the failure recovery:
> we don't have any mechanism to restart the processing in the middle of it.
> This is not an issue with #2.
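
The deferred check described in the quoted text can be sketched as follows, under some loud assumptions: `AdminPoint`, `DirEntry` and `DeferredSubentryEvaluator` are hypothetical names rather than ApacheDS classes, a single seqNumber per AP is used, and a DN-suffix match stands in for real SubtreeSpecification evaluation.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical administrative point: its seqNumber is bumped on every subentry change. */
final class AdminPoint {
    long seqNumber = 0;
    // Subentry UUID -> base DN; the base DN is a naive stand-in for a SubtreeSpecification.
    final Map<String, String> subentries = new HashMap<>();

    void addSubentry(String uuid, String baseDn) {
        subentries.put(uuid, baseDn);
        seqNumber++; // cheap: no entry is touched here
    }

    void removeSubentry(String uuid) {
        subentries.remove(uuid);
        seqNumber++;
    }
}

/** Hypothetical entry carrying the seqNumber it was last evaluated against. */
final class DirEntry {
    final String dn;
    long seqNumber = -1; // -1 means "never evaluated"
    final Set<String> subentryUuids = new HashSet<>();

    DirEntry(String dn) {
        this.dn = dn;
    }
}

final class DeferredSubentryEvaluator {
    /**
     * Brings the entry's subentry references up to date, entirely in memory.
     * Returns true when the entry was recomputed, so the caller may write it
     * back. Skipping that write is harmless: the next access recomputes.
     */
    static boolean refresh(DirEntry entry, AdminPoint ap) {
        if (entry.seqNumber >= ap.seqNumber) {
            return false; // same generation as the AP: nothing to do
        }
        entry.subentryUuids.clear();
        for (Map.Entry<String, String> subentry : ap.subentries.entrySet()) {
            if (entry.dn.endsWith(subentry.getValue())) {
                entry.subentryUuids.add(subentry.getKey());
            }
        }
        entry.seqNumber = ap.seqNumber;
        return true;
    }
}
```

Because the result depends only on the AP state and the entry's DN, losing an unwritten update in a crash only means that `refresh` returns true again on the next access.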
Yes I must admit that this is a huge advantage with #2 and extremely

What's giving me an icky feeling inside is that #2 is making a read
operation induce writes (albeit DSA-driven maintenance operations) and
hence causing us to consider one-offs in the way a change log works. It's
also something to be considered for replication as a change to be ignored.
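
To make the change log concern concrete, here is a small hypothetical sketch of the kind of one-off this implies: every write would need to carry its origin so that the change log and replication can skip the DSA-driven ones. `ChangeOrigin`, `ChangeRecord` and `ChangeLogFilter` are invented for the example and do not describe how the ApacheDS change log works.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical marker telling a client-requested change from a DSA-driven fix-up. */
enum ChangeOrigin { CLIENT, DSA_MAINTENANCE }

/** Hypothetical change record; a real change log stores far more than this. */
final class ChangeRecord {
    final String dn;
    final ChangeOrigin origin;

    ChangeRecord(String dn, ChangeOrigin origin) {
        this.dn = dn;
        this.origin = origin;
    }
}

final class ChangeLogFilter {
    /** Returns only the changes a change log or replication consumer should see. */
    static List<ChangeRecord> visibleChanges(List<ChangeRecord> all) {
        List<ChangeRecord> visible = new ArrayList<>();
        for (ChangeRecord change : all) {
            // Writes induced by a read under option #2 are maintenance, not user changes.
            if (change.origin == ChangeOrigin.CLIENT) {
                visible.add(change);
            }
        }
        return visible;
    }
}
```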

My whole religious issue with approach #2 is that it's essentially an
optimization for an administrative operation that is forcing us to settle
for this path because of a lack of solid atomicity based on local
transactions in the server. We're shifting many ideals we all believe in, if
I am not mistaken, to optimize a seldom performed operation that usually
occurs during outage times.

Alex Karasulu