directory-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stefan Seelmann <seelm...@apache.org>
Subject Re: Alias handling issues
Date Wed, 15 Jun 2011 17:11:20 GMT
On Wed, Jun 15, 2011 at 7:01 PM, Emmanuel Lecharny <elecharny@gmail.com> wrote:
> On 6/15/11 6:17 PM, Pierre-Arnaud Marcelot wrote:
>>
>> On 15 juin 2011, at 18:01, Emmanuel Lecharny wrote:
>>
>>> On 6/15/11 5:19 PM, Emmanuel Lecharny wrote:
>>>>
>>>> Hi,
>>>>
>>>> sadly, things are not as rosy as I exepected...
>>>>
>>>> I have written a few tests to check if we correctly detect cycles while
>>>> processing a search, and at some point, I found that cycle are not an issue
>>>> at all. But wait, it's not a good news :/
>>>>
>>>> Let's say you have the following entries :
>>>>
>>>> cn=test
>>>>  cn=foo,cn=test
>>>>    cn=barAlias,cn=foo,cn=test ->  cn=bar,cn=test
>>>>  cn=bar,cn=test
>>>>    cn=dohAlias,cn=foo,cn=test ->  cn=doh,cn=test
>>>>  cn=doh,cn=test
>>>>
>>>> Logically, doing a SUBTREE search on cn=foo,cn=test, you should get the
>>>> following entries :
>>>> cn=foo,cn=test
>>>> cn=bar,cn=test
>>>> cn=doh,cn=test
>>>>
>>>> This is *not* what we get. Currently, you'll have :
>>>> cn=foo,cn=test
>>>> cn=bar,cn=test
>>>>
>>>> The second alias dereferencing is never done.
>>>>
>>>> Obviously, it helps when it come sto avoid cycle, but this is certainly
>>>> not the expected behavior.
>>>>
>>>> Now, if we want to do alias chasing on the server, we will have to
>>>> modify the way we handle alias : each one of them will issue a new search,
>>>> wth a new cursor.
>>>>
>>>> Hopefully, stacking the aliases to be processed works well with the fact
>>>> that we have to stack the aliases for cycle detection. One possible solution
>>>> would be to process this stack alias after alias, and if we get back an
>>>> alias, we add it in the stack if it's not already present (otherwise, that
>>>> means we have a cycle).
>>>>
>>>> In our sample, we will have the following stack :
>>>> () : empty stack
>>>> (cn=barAlias,cn=foo,cn=test) : first alias met
>>>> ->  here, we have processed all the entries for the initial search
>>>>  [cn=foo,cn=test]
>>>>  [cn=bar,cn=test] (the dereferenced alias)
>>>> ->  now, we get the leftmost element in the stack, and launch a new
>>>> search
>>>> (<cn=barAlias,cn=foo,cn=test>) : this alias is being processed, thus
>>>> the<>
>>>> (<cn=barAlias,cn=foo,cn=test>, cn=dohAlias,cn=foo,cn=test) : second
>>>> alias met
>>>> ->  again, all the entries have been processed, we take the next alias
>>>> in the stack
>>>>  [cn=doh,cn=test]
>>>> (<cn=barAlias,cn=foo,cn=test>,<cn=dohAlias,cn=foo,cn=test>) :
second
>>>> alias met
>>>> ->  there are no further entries. We are done
>>>>
>>>> Seems to work... Did I miss something ?
>>>
>>> Yes, I missed something (thanks Pierre-Arnaud for pointing this out) :
>>>
>>> if the alias get dereferenced to an entry below one of the DN already
>>> stored in the stack, or to the descendant of one of the stored DNs, then we
>>> don't need to proceed the search for this alias, as the entries have already
>>> been provided. That will avoid sending duplicate entries to the user.
>>
>> Yeah, a basic example would be to execute the same search as above but one
>> level higher, from "cn=test".
>>
>> In that case, the first search will already give us the three resulting
>> entries:
>> - cn=foo,cn=test
>> - cn=bar,cn=test
>> - cn=doh,cn=test
>>
>> And then, in the other searches, triggered when following aliases, some of
>> these entries would match again. For instance:
>> - cn=bar,cn=test
>> - cn=doh,cn=test
>>
>> We have to make sure that we don't send them multiple times to the user.
>> Maybe by storing a list of all the DNs we already returned to the user...
>> But that might be a little bit overkill and suck too much memory.
>
> By storing the root DN of each consecutive search, we will avoid such
> problem.
>
> In this case, the initial search is done with cn=foo,cn=test, and this is
> the stored DN.
> The first search will just returns the cn=foo,cn=test entry and the
> cn=barAlias,cn=foo,cn=test entry, which will be dereferenced.
>
> This dereferemced operation consists on doing a search on the aliasedObject
> stored in the alias. As it's a new search, we store the base DN
> (cn=bar,cn=test), and we launch the new search. Etc...
>
> Should work pretty well without having to store all the entries' DNs

Yes, that stored DNs help for both, (1) avoiding duplicates and (2)
avoiding cycles. But remember, for millions of aliases you have to
store millions of DNs ;-)

Kind Regards,
Stefan

Mime
View raw message