directory-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Emmanuel L├ęcharny <elecha...@apache.org>
Subject Re: Alias handling issues
Date Wed, 15 Jun 2011 17:23:50 GMT
On 6/15/11 7:11 PM, Stefan Seelmann wrote:
> On Wed, Jun 15, 2011 at 7:01 PM, Emmanuel Lecharny<elecharny@gmail.com>  wrote:
>> On 6/15/11 6:17 PM, Pierre-Arnaud Marcelot wrote:
>>> On 15 juin 2011, at 18:01, Emmanuel Lecharny wrote:
>>>
>>>> On 6/15/11 5:19 PM, Emmanuel Lecharny wrote:
>>>>> Hi,
>>>>>
>>>>> sadly, things are not as rosy as I exepected...
>>>>>
>>>>> I have written a few tests to check if we correctly detect cycles while
>>>>> processing a search, and at some point, I found that cycle are not an
issue
>>>>> at all. But wait, it's not a good news :/
>>>>>
>>>>> Let's say you have the following entries :
>>>>>
>>>>> cn=test
>>>>>   cn=foo,cn=test
>>>>>     cn=barAlias,cn=foo,cn=test ->    cn=bar,cn=test
>>>>>   cn=bar,cn=test
>>>>>     cn=dohAlias,cn=foo,cn=test ->    cn=doh,cn=test
>>>>>   cn=doh,cn=test
>>>>>
>>>>> Logically, doing a SUBTREE search on cn=foo,cn=test, you should get the
>>>>> following entries :
>>>>> cn=foo,cn=test
>>>>> cn=bar,cn=test
>>>>> cn=doh,cn=test
>>>>>
>>>>> This is *not* what we get. Currently, you'll have :
>>>>> cn=foo,cn=test
>>>>> cn=bar,cn=test
>>>>>
>>>>> The second alias dereferencing is never done.
>>>>>
>>>>> Obviously, it helps when it come sto avoid cycle, but this is certainly
>>>>> not the expected behavior.
>>>>>
>>>>> Now, if we want to do alias chasing on the server, we will have to
>>>>> modify the way we handle alias : each one of them will issue a new search,
>>>>> wth a new cursor.
>>>>>
>>>>> Hopefully, stacking the aliases to be processed works well with the fact
>>>>> that we have to stack the aliases for cycle detection. One possible solution
>>>>> would be to process this stack alias after alias, and if we get back
an
>>>>> alias, we add it in the stack if it's not already present (otherwise,
that
>>>>> means we have a cycle).
>>>>>
>>>>> In our sample, we will have the following stack :
>>>>> () : empty stack
>>>>> (cn=barAlias,cn=foo,cn=test) : first alias met
>>>>> ->    here, we have processed all the entries for the initial search
>>>>>   [cn=foo,cn=test]
>>>>>   [cn=bar,cn=test] (the dereferenced alias)
>>>>> ->    now, we get the leftmost element in the stack, and launch a
new
>>>>> search
>>>>> (<cn=barAlias,cn=foo,cn=test>) : this alias is being processed,
thus
>>>>> the<>
>>>>> (<cn=barAlias,cn=foo,cn=test>, cn=dohAlias,cn=foo,cn=test) : second
>>>>> alias met
>>>>> ->    again, all the entries have been processed, we take the next
alias
>>>>> in the stack
>>>>>   [cn=doh,cn=test]
>>>>> (<cn=barAlias,cn=foo,cn=test>,<cn=dohAlias,cn=foo,cn=test>)
: second
>>>>> alias met
>>>>> ->    there are no further entries. We are done
>>>>>
>>>>> Seems to work... Did I miss something ?
>>>> Yes, I missed something (thanks Pierre-Arnaud for pointing this out) :
>>>>
>>>> if the alias get dereferenced to an entry below one of the DN already
>>>> stored in the stack, or to the descendant of one of the stored DNs, then
we
>>>> don't need to proceed the search for this alias, as the entries have already
>>>> been provided. That will avoid sending duplicate entries to the user.
>>> Yeah, a basic example would be to execute the same search as above but one
>>> level higher, from "cn=test".
>>>
>>> In that case, the first search will already give us the three resulting
>>> entries:
>>> - cn=foo,cn=test
>>> - cn=bar,cn=test
>>> - cn=doh,cn=test
>>>
>>> And then, in the other searches, triggered when following aliases, some of
>>> these entries would match again. For instance:
>>> - cn=bar,cn=test
>>> - cn=doh,cn=test
>>>
>>> We have to make sure that we don't send them multiple times to the user.
>>> Maybe by storing a list of all the DNs we already returned to the user...
>>> But that might be a little bit overkill and suck too much memory.
>> By storing the root DN of each consecutive search, we will avoid such
>> problem.
>>
>> In this case, the initial search is done with cn=foo,cn=test, and this is
>> the stored DN.
>> The first search will just returns the cn=foo,cn=test entry and the
>> cn=barAlias,cn=foo,cn=test entry, which will be dereferenced.
>>
>> This dereferemced operation consists on doing a search on the aliasedObject
>> stored in the alias. As it's a new search, we store the base DN
>> (cn=bar,cn=test), and we launch the new search. Etc...
>>
>> Should work pretty well without having to store all the entries' DNs
> Yes, that stored DNs help for both, (1) avoiding duplicates and (2)
> avoiding cycles. But remember, for millions of aliases you have to
> store millions of DNs ;-)
Too bad for the millions of aliases :)

Those using too many aliases may :
- either have a very badly designed database
- or managing the CIA employees...

-- 
Regards,
Cordialement,
Emmanuel L├ęcharny
www.iktek.com


Mime
View raw message