On Tue, Dec 16, 2008 at 10:47 AM, Emmanuel Lecharny <elecharny@gmail.com> wrote:
Hi guys,

after the paged search control, I'm trying to clean up some issues we have with Aliases. This raises some questions I have about the way we handle aliases.

We should not have "issues" with aliases but there are restrictions on how you can use them.  Basically there are 3 specialized system indices used by the server to manage searches when alias dereferencing is enabled.  These indices, as you know, are partition specific since the ID used for an entry is a per-partition identifier.  This means the ID cannot be used across partitions, so an alias can only point to the DN of an entry in the same partition as the alias itself.

This is the only limitation of using aliases in ApacheDS. 

First, this is not a common feature ( It's not mandatory to handle aliases in an LDAP server, and currently, AFAIK, OpenDS, SunDS and AD don't support aliases, while OpenLDAP does ).

Aliases suck for the search implementation but I think we have an elegant and inexpensive way to deal with them.

We are supporting Aliases, to a certain extent. But it may be :
- not complete
- and inefficient

I don't know about the inefficient point since I have not tested it extensively, but the alias handling mechanism is as good as I can find based on earlier research documented on this problem for directory servers.  I do know that the feature is complete but has the one limitation presented above.  This limitation can be lifted, however, and the implementation can be simplified, but serious architectural shifts in the partition subsystem are needed.  Namely the id property will need to be exposed.  We discussed this before with using some bits of a long id as a partition id and the rest as the entry id.  Remember? 1st 8 bits are for partition id, and remaining 48 will be for the entry?
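To make the id split concrete, here is a minimal sketch of packing a partition id and an entry id into a single long. It assumes the 8-bit / 48-bit widths mentioned above (which leaves the top 8 bits of the long unused); the class and method names are hypothetical and are not existing ApacheDS API.

```java
/** Hypothetical helper: packs a partition id and an entry id into one long. */
final class CompositeId {
    // Widths taken from the discussion: 8 bits for the partition, 48 for the entry.
    private static final int PARTITION_BITS = 8;
    private static final int ENTRY_BITS = 48;
    private static final long PARTITION_MASK = (1L << PARTITION_BITS) - 1;
    private static final long ENTRY_MASK = (1L << ENTRY_BITS) - 1;

    private CompositeId() {
    }

    /** Combines both ids; the partition id sits directly above the entry id bits. */
    static long pack(int partitionId, long entryId) {
        if (partitionId < 0 || partitionId > PARTITION_MASK) {
            throw new IllegalArgumentException("partition id out of range: " + partitionId);
        }
        if (entryId < 0 || entryId > ENTRY_MASK) {
            throw new IllegalArgumentException("entry id out of range: " + entryId);
        }
        return ((long) partitionId << ENTRY_BITS) | entryId;
    }

    /** Extracts the partition id from a packed id. */
    static int partitionId(long id) {
        return (int) ((id >>> ENTRY_BITS) & PARTITION_MASK);
    }

    /** Extracts the per-partition entry id from a packed id. */
    static long entryId(long id) {
        return id & ENTRY_MASK;
    }
}
```

With an id like this, an alias index could store a target from any partition, since the packed value identifies both the partition and the entry within it.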

We are not going to address #2: first, I haven't made any measurements to see if it's efficient or not, and second, it's not time for such improvements. So my questions will be mostly about #1.

First, we have an issue that has been open since version 1.0 : DIRSERVER-803 (https://issues.apache.org/jira/browse/DIRSERVER-803). This is a special case : we create an alias which is linked to an ancestor. Questions :
1) Should we allow such aliases ? Currently, it's rejected and seen as a loop.

This is another subtle limitation due again to how we have implemented it.  You kind of have to understand how the search algorithm is designed to handle the search when alias dereferencing is enabled while searching.  We can discuss these details but it will be a lengthy conversation which we might want to take offline.  Much of this might actually be documented too but I have to check.

2) If we allow such aliases, how should we detect that we are not looping when doing a subtree search ?

For now I would not bother and keep the restriction which rejects such constructs.
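For illustration only, the kind of check that rejects an alias pointing back at its own ancestor can be sketched with the JDK's javax.naming.ldap.LdapName. This is not the server's actual code; the class and method names are hypothetical, and it compares DNs as parsed by LdapName without any schema-aware normalization.

```java
import javax.naming.InvalidNameException;
import javax.naming.ldap.LdapName;

/** Hypothetical check: does an alias point at itself or at one of its ancestors? */
final class AliasAncestorCheck {
    private AliasAncestorCheck() {
    }

    static boolean targetsSelfOrAncestor(String aliasDn, String targetDn) {
        try {
            LdapName alias = new LdapName(aliasDn);
            LdapName target = new LdapName(targetDn);
            // LdapName indexes RDNs from the root (rightmost) end, so the target is
            // the alias itself or an ancestor exactly when it is a prefix of the alias name.
            return alias.startsWith(target);
        } catch (InvalidNameException e) {
            throw new IllegalArgumentException("Invalid DN", e);
        }
    }
}
```

A subtree search that dereferences such an alias would re-enter the subtree it started from, which is why the construct is treated as a loop.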

3) What about cross referencing alias (ie alias A refers to alias B and vice versa) ?

This is different I guess. This case represents alias chaining: A->B->Non-Alias Entry.  The original question was about an alias referring to an ancestor which causes some issues for search.  Or at least it used to and perhaps now with the changes we've made it may no longer be the case but someone needs to evaluate the impact and determine if this limitation should be lifted.
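To show how chaining differs from the ancestor case, here is a small sketch that follows a chain A->B->entry and stops if it ever revisits an alias (the A->B->A situation from the question). The map is a hypothetical in-memory stand-in for an alias index, DNs are compared as plain strings, and none of these names exist in ApacheDS.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Hypothetical resolver: follows alias chains and detects cycles. */
final class AliasChainResolver {
    private AliasChainResolver() {
    }

    /**
     * Follows aliases starting at startDn until a non-alias DN is reached.
     * Returns an empty Optional if the chain loops back on itself.
     */
    static Optional<String> resolve(Map<String, String> aliasTargets, String startDn) {
        Set<String> visited = new HashSet<>();
        String current = startDn;
        while (aliasTargets.containsKey(current)) {
            if (!visited.add(current)) {
                return Optional.empty(); // this alias was already seen: cycle
            }
            current = aliasTargets.get(current);
        }
        return Optional.of(current);
    }
}
```

The visited set only grows with the number of aliases in one chain, so the cost stays small as long as chains are short.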

We have options, as we maintain a cache internally with all the entries which are not aliases ( a bit like what we do with referrals), but as we may have orders of magnitude more non-aliases than aliases, this sounds a bit weird to me. Shouldn't we create a cache of entries which are referrals instead of the opposite ?

You need both I think.  Again this is a big discussion.

This is what we do for the referrals, and we initiate this cache when starting the server, reading all the entries with the referral AT.

Well there is a difference here but I don't know how much it matters.  First the aliases have system indices whereas referrals do not have indices but they could as user indices.  Both will impact search.

Also another question : we are handling aliases in the ExceptionInterceptor. Wouldn't it be better to do so in an AliasInterceptor ?

Wdyt ?

Maybe. I don't recall enough information at this point.  But the bottom line is this is hairy stuff that should not be changed lightly.  I would not mess with it until the big picture impact is understood or else it will result in a lot of pain. Also any work in this area needs to be conducted in a branch because even the slightest changes can have a massive impact on search.

PS: we may defer the Alias modifications to another version, as it may take more than a couple of days to fix it. Also we need more tests for aliases : currently, I'm not sure we have any !

We did have tests for aliases.  We should still have them but who knows.  You're absolutely right about the number of days it would take.  But the bottom line is the feature is complete, yet by design some specific limitations were purposefully applied.  Think about it: how many times do you need alias chaining and aliases back to ancestors?