directory-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Emmanuel L├ęcharny <elecha...@apache.org>
Subject Re: Rdn and Ava cleanup
Date Thu, 24 Feb 2011 13:39:39 GMT
On 2/24/11 3:01 AM, Alex Karasulu wrote:
> On Wed, Feb 23, 2011 at 12:14 PM, Emmanuel Lecharny<elecharny@gmail.com>  wrote:
>> Hi guys,
>>
>> those last 2 days I did some cleanup in the Rdn and Ava classes, in th
>> espirit of Dn cleanup. Here is a summary of what has been done, roughly :
>>
>> - injection of the SchemaManager in both classes' constructors
>> - removing of the compareTo method
>> - made some methods private
>> - made the classes final
>> - removed the Comparable interface from the equation
>>
>> One important modification is the last one : it makes no sense to have a Rdn
>> be comparable. First, how do we compare a cn and a jpegPhoto ? Second, how
>> do we compare two RDN which attributeType does not have an equality matching
>> rule ?
>>
>> Of course, this has an impact in the way the backend works, as the Rdn index
>> needs to be able to do ordered comparison between Rdn, as this index is a
>> BTree. What I did is that I replaced the Rdn.compareTo(Rdn) method by a
>> direct String comparison in Jdbm between the Rdn's normalized name. That
>> does work.
>>
>> It made me think that maybe using a hashed index for Rdn is probably a
>> better idea, because then we won't need this comparison to be done (the
>> equals method would be enough) and also because it would be faster (finding
>> an element in a Hash table is an O(1) operation - at least, theorically -
>> when looking in a BTree is an O(log2(N)))
>>
>> thoughts on this last point?
> I just replied to Stefan's post first but I thought about this after
> my reply. Here's what came to mind:
>
> (1) Let's presume we back the one and sublevel indices with the RDN index.
> (2) Do we need to sort either of these three indices when they are
> coupled this way or all independent?
>
> We don't need to really sort them, we just need to be able to walk the
> index producing candidates for one level and sublevel relationships.
> We could probably still do this and get away without using a sorted
> B+Tree.
>
> The thing is we cannot apply certain assertions any longer with a hash
> like using the>=.<=, and substring assertions.
DN syntax and the ATs that use it does not provide a Substring or 
Ordering MR. Does not make a lot of sense, IMO (comparing DN for 
equality is ok, trying to order them is just a vain thing)

> But this is not
> something performed on these indices anyway. You do these with respect
> to an attribute type which that might be indexed and do not use these
> operators on one level, or sub level inquiries.
>
> So yeah I think this can be done and may have big performance
> benefits. But it's a big shift.
This is an interesting side effect of my question :) I haven't expected 
such a wide modification proposal when I asked about using Hash instead 
of BTree, but this is were it's start to become interesting ! 
Definitively to be investigated, and hopefully Stefan has already 
started to dig in this direction with his Hashbase implementation.
> Fun project too.
Sure is !


-- 
Regards,
Cordialement,
Emmanuel L├ęcharny
www.iktek.com


Mime
View raw message