directory-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alex Karasulu <akaras...@apache.org>
Subject Re: Rdn and Ava cleanup
Date Thu, 24 Feb 2011 02:01:01 GMT
On Wed, Feb 23, 2011 at 12:14 PM, Emmanuel Lecharny <elecharny@gmail.com> wrote:
> Hi guys,
>
> those last 2 days I did some cleanup in the Rdn and Ava classes, in th
> espirit of Dn cleanup. Here is a summary of what has been done, roughly :
>
> - injection of the SchemaManager in both classes' constructors
> - removing of the compareTo method
> - made some methods private
> - made the classes final
> - removed the Comparable interface from the equation
>
> One important modification is the last one : it makes no sense to have a Rdn
> be comparable. First, how do we compare a cn and a jpegPhoto ? Second, how
> do we compare two RDN which attributeType does not have an equality matching
> rule ?
>
> Of course, this has an impact in the way the backend works, as the Rdn index
> needs to be able to do ordered comparison between Rdn, as this index is a
> BTree. What I did is that I replaced the Rdn.compareTo(Rdn) method by a
> direct String comparison in Jdbm between the Rdn's normalized name. That
> does work.
>
> It made me think that maybe using a hashed index for Rdn is probably a
> better idea, because then we won't need this comparison to be done (the
> equals method would be enough) and also because it would be faster (finding
> an element in a Hash table is an O(1) operation - at least, theorically -
> when looking in a BTree is an O(log2(N)))
>
> thoughts on this last point?

I just replied to Stefan's post first but I thought about this after
my reply. Here's what came to mind:

(1) Let's presume we back the one and sublevel indices with the RDN index.
(2) Do we need to sort either of these three indices when they are
coupled this way or all independent?

We don't need to really sort them, we just need to be able to walk the
index producing candidates for one level and sublevel relationships.
We could probably still do this and get away without using a sorted
B+Tree.

The thing is we cannot apply certain assertions any longer with a hash
like using the >=. <=, and substring assertions. But this is not
something performed on these indices anyway. You do these with respect
to an attribute type which that might be indexed and do not use these
operators on one level, or sub level inquiries.

So yeah I think this can be done and may have big performance
benefits. But it's a big shift.

Fun project too.

Regards,
Alex

Mime
View raw message