directory-dev mailing list archives

From Alex Karasulu <aok...@bellsouth.net>
Subject Re: Various questions
Date Tue, 06 Jun 2006 02:01:48 GMT
David Boreham wrote:

>
>> I think the concept of applying all indexing to attributes is in
>> itself broken.  
>
>
> Not necessarily. It's 'un-broken' if your goal is ease of configuration.
> I agree that if the goal is ultimate performance for updates, then
> selective indexing is the way to go. However, for a wide range of
> deployments, today's hardware delivers performance far in excess of
> what's required.
> Sometimes sacrificing performance in the interests of ease of use is a
> good idea.
>
Ease of use and simplicity were the primary concerns.  However, we did
strike a balance between simplicity of implementation, utility,
performance and ease of use.  Let me explain.

(1) Implementing approx search and approx indices using soundex just did
not make much sense given the amount of complexity it would introduce.
It's a feature that is seldom used, and when it is used it is mostly for
ad hoc human queries.

(2) Soundex and similar algorithms are all language-dependent.  This
makes it very difficult to implement the feature properly across languages.

(3) Defaulting ~= (approx match) to a simple equality match operation
still complies with the RFCs.

(4) Without the need for approximate matching, all that remained was
substring, equality, and existence.  A single existence index is used
for all attributes.  If an attribute is indexed at all, existence entries
for that attribute will be added to the existence index.  That leaves
equality and substring matching, which are handled elegantly using a
combination of regular expression matching and index walking.  I can go
into the algorithm used if people are interested.

(5) If we implemented approx matching and allowed indices for it, we
would have to double the number of indices per attribute the user wanted
indexed.  Furthermore, the number of index entries per attribute value can
be large.  That would slow down the server when performing write
operations.  Avoiding this overhead is why writes are fast in ApacheDS.

(6) Users, we found, get confused by too many different kinds of
indices.  The server can make some simple decisions for the general case
and still outperform a traditional approach on the same platform.
Since indexing is an implementation detail, we can take this liberty.
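
To make points (3) and (4) a bit more concrete, here is a rough sketch in
Python.  It is not the ApacheDS code (ApacheDS is written in Java), and the
names used (TinyStore, present, equals, approx, substring) are invented
purely for illustration.  It assumes an in-memory store with one value
index per indexed attribute, a single existence index shared by all indexed
attributes, ~= falling back to plain equality, and substring filters
answered by walking the value index and testing each key against a regular
expression.

```python
import re
from collections import defaultdict


class TinyStore:
    """Toy in-memory store; a hypothetical illustration only."""

    def __init__(self, indexed_attrs):
        self.indexed = set(indexed_attrs)
        # One value index per indexed attribute: attr -> value -> entry ids.
        self.value_index = {a: defaultdict(set) for a in self.indexed}
        # A single existence index shared by every indexed attribute.
        self.existence = defaultdict(set)

    def add(self, entry_id, attrs):
        for attr, values in attrs.items():
            if attr not in self.indexed:
                continue  # unindexed attributes cost nothing on write
            self.existence[attr].add(entry_id)
            for value in values:
                self.value_index[attr][value.lower()].add(entry_id)

    # The lookups below only handle indexed attributes; an unindexed
    # attribute would need a full scan, which this sketch leaves out.

    def present(self, attr):
        """(attr=*) answered from the shared existence index."""
        return set(self.existence.get(attr, ()))

    def equals(self, attr, value):
        """(attr=value) answered from the attribute's value index."""
        return set(self.value_index[attr].get(value.lower(), ()))

    def approx(self, attr, value):
        """(attr~=value) simply falls back to an equality match."""
        return self.equals(attr, value)

    def substring(self, attr, pattern):
        """(attr=jo*n): walk the value index, match keys with a regex."""
        parts = pattern.lower().split("*")
        regex = re.compile(".*".join(re.escape(p) for p in parts))
        hits = set()
        for value, ids in self.value_index[attr].items():
            if regex.fullmatch(value):
                hits |= ids
        return hits
```

With entries such as cn=John Smith and cn=Joan Jett, substring("cn", "jo*")
walks the cn index and returns both, while approx("cn", "Jon Smith") finds
nothing because it is treated as an exact match.  Each indexed attribute
costs one existence entry plus one value entry per value on write, with no
extra approx index to maintain.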

In conclusion, we made a judgment call that works for us.  If your
specific needs require a server with approx search and indices on
attributeTypes based on approx search, then you have other options.

Personally I'd recommend FDS as the best alternative to ApacheDS for all
these specific features and overall robustness.

Alex


