directory-dev mailing list archives

From "Emmanuel Lecharny (JIRA)" <j...@apache.org>
Subject [jira] Created: (DIRSERVER-800) Avoid normalization when searching
Date Sat, 16 Dec 2006 14:00:21 GMT
Avoid normalization when searching
----------------------------------

                 Key: DIRSERVER-800
                 URL: http://issues.apache.org/jira/browse/DIRSERVER-800
             Project: Directory ApacheDS
          Issue Type: Improvement
            Reporter: Emmanuel Lecharny
            Priority: Critical
             Fix For: 1.5.0


OK, this will be a huge improvement, but also a huge modification. Here is the rationale:
- each time we search for an entry using an attribute value, we walk a B-Tree, comparing
the given value with the stored values. If we have N values stored in the B-Tree, we will
do something like log2(N) comparisons.
- Now, we have to be aware that those comparisons must be done against the normalized value
of the attribute, using the associated MatchingRule. For instance, "Emmanuel", "EMMANUEL"
and "emmanuel" are supposed to be the same value if their type is CommonName, so we must do
a case-insensitive comparison.
- a first optimization is already implemented: the incoming value is normalized _before_
the search, so we avoid normalizing the incoming attribute value at every comparison
- but we still normalize each stored value that we compare against during the search
- again, an optimization is implemented: a cache is used to store normalized values, so
sometimes we just get the normalized value from the cache.
- but this cache has a limited size, and each time we hit it, a synchronization occurs,
which slows down concurrent accesses.
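
The lookup described in the list above can be sketched roughly as follows. This is only an
illustration, not the ApacheDS code: the class NormalizingLookupSketch and its methods are
hypothetical, a sorted list with a binary search stands in for the B-Tree, and trimming plus
lower-casing stands in for the real MatchingRule normalization.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: not the ApacheDS implementation.
class NormalizingLookupSketch {
    // Deliberately tiny so that evictions (and re-normalizations) happen.
    static final int CACHE_SIZE = 2;

    // Bounded LRU cache of raw value -> normalized value.
    // Every access goes through a single lock, as described above.
    static final Map<String, String> CACHE = Collections.synchronizedMap(
        new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > CACHE_SIZE;
            }
        });

    // Assumed stand-in for a case-insensitive normalizer.
    static String normalize(String value) {
        return value.trim().toLowerCase(Locale.ROOT);
    }

    // Normalizes a stored value, going through the synchronized cache.
    static String cachedNormalize(String stored) {
        return CACHE.computeIfAbsent(stored, NormalizingLookupSketch::normalize);
    }

    // 'stored' holds raw values, sorted by their normalized form.
    // Returns the position of the match, or -1 if there is none.
    static int find(List<String> stored, String incoming) {
        String probe = normalize(incoming); // normalized once, before the search
        int lo = 0;
        int hi = stored.size() - 1;
        while (lo <= hi) {                  // about log2(N) iterations
            int mid = (lo + hi) >>> 1;
            // Each step normalizes a stored value (or fetches it from the cache).
            int cmp = cachedNormalize(stored.get(mid)).compareTo(probe);
            if (cmp == 0) {
                return mid;
            }
            if (cmp < 0) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> stored = Arrays.asList("alice", "Bob", "EMMANUEL", "Zoe");
        System.out.println(find(stored, "Emmanuel")); // prints 2
    }
}
```

In this sketch, every step of the search either re-normalizes a stored value or takes the
cache lock, which is the cost the issue is about.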

At this point, the question is:
- why don't we store the normalized attribute values in the indexes instead of the raw
values? The original (non-normalized) value is always present in the entry, so there is
no need to keep it in the indexes.
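
A minimal sketch of that idea, under assumptions: NormalizedIndexSketch and its methods are
hypothetical names, a TreeMap stands in for the B-Tree index, lower-casing stands in for the
MatchingRule normalization, and the index holds one entry per normalized value for brevity.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: not the ApacheDS implementation.
class NormalizedIndexSketch {
    // Entry store: keeps the value exactly as the user supplied it.
    private final Map<Long, String> entries = new HashMap<>();
    // Index: keyed by the normalized value, computed once at write time.
    private final TreeMap<String, Long> index = new TreeMap<>();

    // Assumed stand-in for a case-insensitive normalizer.
    static String normalize(String value) {
        return value.trim().toLowerCase(Locale.ROOT);
    }

    void add(long entryId, String commonName) {
        entries.put(entryId, commonName);
        index.put(normalize(commonName), entryId);
    }

    // Only the incoming value is normalized; stored keys are compared as-is.
    // Returns the original value of the matching entry, or null if none matches.
    String search(String incoming) {
        Long id = index.get(normalize(incoming));
        return id == null ? null : entries.get(id);
    }

    public static void main(String[] args) {
        NormalizedIndexSketch dir = new NormalizedIndexSketch();
        dir.add(1L, "Emmanuel");
        System.out.println(dir.search("EMMANUEL")); // prints "Emmanuel"
    }
}
```

With this layout, a search no longer needs a normalization cache for the stored values, so
the synchronization on that cache goes away; the price is normalizing once on every write.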

This optimization could have a huge impact if it turns out that, under heavy load, the cache
synchronization delays access.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
