directory-dev mailing list archives

From "Emmanuel Lecharny (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DIRSERVER-933) Slow searches using a non-indexed attribute in a filter
Date Wed, 16 May 2007 17:17:16 GMT

    [ https://issues.apache.org/jira/browse/DIRSERVER-933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496353 ]

Emmanuel Lecharny commented on DIRSERVER-933:
---------------------------------------------

Interesting...

If you do a search starting at some point of the tree (dc=example,dc=org) with a filter mapped
on a non-indexed attribute, then it is normal that you have to search through all the
values.

*BUT*, what should be done is to first count the number of elements starting at the
"dc=example,dc=org" point. We have to check whether the counter associated with the
hierarchical index we have is used or not ($100 bet on *not*, as you got a full scan ;)
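
To make the idea concrete, here is a minimal sketch of what using that counter could look
like. It is not ApacheDS code: ScanChoiceSketch, chooseScan and UNKNOWN are hypothetical
names, and the two counts are assumed to come from the hierarchical index and from the
optimizer's annotation of the filter.

```java
// Hypothetical sketch, not the ApacheDS optimizer.
final class ScanChoiceSketch {
    // Assumed convention from the issue: "no index, count unknown" is Long.MAX_VALUE.
    static final long UNKNOWN = Long.MAX_VALUE;

    // Pick the cheaper way to enumerate candidates.
    // scopeCount:  number of entries under the search base (from the hierarchical index).
    // filterCount: number of entries matching the filter, or UNKNOWN if not indexed.
    static String chooseScan(long scopeCount, long filterCount) {
        // Compare as long so that UNKNOWN stays the worst case.
        return scopeCount <= filterCount ? "scope" : "filter";
    }

    public static void main(String[] args) {
        // One entry under the base, filter on a non-indexed attribute.
        System.out.println(chooseScan(1L, UNKNOWN));
        // Large subtree, selective indexed filter.
        System.out.println(chooseScan(1_000_000L, 3L));
    }
}
```

With a non-indexed filter the scope count always wins, so only the entries under the base
would be tested against the filter.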



> Slow searches using a non-indexed attribute in a filter
> -------------------------------------------------------
>
>                 Key: DIRSERVER-933
>                 URL: https://issues.apache.org/jira/browse/DIRSERVER-933
>             Project: Directory ApacheDS
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 1.5.0
>            Reporter: Martin Alderson
>            Priority: Critical
>             Fix For: 1.5.1
>
>
> When searching for entries in a specific container with a filter such as (cn=*) and the
> cn attribute is not indexed, the server has to test each entry in the partition even when
> the search has been restricted to a container.
> As an example of how bad this could be - if a partition contains millions of entries
> and the user does a search in that partition within a container that only contains 1 of those
> entries, every entry in the partition is checked in turn even though the server knows there
> is only one entry within the specified container.
> This is due to the search optimizer which annotates each part of the filter with the
> number of entries that match where it can.  For those it can't (such as with attributes that
> are not indexed) this 'count' will default to Long.MAX_VALUE - to indicate that it is the
> worst case.  (See org.apache.directory.server.core.partition.impl.btree.DefaultOptimizer).
> When these count annotations are checked to decide which part of the filter to use first
> they are dropped down to integers which means the items with the worst case value of Long.MAX_VALUE
> become -1 -- effectively making them the best case.  (See org.apache.directory.server.core.partition.impl.btree.ExpressionEnumerator.enumConj).
> Disclaimer: I have not done any performance testing on this.  I just noticed the problem
> while stepping through the code with a debugger.
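
The narrowing described above can be reproduced in isolation. The sketch below is not the
ExpressionEnumerator code: CountNarrowingDemo and its methods are hypothetical names, and
the only thing assumed from the issue is that unknown counts are stored as Long.MAX_VALUE
and later cast to int before being compared.

```java
// Hypothetical demo of the long-to-int narrowing described in the issue.
final class CountNarrowingDemo {
    // A narrowing cast keeps only the low 32 bits of the long.
    static int narrowedCount(long count) {
        return (int) count;
    }

    // Index of the smallest count, compared after narrowing (the reported behaviour).
    static int indexOfSmallestNarrowed(long[] counts) {
        int best = 0;
        for (int i = 1; i < counts.length; i++) {
            if ((int) counts[i] < (int) counts[best]) {
                best = i;
            }
        }
        return best;
    }

    // Index of the smallest count, compared as long (one possible fix).
    static int indexOfSmallest(long[] counts) {
        int best = 0;
        for (int i = 1; i < counts.length; i++) {
            if (counts[i] < counts[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // counts[0]: container scope with one entry; counts[1]: non-indexed (cn=*).
        long[] counts = { 1L, Long.MAX_VALUE };
        System.out.println(narrowedCount(Long.MAX_VALUE));   // prints -1
        System.out.println(indexOfSmallestNarrowed(counts)); // prints 1 (worst case chosen)
        System.out.println(indexOfSmallest(counts));         // prints 0 (scope chosen)
    }
}
```

Keeping the comparison in long preserves Long.MAX_VALUE as the worst case.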

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

