directory-dev mailing list archives

From "Emmanuel Lecharny (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DIRSERVER-2162) Searcing for users using ObjectClass=person takes long
Date Mon, 08 Aug 2016 12:12:20 GMT

    [ https://issues.apache.org/jira/browse/DIRSERVER-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411717#comment-15411717 ]

Emmanuel Lecharny commented on DIRSERVER-2162:
----------------------------------------------

First, let me ask you how many entries you have in your server? Also, how many of them have
a {{person}} objectClass? And which ApacheDS version are you using?

Now, regardless of those values, using a filter like {{(&(cn=username)(ObjectClass=*))}}
is equivalent to using {{(cn=username)}}: all entries have an {{ObjectClass}} attribute, so
the second part of the filter is simply discarded. Using a filter like {{(&(cn=username)(ObjectClass=person))}}
works differently: as it's an {{AND}} filter, we have to evaluate both filters to
know how many entries each one will return. Let's say you have 3 entries that match {{(cn=username)}}
and 10450 that match {{(ObjectClass=person)}}. The next step is to take the smaller set
(i.e., the first) and check all of its entries against the second filter. All in all, we will fetch
3 entries from the backend, and for each of them, we will check that it matches the {{(ObjectClass=person)}}
filter.
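
To make the evaluation order above concrete, here is a minimal Python sketch. It is
not ApacheDS code: the in-memory {{ENTRIES}} table, {{build_index}} and {{search_and}}
are hypothetical names invented for the illustration, and only equality terms are handled.

```python
# Hypothetical in-memory "backend": entry ID -> attributes (not ApacheDS data structures).
ENTRIES = {
    1: {"cn": ["username"], "objectClass": ["top", "person"]},
    2: {"cn": ["username"], "objectClass": ["top", "device"]},
    3: {"cn": ["other"], "objectClass": ["top", "person"]},
    4: {"cn": ["another"], "objectClass": ["top", "person"]},
}


def build_index(entries, attribute):
    """Map each (lower-cased) value of `attribute` to the set of entry IDs holding it."""
    index = {}
    for entry_id, attrs in entries.items():
        for value in attrs.get(attribute, []):
            index.setdefault(value.lower(), set()).add(entry_id)
    return index


def search_and(entries, indexes, terms):
    """Evaluate an AND of (attribute, value) equality terms.

    Returns (matching entry IDs, number of entries fetched from the backend).
    """
    # Step 1: ask each index how many candidates its term would select.
    def candidate_count(term):
        attr, value = term
        return len(indexes[attr].get(value.lower(), ()))

    # Step 2: the term with the fewest candidates drives the search.
    driver_attr, driver_value = min(terms, key=candidate_count)
    candidates = indexes[driver_attr].get(driver_value.lower(), set())

    # Step 3: fetch only those candidates and check them against every term.
    results, fetched = [], 0
    for entry_id in sorted(candidates):
        entry = entries[entry_id]
        fetched += 1
        if all(
            value.lower() in [v.lower() for v in entry.get(attr, [])]
            for attr, value in terms
        ):
            results.append(entry_id)
    return results, fetched


INDEXES = {
    "cn": build_index(ENTRIES, "cn"),
    "objectClass": build_index(ENTRIES, "objectClass"),
}
```

With this toy data, {{search_and(ENTRIES, INDEXES, [("cn", "username"), ("objectClass", "person")])}}
walks the two {{cn}} candidates rather than the three {{person}} candidates.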

That does not tell us much about what happens in your case, because we need to know how many
entries are selected by each of those filter elements.

One more thing: there is a third filter element that is not shown here, because it's added
behind the scenes: the subtree filter. Actually, every search done with a filter will have
an additional filter element added. So, in your case, the real filter will be: {{(&(&(cn=username)(ObjectClass=*))(subtree=ou=users,dc=xxx,dc=fi))}}
(kind of).
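
As a rough illustration of what such a subtree element has to decide, the following
Python sketch checks whether an entry's DN lies under the search base. The helper names
are hypothetical, and the comma split is deliberately naive (it ignores escaped commas
and schema-aware normalization); it says nothing about how ApacheDS implements the scope check.

```python
def split_dn(dn):
    """Naively split a DN into lower-cased RDNs (assumes no escaped commas)."""
    return [rdn.strip().lower() for rdn in dn.split(",") if rdn.strip()]


def in_subtree(entry_dn, base_dn):
    """True when entry_dn is the base itself or a descendant of it."""
    entry, base = split_dn(entry_dn), split_dn(base_dn)
    # A descendant's DN ends with all the RDNs of the base DN.
    return len(entry) >= len(base) and entry[len(entry) - len(base):] == base
```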

You can determine the number of selected entries by enabling DEBUG logging
for the {{DefaultSearchEngine}} class. You will typically get output like:

{noformat}
Nb results : 3 for filter : ([3]&([3]cn=username)([10450]ObjectClass=person))
{noformat}

where the numbers between {{[}} and {{]}} are the numbers of candidates. That will give us some
information about why it's slower when you use {{person}} as a filter.
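
If you end up with many such lines, a small script can pull out the candidate counts.
The Python sketch below assumes the log line has exactly the shape shown above;
{{candidate_counts}} is a hypothetical helper, not part of ApacheDS.

```python
import re

# Sample line copied from the DEBUG output format shown above.
LINE = "Nb results : 3 for filter : ([3]&([3]cn=username)([10450]ObjectClass=person))"

# Each annotated node is "[count]" followed by an operator or an "attr=value" term.
NODE = re.compile(r"\[(\d+)\]([^()\[\]]+)")


def candidate_counts(line):
    """Return (node, candidate count) pairs in the order they appear in the line."""
    return [(node, int(count)) for count, node in NODE.findall(line)]


for node, count in candidate_counts(LINE):
    print(f"{node}: {count}")
```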

Note that for an index like {{ObjectClass}}, where one value may refer to many entries, we use
a sub-index. As {{person}} is obviously used by potentially thousands of entries, we don't
store all of the entry IDs in a simple list associated with the {{person}} key; we use a B-tree
to store an ordered list of them. The rationale is that adding an entry with a {{person}}
value to a list of thousands of entry IDs would take way too long, while adding it to a B-tree
only costs a few updates. Nevertheless, we *know* how many entries are associated with
the {{person}} value, so the evaluation is trivial and costs next to nothing, which means
there is something else at play here.

If it's a bug, it has to be fixed. We will need your input to understand what's going
on.

Thanks !

> Searcing for users using ObjectClass=person takes long
> ------------------------------------------------------
>
>                 Key: DIRSERVER-2162
>                 URL: https://issues.apache.org/jira/browse/DIRSERVER-2162
>             Project: Directory ApacheDS
>          Issue Type: Bug
>    Affects Versions: 2.0.0-M20
>            Reporter: John Peter
>
> When we do the below query the result takes long. Around 10-50 seconds.
> Search base: ou=users,dc=xxx,dc=fi
> Filter (&(cn=*USERNAME*)(objectClass=person))
> Scope: Subtree
> However the below query returns the result immediately
> Search base: ou=users,dc=xxx,dc=fi
> Filter (&(cn=*USERNAME*)(objectClass=*))
> Scope: Subtree
> Looking at the Partition settings it has Indexed attributes ObjectClass and cn.
> First both queries took long. Then we added cn to the index and rebooted apacheDS and the second query got fast.
> It seems like a bug that using ObjectClass in the query makes it slow all tough it is in the index.
> It seems something similar was reported before DIRSERVER-2048, but it says it's fixed in M20 which we are using.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
