directory-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Emmanuel Lécharny <elecha...@gmail.com>
Subject Re: Substring filter : how to improve their use ?
Date Fri, 22 Feb 2013 05:37:02 GMT
Le 2/22/13 5:04 AM, Kiran Ayyagari a écrit :
> On Thu, Feb 21, 2013 at 11:44 PM, Emmanuel Lécharny <elecharny@gmail.com>wrote:
>
>> Hi guys,
>>
>> currently, when we do a search, we try to evaluate the number of
>> candidates each node in the filter expression will return. For instance,
>> such a filter :
>>
>> (&(cn=test)(sn=acme))
>>
>> will evaluate the number of entries with a CN = test and the number of
>> entries with a SN = acme. Let say the former has 10 entries and the last
>> has 5 entries, we will annotate each node with this number :
>>
>>
>> (&:[5](cn=test:[10])(sn=acme:[5]))
>>
>> Here, the & will have the lowest number, and at the end, we will select
>> the node whith the lowest number as a base to fetch the entries.
>>
>> So far, so good (I don't ant to go any further about how we evaluate
>> other operators, like | and !, it's way more complex but irrelevant here).
>>
>> Every filter will also have an additional node added : the scope. The
>> scope allows us to limit the number of entries to fetch, depending on
>> where in the tree we are. That can save some time.
>>
>> In our example, let's say we have 1000 entries under ou=system, 100
>> under ou=users,ou=system and 3 under ou=unitTest,ou=users,ou=system. The
>> filter will be transformed internally to this filter (assuming the scope
>> is SUBTREE) :
>>
>> (&(scope=SUB)(cn=test)(sn=acme))
>>
>> and annoted as :
>>
>>
>> base search = "ou=system" :
>> (&:[5](scope=SUB[1000](cn=test:[10])(sn=acme:[5]))
>>
>> base search = "ou=users,ou=system" :
>> (&:[5](scope=SUB[100](cn=test:[10])(sn=acme:[5]))
>>
>> base search = "ou=unitTest,ou=users,ou=system" :
>> (&:[3](scope=SUB[3](cn=test:[10])(sn=acme:[5]))
>>
>> As we can see, the scope can change the evaluation. In the last case, we
>> will pick the scope node to fetch the entries, as we will only get 3 of
>> them.
>>
>> Ok, let's go a step further. Those filters were pretty simple to handle,
>> as the underlying database know how many entries will be fetched when
>> handling a filter like (cn=test).
>>
>> Substring, <=, >= filters are way different. We can't know many entries
>> will be >= a key, or which match the substring. For instance, how do we
>> know how many entries will match cn=te* ? We can find the first key
>> matching cn=te in the index, but we will have to walk the index up (or
>> down) to find the entries matching cn=tf*. That means we have acess to
>> the ordering method, to build the key which comes after 'te' (ie, 'tf'),
>> something which is quite difficult to do atm...
>>
>> so a filter like (&(cn=te*)(sn=ac*)) with SUBTREE scope will transform to
> something like
>
> (&:[3](scope=SUB[3](cn=te*:[10])(sn=ac*:[10]))
>
> is that correct?

Correct.


-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com 


Mime
View raw message