Mailing-List: contact dev-help@directory.apache.org; run by ezmlm
Precedence: bulk
Reply-To: "Apache Directory Developers List" <dev@directory.apache.org>
Received-SPF: pass (nike.apache.org: domain of akarasulu@gmail.com designates
 74.125.82.44 as permitted sender)
MIME-Version: 1.0
Sender: akarasulu@gmail.com
In-Reply-To: <4F6B5F4B.5000209@gmail.com>
References: <4F6B5F4B.5000209@gmail.com>
Date: Mon, 26 Mar 2012 15:32:02 +0300
Message-ID: 
 <CADwPi+GpE4EihOhV1Xo4CK4hT5s0ysaYoiU8FDdU_9o27NZkYw@mail.gmail.com>
Subject: Re: [index] reverse index usage for user attributes
From: Alex Karasulu <akarasulu@apache.org>
To: Apache Directory Developers List <dev@directory.apache.org>
Content-Type: multipart/alternative; boundary=0016e6de00d5b8009904bc248f02

--0016e6de00d5b8009904bc248f02
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

On Thu, Mar 22, 2012 at 7:20 PM, Emmanuel Lécharny <elecharny@gmail.com> wrote:

> Hi,
>
> Currently, we create a forward and a reverse index for each indexed
> attribute. The forward index is used when we annotate a filter, to find
> the number of candidates it selects. The reverse index usage is a bit
> more subtle.
>
>
[I hope you don't mind that I restate the process below in more concise
English. I'm not trying to slight your description; apologies if it sounds
that way.]

To state this more concisely, the reverse index facilitates rapid Evaluator
evaluations of the non-candidate-generating assertions in the filter AST. As
you know, one assertion in the filter (scope assertions included) is
selected as the candidate-generating assertion, which uses a Cursor to
generate candidates matching its logic. The other filter assertions in the
AST are used to evaluate whether or not the generated candidates match.

NOTE: the optimizer annotates the AST's nodes with scan counts, and these
drive selection of the proper candidate-generating assertion in the AST.
This is done using a DFS through the AST to find the leaf node (an
assertion) with the lowest scan count. This reduces the search set.
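
A rough sketch of that selection step, in Java. The class and field names
here are made up for illustration and are not the ApacheDS classes, and the
sketch assumes a purely conjunctive (AND) filter, where any single leaf can
safely drive candidate generation:

```java
import java.util.List;

final class CandidateSelector {
    // Hypothetical, simplified filter AST node; not the ApacheDS node classes.
    static final class Node {
        final String label;         // e.g. "(sn=test)", or "&" for a branch
        final long scanCount;       // annotation written by the optimizer
        final List<Node> children;  // empty for a leaf (an assertion)

        Node(String label, long scanCount, List<Node> children) {
            this.label = label;
            this.scanCount = scanCount;
            this.children = children;
        }

        static Node leaf(String label, long scanCount) {
            return new Node(label, scanCount, List.of());
        }

        static Node and(Node... children) {
            return new Node("&", 0L, List.of(children));
        }
    }

    // Depth-first walk returning the leaf with the lowest scan count.
    static Node lowestScanLeaf(Node node) {
        if (node.children.isEmpty()) {
            return node;
        }
        Node best = null;
        for (Node child : node.children) {
            Node candidate = lowestScanLeaf(child);
            if (best == null || candidate.scanCount < best.scanCount) {
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Node filter = Node.and(
            Node.leaf("(cn=jo*)", 100_000L), // substring: total DIB count
            Node.leaf("(sn=test)", 12L),
            Node.leaf("scope", 500L));
        System.out.println(lowestScanLeaf(filter).label); // prints (sn=test)
    }
}
```

The scan counts in main() are invented numbers; only their relative order
matters for the selection.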

> Let's consider a filter like (cn=jo*). As we don't have any clue about the
> value (it will start with 'jo', but that's all we know), we can't use the
> 'cn' forward index.


Yep, the scan count annotation used for the cn=jo* assertion will be the
total count of entries in the DIB, since the BTree cannot give us a count
figure. So, depending on what scope is used, it will most likely be the
scope assertion that drives candidate generation, since it will most likely
have a smaller scan count.


> We can't use the 'cn' reverse index either, at least directly. So the
> engine will use the scope to determine a list of candidates: {E1, E2, ...
> En}. Now, instead of fetching all those entries from the master table, what
> we can do is use the 'cn' reverse table, as we have a list of entry IDs.
>
> For each entry ID in {E1, E2, ... En}, we will check if the 'cn' reverse
> table contains some value starting with 'jo'.
>
>
Yep, for each candidate E1, E2, ... En generated from the scope assertion,
its ID is used to look into the reverse BTree and check whether a value for
that candidate matches jo*.


> If the 'cn' index contains the following two tables:
> forward =
>  john --> {E1, E3}
>  joe  --> {E2, E3, E5}
>  ...
>
> reverse =
>  E1 --> john
>  E2 --> joe
>  E3 --> joe, john
>  E5 --> joe
>  ...
>
>
Yep.


> then using the reverse table, we will find that the E1, E2, E3 and E5
> entries match, whereas all the others don't. No need to fetch the E4, ...
> En entries from the master table.
>
> Now, exploiting this reverse table means we read a btree, which has the
> same cost as reading the Master Table (except that we won't have to
> deserialize the entry).
>
>
IO time agreed.
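
For illustration, the reverse-table check on the example tables above could
look roughly like this in Java. A plain in-memory Map stands in for the
reverse BTree here, and the class and method names are invented for the
sketch:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ReverseIndexDemo {
    // Reverse table from the example: entry ID -> values of 'cn'.
    static Map<String, Set<String>> exampleReverse() {
        Map<String, Set<String>> reverse = new HashMap<>();
        reverse.put("E1", Set.of("john"));
        reverse.put("E2", Set.of("joe"));
        reverse.put("E3", Set.of("joe", "john"));
        reverse.put("E5", Set.of("joe"));
        return reverse;
    }

    // Keep the candidates having at least one value that starts with the
    // prefix. No entry is fetched from the master table or deserialized.
    static List<String> matchByPrefix(List<String> candidates,
                                      Map<String, Set<String>> reverse,
                                      String prefix) {
        List<String> matches = new ArrayList<>();
        for (String id : candidates) {
            for (String value : reverse.getOrDefault(id, Set.of())) {
                if (value.startsWith(prefix)) {
                    matches.add(id);
                    break;
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Candidates as produced by the scope assertion.
        List<String> scope = List.of("E1", "E2", "E3", "E4", "E5", "E6");
        // prints [E1, E2, E3, E5]
        System.out.println(matchByPrefix(scope, exampleReverse(), "jo"));
    }
}
```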


> What if we don't have a reverse table?
>
> We will have to deserialize the entries, all of them from the {E1, E2, ...
> En} set filtered by the scope.
>
> Is it better, then, to build a reverse index?


The only issue with this is that it will churn the entry cache. A settled
cache has value because it reflects the kinds of queries that generally
occur: optimally, a cache should contain the entries most often looked up.

A large master table scan will wipe away a settled cache's knowledge. Using
a reverse index instead has more value in this case. It's a delicate
balance.
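
To make the churn concrete, here is a small Java sketch. It assumes a plain
LRU eviction policy, built on java.util.LinkedHashMap in access order; that
policy is my assumption for the illustration and not a statement about how
the entry cache is actually implemented:

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CacheChurnDemo {
    // Minimal LRU cache: an access-ordered LinkedHashMap that evicts the
    // least recently used entry once the capacity is exceeded.
    static final class LruCache<K, V> extends LinkedHashMap<K, V> {
        private final int capacity;

        LruCache(int capacity) {
            super(16, 0.75f, true); // true = access order
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > capacity;
        }
    }

    // Fill the cache with "hot" entries, then run a scan through it.
    static LruCache<String, String> settledThenScanned(int capacity, int scanSize) {
        LruCache<String, String> cache = new LruCache<>(capacity);
        for (int i = 1; i <= capacity; i++) {
            cache.put("hot" + i, "entry");
        }
        for (int i = 1; i <= scanSize; i++) {
            cache.put("E" + i, "entry"); // each scanned entry passes through
        }
        return cache;
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = settledThenScanned(3, 10);
        System.out.println(cache.containsKey("hot1")); // prints false
        System.out.println(cache.keySet());            // prints [E8, E9, E10]
    }
}
```

After the scan, none of the previously hot entries are left, which is the
effect a reverse index lookup avoids.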


> Not necessarily. In the case where we have more than one node in the
> filter, for instance (&(cn=jo*)(sn=test)(objectClass=person)), then
> using the reverse index means we will access as many btrees as we have
> indexed attributes in the filter node. Here, if cn, sn and objectClass are
> indexed, we will potentially access 4 btrees (the scope will force us to
> consider using the oneLevel index or the subLevel index).
>
> In the end, it might be more costly, compared to using the entry and
> matching it against the nodes in the filter.
>
>
Interesting point! The scan counts might help us out on a better
optimization for these kinds of cases.

If the search set is constrained (below some configurable threshold, e.g.
10-50 entries) and if the filter uses many indices (above some threshold,
e.g. 3-4 indices), then it might be better to just pull from the master
table directly without leveraging indices.
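
As a sketch only, the heuristic could be expressed like this in Java. The
names and the way the thresholds are passed in are hypothetical; nothing
like this exists in the code base as far as this thread goes:

```java
final class EvaluationStrategy {
    enum Plan { REVERSE_INDEXES, MASTER_TABLE }

    // candidateCount:    scan count of the candidate-generating assertion
    // indexedAssertions: number of indexed assertions left to evaluate
    // maxCandidates:     configurable upper bound for a "small" search set
    // minIndexes:        configurable lower bound for "many" indices
    static Plan choose(long candidateCount, int indexedAssertions,
                       long maxCandidates, int minIndexes) {
        boolean smallSet = candidateCount <= maxCandidates;
        boolean manyIndexes = indexedAssertions >= minIndexes;
        // Only a small search set is worth pulling from the master table:
        // a large one would churn the entry cache.
        return (smallSet && manyIndexes) ? Plan.MASTER_TABLE
                                         : Plan.REVERSE_INDEXES;
    }

    public static void main(String[] args) {
        System.out.println(choose(30, 3, 50, 3));      // prints MASTER_TABLE
        System.out.println(choose(100_000, 3, 50, 3)); // prints REVERSE_INDEXES
        System.out.println(choose(30, 1, 50, 3));      // prints REVERSE_INDEXES
    }
}
```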

This will optimize for speed without blowing out the cache memory. What do
you think?


> When we have many entries already in the cache, thus sparing the cost of a
> deserialization, then accessing more than one BTree might be costly
> compared to using the entries themselves.
>
>
Agreed, but again let me stress protecting the cache memory from a large
master table scan. I think we can take this strategy for the cases
mentioned above.


> An alternative
> --------------
>
> The problem is that the node in the filter uses a substring: jo*. Our
> indexes are built using the full normalized value. That does not fit.
>
>
+1


> What if, instead of browsing the underlying tree, we use a specific
> compare() method instead of an equals() method? As the tree is ordered,
> finding the subset of entries that will be valid candidates is just a
> matter of finding the part of the tree which matches the comparison.
>
> In our example, the compare() method will compare the first characters of
> the keys to the filter value. Here, the compare method will be
> String.startsWith(), and if it's not true, we will do a compareTo() to know
> if we should go down the tree on the left or on the right of the current
> key.
>
>
You know, I think we might be doing this. I remember lining up a Cursor on
these boundaries for the sake of evaluation, but this code might NOT be
executed because the scan counts are using the total DIB count for substring
expressions.
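
To illustrate the idea of positioning on the boundary and walking the
ordered keys, here is a Java sketch that uses java.util.TreeMap as a
stand-in for the forward BTree. It is not the actual Cursor code, and the
class name is invented:

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class PrefixScan {
    // Collect the entry IDs of all keys starting with the prefix. In an
    // ordered tree these keys are contiguous, so we position on the first
    // key >= prefix and stop at the first key that no longer matches.
    static Set<String> candidates(NavigableMap<String, Set<String>> forward,
                                  String prefix) {
        Set<String> ids = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e
                : forward.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // left the prefix range
            }
            ids.addAll(e.getValue());
        }
        return ids;
    }

    public static void main(String[] args) {
        NavigableMap<String, Set<String>> forward = new TreeMap<>();
        forward.put("alice", Set.of("E4"));
        forward.put("joe", Set.of("E2", "E3", "E5"));
        forward.put("john", Set.of("E1", "E3"));
        forward.put("mary", Set.of("E6"));
        System.out.println(candidates(forward, "jo")); // prints [E1, E2, E3, E5]
    }
}
```

Only the forward index is touched here, and only the keys inside the
prefix range are visited.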


> For that, we just need the forward index, the reverse index is useless.
>
>
I think this works when you're generating candidates for an assertion
using the substring operator. The reverse lookups will still be needed on
the generated candidates, and doing away with them is not a wise idea for
large search domains, since that is guaranteed to wipe away the cache memory.


> If we use a substring with a '*' at the beginning, then we need another
> index: a reverted index. That means all the values will be stored reverted:
> john will be stored as 'nhoj', so doing a search on (cn=*hn) will be fast.
>
>
Yes, we thought about doing this to handle substrings that have ONLY an
'end' component to them, but we decided not to deal with it. This is perhaps
the worst of the most inefficient constructs, apart from the NOT operator.
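
For reference, a reverted index along those lines could be sketched as
follows in Java. Again a TreeMap stands in for the BTree, and the class is
purely illustrative since we never implemented this:

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class RevertedIndex {
    // Keys are the attribute values stored back to front: john -> nhoj.
    private final NavigableMap<String, Set<String>> reverted = new TreeMap<>();

    private static String revert(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    void add(String value, String entryId) {
        reverted.computeIfAbsent(revert(value), k -> new TreeSet<>())
                .add(entryId);
    }

    // A suffix search such as (cn=*hn) becomes a prefix scan on "nh".
    Set<String> endingWith(String suffix) {
        String prefix = revert(suffix);
        Set<String> ids = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e
                : reverted.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // left the prefix range
            }
            ids.addAll(e.getValue());
        }
        return ids;
    }

    public static void main(String[] args) {
        RevertedIndex cn = new RevertedIndex();
        cn.add("john", "E1");
        cn.add("john", "E3");
        cn.add("joe", "E2");
        System.out.println(cn.endingWith("hn")); // prints [E1, E3]
    }
}
```

The cost is a second copy of every indexed value, which is part of why a
suffix-only index is hard to justify.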

I think removing the reverse index (which I would love to do) is something
that can be experimented with, but we don't have a diverse test set to show
the true impact across different scenarios. And until we have a way to
understand the impact across different kinds of DIBs, I'm afraid our
reasoning for one situation versus another is not good enough to warrant
the change. This is not a simple matter, and we should not come to a
decision on it with even an intelligent guess.

-- 
Best Regards,
-- Alex

--0016e6de00d5b8009904bc248f02--