directory-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From conflue...@apache.org
Subject [CONF] Apache Directory Server v1.5 > SearchEngine and Optimizer
Date Mon, 26 Mar 2012 15:18:00 GMT
<html>
<head>
    <base href="https://cwiki.apache.org/confluence">
            <link rel="stylesheet" href="/confluence/s/2042/9/1/_/styles/combined.css?spaceKey=DIRxSRVx11&amp;forWysiwyg=true"
type="text/css">
    </head>
<body style="background: white;" bgcolor="white" class="email-body">
<div id="pageContent">
<div id="notificationFormat">
<div class="wiki-content">
<div class="email">
    <h2><a href="https://cwiki.apache.org/confluence/display/DIRxSRVx11/SearchEngine+and+Optimizer">SearchEngine
and Optimizer</a></h2>
    <h4>Page <b>edited</b> by             <a href="https://cwiki.apache.org/confluence/display/~elecharny">Emmanuel
L├ęcharny</a>
    </h4>
        <br/>
                         <h4>Changes (1)</h4>
                                 
    
<div id="page-diffs">
                    <table class="diff" cellpadding="0" cellspacing="0">
    
            <tr><td class="diff-snipped" >...<br></td></tr>
            <tr><td class="diff-unchanged" >* the attribute to return for entries
<br> <br></td></tr>
            <tr><td class="diff-deleted-lines" style="color:#999;background-color:#fdd;text-decoration:line-through;">{warning:title=Normalization
of Attributes to Return} <br>Presently the SearchOperationContext contains a JNDI SearchControls
instance to encapsulate some of these parameters.  The attribute identifiers used in the SearchControls
for the attributes to return may or may not be normalized to the attributeType OID.  These
should be normalized to provide either a set of AttributeType OID&#39;s or a set of AttributeType
objects themselves. <br>{warning} <br> <br> <br></td></tr>
            <tr><td class="diff-unchanged" >{section} <br>{column} <br></td></tr>
            <tr><td class="diff-snipped" >...<br></td></tr>
    
            </table>
    </div>                            <h4>Full Content</h4>
                    <div class="notificationGreySide">
        <h1><a name="SearchEngineandOptimizer-Introduction"></a>Introduction</h1>

<p>The SearchEngine conducts search operations on a Partition using this indexed master
table scheme.  It requires the presence of the system indices mentioned to evaluate LDAP filters
on the part of the DIT contained in such a Partition.  SearchEngine is an interface with a
default implementation which would even if we implemented such a partition using some other
BTree implementation besides JDBM.  We could use a custom in memory BTree implementation like
an AvlTree as the basis for Tables for a new kind of Partition, and the DefaultSearch engine
will be able to conduct search operations on it.  We could even use SleepyCat's BDB JE for
another disk based BTree implementation for Tables, to build a new kind of Partition based
on this scheme while using the DefaultSearchEngine to search it.</p>

<p>Optimizer and it's default implementation, the DefaultOptimizer, annotates search
filter AST's to provide optimization hints to the SearchEngine.  Like the SearchEngine the
optimizer implementation can operate on any Partition implementation using this indexed master
table scheme containing the same required system indices.</p>

<p>In this document we define the search algorithm and optimization techniques applied
by the DefaultSearchEngine and the DefaultOptimizer.</p>

<h1><a name="SearchEngineandOptimizer-SearchParameters"></a>Search Parameters</h1>

<p>ApacheDS provides all partitions with parameters to a search operation using a SearchOperationContext
object.  This object contains:</p>

<ul>
	<li>the search scope</li>
	<li>the search base dn</li>
	<li>the alias dereferencing mode</li>
	<li>the search filter as an abstract syntax tree</li>
	<li>the search time and size limits</li>
	<li>the attribute to return for entries</li>
</ul>


<table class="sectionMacro" border="0" cellpadding="5" cellspacing="0" width="100%"><tbody><tr>
<td class="confluenceTd" valign="top">
<p><span class="image-wrap" style=""><img src="/confluence/download/attachments/80564/example_filter.png?version=1&amp;modificationDate=1206209099000"
style="border: 0px solid black" /></span></p>

<p><span class="image-wrap" style=""><img src="/confluence/download/attachments/80564/example_filter_normalized.png?version=1&amp;modificationDate=1206209104000"
style="border: 0px solid black" /></span></p>

<p><span class="image-wrap" style=""><img src="/confluence/download/attachments/80564/example_filter_modified.png?version=1&amp;modificationDate=1206209844000"
style="border: 0px solid black" /></span></p></td>
<td class="confluenceTd" valign="top">
<h2><a name="SearchEngineandOptimizer-SearchFilterAST"></a>Search Filter
AST</h2>

<p>Most of the search parameters are self explanatory with the exception of the search
filter abstract syntax tree.  For each search request ApacheDS produces an AST representation
of the search filter.  The branch nodes of this filter tree represent the boolean filter operators
<b>AND</b> (<b>&amp;</b>), <b>OR</b> (<b>|</b>),
<b>NOT</b> (<b>&#33;</b>).  The leaf nodes of this filter represent
atomic assertions in the filter.  See the example filter and the AST representation for it:</p>

<div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent
panelContent">
<pre>*(&amp; (| (organizationalUnitName=Sales) (OU=BOARD    of directors) ) 
    (! (localityName=sUnnYVale) ) 
    (2.5.4.0=peRSOn) )
</pre>
</div></div>

<p>After generating the AST for this filter, the server will normalize assertion nodes
(the leaf nodes) and pass the resulting AST to the PartitionNexus.  The PartitionNexus forwards
this to the partition containing the search base.  Here's the normalized version of this filter
and it's AST:</p>

<div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent
panelContent">
<pre>*(&amp; (| (2.5.4.11=sales) (2.5.4.11=board of directors) ) 
    (! (2.5.4.7=sunnyvale) ) 
    (2.5.4.0=person) )
</pre>
</div></div>


<h3><a name="SearchEngineandOptimizer-ScopePreparation"></a>Scope Preparation</h3>

<p>Upon receiving the normalized search filter AST, the default search engine performs
some basic scope analysis.  The results of this analysis produce a special tree node called
a ScopeNode.  The scope node is injected into the filter AST rearranging it.  A new <b>AND</b>
node is used as the root.  The old AST is inserted into the new root as a child.  The scope
node is also inserted as a child.  To the left is the resultant modified filter AST using
our example filter.</p>

<p>The ScopeNode embeds an additional scope constraint assertion into the filter.  It
enforces the scope constraint while conducting search operations.  The partition adds this
node now before invoking the optimizer so scope constraints can be considered for optimization.
 The optimizer is aware and knows how to handle this special ScopeNode containing all the
need scope information or this search operation.</p>

<h3><a name="SearchEngineandOptimizer-AliasDereferencing"></a>Alias Dereferencing</h3>

<p>The ScopeNode does not represent the simple scope parameter for one, single and subtree.
 It represents the overall scope of filter applicability: the scope constrained candidate
set a filter should be applied to.  This factors in scope expansion due to alias dereferencing.
 Each alias encountered while dereferencing in any of the modes is added as a unique base
to search.  This represents the complete scope.</p>

<p>As mentioned before in the <a href="/confluence/display/DIRxSRVx11/Structure+and+Organization"
title="Structure and Organization">Structure and Organization</a> page, a set of
aliases are found which refer to entries expanding the scope.  These aliases are dereferenced
to list all the unique roots that must now be searched to correctly evaluate the filter across
this expanded scope.</p></td></tr></tbody></table>

<h1><a name="SearchEngineandOptimizer-OptimizationwithAnnotations"></a>Optimization
with Annotations</h1>

<p>The next step in the process of preparing a filter AST is to annotate it with cost
annotations.  This is the task of optimizer.  The default optimizer uses scan counts on indices
to determine costs.  Each node in the AST inlcuding branch nodes, and the scope node are has
a count attribute associated with it.  The scan count approach used by the default optimizer
approximates a best guess on the worst case scenario.  The count is a guess on how big the
result set could be for entries satisfying the constraints in the filter subtree.</p>

<div class='panelMacro'><table class='noteMacro'><colgroup><col width='24'><col></colgroup><tr><td
valign='top'><img src="/confluence/images/icons/emoticons/warning.gif" width="16" height="16"
align="absmiddle" alt="" border="0"></td><td>The count attribute should be
renamed to cost since different optimizer implementations may use different tactics besides
simple scan counts to optimize search.  The default optimizer just uses index scan counts.
 This may change or be enhanced in the future.</td></tr></table></div>


<p>The optimizer performs a depth first traversal of the AST annotating the children
of branch nodes, before trying to annotate the branch node itself.  When simple leaf nodes
are entered, the optimizer checks to see if an index exists for the assertion attribute of
the leaf node.  If an index is available, it is used to get a quick count of the number of
entries in the partition that would satisfy the asserted condition in the leaf node.  This
figure is set as the count annotation on the leaf node.  A technique exists for quickly looking
up scan counts for assertion operators (&gt;, &lt;, =, ~=, =*, approx, substr, scope).</p>

<div class='panelMacro'><table class='tipMacro'><colgroup><col width='24'><col></colgroup><tr><td
valign='top'><img src="/confluence/images/icons/emoticons/check.gif" width="16" height="16"
align="absmiddle" alt="" border="0"></td><td><b>Substring Assertions
Are Not Cheap</b><br />ApacheDS does not use a specialized index for substring
operator evaluation. Instead it walks an equality/ordering index with regular expression evaluators.
 This is costly since it requires pattern matching on anything but the most simplistic cases.
 Anything that can limit the region of the index scanned is worth using.  For example if ou
is indexed and the following filter is used, (ou=Eng*), then the index can natively limit
our scan to only those entries starting with 'Eng' without having to use any regular expression
evaluations.  If instead the following filter is used, (ou=Eng*Managers), then we still limit
the range natively to those index entries where ou starts with Eng, however a regular expression
must be used to make sure the value ends with 'Managers'.  One would think the worst case
is (ou=*) but this is not the substring operator, it is the existence operator which can quickly
find the scan count which is just the size of the index: no regular expression is needed.
 The worse of the worst cases is a substring search expression without an initial anchor (no
init component) like (ou=*Managers).  Such a substring filter assertion will result in a full
index scan while using a regular expression on each index entry to assert whether or not the
ou value ends in 'Managers'.  The optimizer avoid this and takes an educated guess, otherwise
it would spend too much time trying to optimize making it's function worthless.</td></tr></table></div>

<p>Once all the children of a branch node have been annotated, the branch node must
be annotated using the annotation results from it's children.  </p>

<p><b>AND</b> nodes are annotated with the lowest scan count of it's children.
 This is the absolute worst case, or maximum result set size, that could be accepted by the
logic in this <b>AND</b> branch of the filter.  This makes sense since logical
<b>AND</b> operations compute the intersection.  The most that could intersect
is the smallest set.  </p>

<p><b>OR</b> nodes are annotated with the sum of the it's children's scan
counts.  This is the absolute worst case, or maximum result set size.  Again this makes sense
since the logical <b>OR</b> operation computes the union of the sets.  The most
that could be returned is the sum of all possible entries that could be returned by <b>OR</b>
node children.</p>

<p><b>NOT</b> nodes contain a single child.  With the present indexing scheme
the best we can do is subtract the scan count of the child, from the size of the complete
set of entries.  This works as the absolute worst case and makes sense since logical <b>NOT</b>
operations invert the selection.</p>

<div class='panelMacro'><table class='noteMacro'><colgroup><col width='24'><col></colgroup><tr><td
valign='top'><img src="/confluence/images/icons/emoticons/warning.gif" width="16" height="16"
align="absmiddle" alt="" border="0"></td><td>Perhaps not here but we need to
make a note about how filters with NOT operations could potentially result in full index scans
which are extremely costly in a very large directory (VLD).  Some possible tactics may remove
these problems at a cost.  For example we thought about adding an inverse LUT to indices just
for this purpose but that was not feasible: how do you generate all the combinations for what
values an entry attribute does not have?</td></tr></table></div>

<p>This process is continued until the root of the AST is reached.  At this point we
have a rough estimate of the worst case for the entire filter. </p>

<h1><a name="SearchEngineandOptimizer-SearchAlgorithm%2CandCursorConstruction"></a>Search
Algorithm, and Cursor Construction</h1>

<p>The call to search() on the search engine does <b>not</b> synchronously
search the partition to construct a result set to return.  Doing so would require potentially
massive buffers to store the results and increase response latency.  An efficient and fast
server must minimize the memory footprint of each active search request so it can process
as many concurrent requests as possible.</p>

<p>Instead of computing the results, the search engine constructs a Cursor that can
produce the resultant entries matching the filter and scope constraints.  This Cursor may
be a composite cursor potentially wrapping other Cursors depending on the filter.  Once the
Cursor is constructed and returned to be used by higher levels, it can traverse the results
one by one without having to buffer the entire result set matching the filter.  ApacheDS'
LDAP front end can then advance forward or backwards, or even to the first or last result
using this Cursor.  Usually the front end will pull a candidate entry and send it out the
door before getting the next entry to prevent having to buffer more than one entry at a time.
 Calls on the Cursor interface to position it leverage partition indices and filter assertion
logic to compute the next or previous candidate.  The Cursor is a critical pattern in the
efficient operation of the search algorithm.</p>

<h2><a name="SearchEngineandOptimizer-AssertionEvaluatorsandCandidateCursors"></a>Assertion
Evaluators and Candidate Cursors</h2>


<p>There are many ways a Cursor can be built to traverse candidates that satisfying
a filter.  Some ways are less efficient than others, in that they will result in more computation
and require more IO to evaluate the same filter with each positioning and advance.  The search
engine uses the scan count annotations on nodes to govern how it builds an optimal Cursor.
 It uses a recursive Cursor building visitor on the nodes of the AST.  The visitor takes actions
to create different Evaluators and Cursors based on the node type.  Cursors generate candidates
as IndexEntry objects that satisfying filter expressions.  Evaluators evaluate expressions
on candidates generated from Cursors returning IndexEntry objects.  </p>

<p>At branch nodes, the visitor makes a decision how to build candidate generating Cursors
and Evaluators.</p>

<p>At <b>NOT</b> branch nodes, the visitor creates a Cursor on the NDN index.
 Since <b>NOT</b> branch node types only possess a single child, that child is
used to create a single Evaluator.  The Evaluator and Cursor pair are used to build a NotCursor.
 On advances and positioning attempts, the NotCursor uses the NDN index Cursor to generate
candidates, then checks to make sure the Evaluator returns false for that candidate.  NotCursors
incur a full table scan on the NDN index since they must make sure the child filter does not
match a candidate to accept the candidate.  There is no way to build a Cursor on the child
expression and figure out the negation.</p>

<p>At <b>AND</b> branch nodes, the visitor selects a child with the smallest
scan count and uses that child AST node to build a candidate generating Cursor.  The siblings
of the selected child are used to build a list of Evaluators.  The Cursor and list of Evaluators
are used to construct an AndCursor.  On advances and positioning operations the Cursor finds
candidates that satisfy the <b>AND</b> node expression by using the child Cursor
to generate candidates while using the list of Evaluators to verify a match.  All sibling
Evaluators must match the candidate for the AndCursor to consider a candidate as having matched
the <b>AND</b> node's filter expression.</p>

<p>At <b>OR</b> branch nodes, the visitor creates a list of Cursors for
each child node.  These Cursors are combined to build an OrCursor.  The OrCursor walks these
child Cursors one at a time.  Since an <b>OR</b> operation takes the union of
all candidates, each child Cursor's candidate is accepted by the OrCursor.  </p>

<p>Both Enumerator building and Cursor building operations are recursive.  Both stop
recursing when they reach leaf nodes which either produce a simple leaf expression Evaluator
or a Cursor off of a partition index associated with the leaf node's assertion attribute.
</p>

<h2><a name="SearchEngineandOptimizer-CommentsonSearch"></a>Comments on
Search</h2>

<p>Annotations only help optimize search when search expressions contain <b>AND</b>
expressions.  As noted above the scope node addition automatically makes every AST fed into
the optimizer a conjunction.  Plus the scope node can efficiently produce scan count information
thanks to the onelevel and sublevel indices.  So whatever the filter, leveraging the optimizer
will help.</p>

<p>We mentioned that some Cursors may not be optimal because they could generate more
IO and require more calculations to determine candidacy on Cursor positioning and advances.
To understand this let's consider the Cursors we could build on an example filter expression:</p>

<div class="preformatted panel" style="border-width: 1px;"><div class="preformattedContent
panelContent">
<pre>(&amp; (ou=engineering) (l=Sunnyvale))
</pre>
</div></div>

<p>For simplicity we will not consider scope in this example at first.  Also for now
consider there to be an index on the ou and l attributes.  According to this filter expression
we can build two kinds of Cursors.  </p>

<p>The first Cursor would generate candidates from the ou index using a Cursor for ou=engineering
leaf node.  Each candidate returned represents an entry in the partition who's ou attribute
equals the value 'engineering'.  The Cursor would use the localityName (l) index to perform
lookups for each candidate returned from the ou=engineering Cursor.  Lookups will check to
make sure an IndexEntry exists with id equal to the candidate and value equal to 'sunnyvale'.
</p>

<p>The second Cursor would generate candidates from the localityName index for l=sunnyvale
leaf node.  Each candidate returned represents an entry in the partition who's localityName
equals the value 'sunnyvale'.  The Cursor would use the ou index to perform lookups for each
candidate returned from the l=sunnyvale Cursor.  Lookups will check to make sure an IndexEntry
exists with id equal to the candidate and a value equal to 'engineering'</p>

<p>If there are 100 people in the engineering department yet 1000 people in the company
are located in Sunnyvale then the first Cursor will perform 100 iterations as it walks the
ou index over Tuples with a key of 'engineering'.  For each candidate (of which there are
100), this Cursor will perform a single lookup on the localityName index.  To satisfy this
search a total of 100 Cursor advances occur with 100 lookup operations.  In the case of the
second Cursor, 1000 iterations are performed as it walks the localityName index over Tuples
with a key of 'sunnyvale'.  For each candidate (of which there are 1000), this second Cursor
will perform a single lookup on the ou index.  To satisfy this search a total of 1000 Cursor
advances occur with 1000 lookup operations.</p>

<p>Choosing to build the second less efficient Cursor rather than the first Cursor costs
10 times as much in IO and processing.  This is why the use of scan counts set by the optimizer
are used to select the child with the smallest scan count in <b>AND</b> nodes.
 And as stated before, the addition of the scope node, always includes an <b>AND</b>
node within the AST.  Hence the optimizer should pay off in most search operations.  This
is why it is enabled by default in the JdbmPartition.</p>


<h1><a name="SearchEngineandOptimizer-CompositeIndices"></a>Composite Indices
</h1>

<p>Talk about future work regarding indices on common search filters.  These indices
represent filter matches.</p>
    </div>
        <div id="commentsSection" class="wiki-content pageSection">
        <div style="float: right;">
            <a href="https://cwiki.apache.org/confluence/users/viewnotifications.action"
class="grey">Change Notification Preferences</a>
        </div>
        <a href="https://cwiki.apache.org/confluence/display/DIRxSRVx11/SearchEngine+and+Optimizer">View
Online</a>
        |
        <a href="https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=80564&revisedVersion=17&originalVersion=16">View
Changes</a>
            </div>
</div>
</div>
</div>
</div>
</body>
</html>

Mime
View raw message