Date: Thu, 23 Jun 2011 10:19:00 -0400 (EDT)
Subject: [CONF] Apache Directory Server v1.5 > Index and IndexEntry

Index and IndexEntry

Page edited by Emmanuel Lécharny



Full Content

Index Abstraction

An Index is a bidirectional LUT (Look-Up Table) which sorts keys (forward index) and values (reverse index). It can be built on any physical attribute or a computed characteristic of an entry.

Index Tuples contain a value for the attribute and the entry identifier which represents a pointer into the MasterTable. This way, lookups into the master table for entries with a specific attribute value are fast. For example, when conducting a search for all entries where the foo attribute is equal to the String 'bar', the search engine looks first to see if an index is available for foo. If an Index on foo exists, then it is used to quickly list the identifiers of all entries with attribute foo equal to 'bar'. This lookup occurs in almost constant time, and the Index can be walked to only return those identifiers over the Tuples with key set to 'bar'. The following schema exposes the relations between all those elements:

Without an Index on foo, the following sequence of events must occur on each entry in the master table:

  1. the entry must be read from the MasterTable as a byte[]
  2. the entry must then be deserialized
  3. the foo attribute must be extracted
  4. each value of the foo attribute must be compared to see if it equals 'bar'

As you can see, this is an expensive proposition. With very large directories, a search may take a very long time or may return without the intended entries because time limits are reached. Indices are critical for increasing search performance.
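To make the cost difference concrete, here is a minimal, hypothetical Java sketch. The class and map names are invented for illustration and are not the ApacheDS API; a sorted map stands in for the forward index, so finding all entries where foo = 'bar' is a single lookup instead of reading, deserializing, and comparing every entry.

```java
import java.util.*;

public class ForwardIndexSketch {
    // Hypothetical in-memory stand-in for the MasterTable: entry id -> attributes.
    static final Map<Long, Map<String, String>> MASTER = Map.of(
            1L, Map.of("foo", "bar", "cn", "alice"),
            2L, Map.of("foo", "baz", "cn", "bob"),
            3L, Map.of("foo", "bar", "cn", "carol"));

    // Forward index on 'foo': attribute value -> sorted set of entry ids.
    static NavigableMap<String, NavigableSet<Long>> buildFooIndex() {
        NavigableMap<String, NavigableSet<Long>> forward = new TreeMap<>();
        MASTER.forEach((id, entry) ->
                forward.computeIfAbsent(entry.get("foo"), k -> new TreeSet<>()).add(id));
        return forward;
    }

    // With the index, one O(log n) lookup replaces the four-step full scan
    // described above.
    static NavigableSet<Long> lookup(String value) {
        return buildFooIndex().getOrDefault(value, new TreeSet<>());
    }

    public static void main(String[] args) {
        System.out.println(lookup("bar")); // ids of entries where foo = 'bar'
    }
}
```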

Even if we have all the entries in the cache, this is still an O(n) operation. We just spare the disk access and the deserialization phase.
LDAP amortizes the expense of building indices at add, modify and delete times in return for faster search performance. For this reason LDAP servers are highly indexed and are better suited for read intensive applications.

Index Forward and Reverse tables

At the moment, an Index is implemented using two tables:

  • The Forward table, which maps the key to a list of entry IDs in the master table
  • The Reverse table, which maps the entry's ID to the keys

The forward table is used to retrieve the entry from the MasterTable when we only have a key. The key can be associated with more than one entry, so we will get back a cursor on the associated entry IDs when we do a lookup.

The reverse index is used when we do a search on a combination of attributes: (&(cn=test)(sn=acme)). If we suppose that the selected primary index will be the CN index, then we will pull a list of entry IDs. Now, using the SN reverse index, we can avoid pulling each entry from the MasterTable, simply because we can check in the SN reverse table whether the selected IDs are present or not.

This might be superfluous: it's enough to pull the SN entry IDs for the acme value and do an intersection with the first set of entry IDs. The biggest advantage is that we would speed up modification operations if we were to remove those reverse tables.
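Both evaluation strategies can be sketched as follows. This is a simplified, hypothetical model (plain in-memory maps with invented names, not the actual ApacheDS tables): the first method checks CN candidates against the SN reverse table, the second intersects the two forward indexes directly, as suggested above.

```java
import java.util.*;

public class AndFilterSketch {
    // Hypothetical forward index on cn, and forward/reverse indexes on sn.
    static final Map<String, Set<Long>> CN_FORWARD = Map.of("test", Set.of(1L, 4L, 7L));
    static final Map<String, Set<Long>> SN_FORWARD = Map.of("acme", Set.of(1L, 7L, 9L));
    static final Map<Long, Set<String>> SN_REVERSE = Map.of(
            1L, Set.of("acme"),
            4L, Set.of("other"),
            7L, Set.of("acme", "other"),
            9L, Set.of("acme"));

    // Evaluate (&(cn=test)(sn=acme)): take candidates from the cn forward
    // index, then use the sn reverse table to accept or reject each id
    // without fetching the entry from the MasterTable.
    static SortedSet<Long> searchWithReverse(String cn, String sn) {
        SortedSet<Long> result = new TreeSet<>();
        for (Long id : CN_FORWARD.getOrDefault(cn, Set.of())) {
            if (SN_REVERSE.getOrDefault(id, Set.of()).contains(sn)) {
                result.add(id);
            }
        }
        return result;
    }

    // Alternative without a reverse table: intersect the two forward indexes.
    static SortedSet<Long> searchByIntersection(String cn, String sn) {
        SortedSet<Long> result = new TreeSet<>(CN_FORWARD.getOrDefault(cn, Set.of()));
        result.retainAll(SN_FORWARD.getOrDefault(sn, Set.of()));
        return result;
    }

    public static void main(String[] args) {
        System.out.println(searchWithReverse("test", "acme"));   // [1, 7]
        System.out.println(searchByIntersection("test", "acme")); // [1, 7]
    }
}
```

Both approaches return the same result set; the intersection variant is what would remain if the reverse tables were dropped to speed up modifications.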

Index Cursor's Get IndexEntry Objects

Index Cursors, unlike Table Cursors that return Tuples, return IndexEntry objects on get() operations after positioning the Cursor. Indices always use a Long identifier into the master table for forward LUT values and reverse LUT keys. So instead of returning Cursors over two kinds of Tuples over these LUTs, their Cursors return IndexEntry objects which swap Tuple key/value pairs where appropriate.

For example, let's presume the foo index contains String values for the foo attributeType. The forward LUT will map Strings representing foo attribute values like 'bar' to Long identifiers pointing to entries in the MasterTable. The reverse LUT will map Long identifiers pointing to entries in the MasterTable to Strings representing values of the foo attribute in that entry. The Tuples of these LUTs look like so:

  • Forward LUT => Tuple<String,Long>
  • Reverse LUT => Tuple<Long,String>

Index Cursors walk these LUTs using Table Cursors that return Tuples. Regardless of whether the forward LUT or the reverse LUT is used, Tuples are transduced into IndexEntry objects which always return the Long identifier into the master table on calls to getId() and the attribute value on calls to getValue() (really a generic type; String is used here for the foo attribute example). This way, Index Cursors do not care if the underlying Tuple Cursor walks the forward LUT or the reverse LUT.
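The transduction can be sketched like this; the Tuple and IndexEntry records below are minimal stand-ins for the real ApacheDS types, with invented shapes used only to show the key/value swap.

```java
public class IndexEntrySketch {
    // Minimal stand-ins for the Tuple and IndexEntry types discussed above.
    record Tuple<K, V>(K key, V value) {}
    record IndexEntry(long id, String value) {}

    // A forward-LUT tuple is <value, id>; a reverse-LUT tuple is <id, value>.
    // Both are transduced into the same IndexEntry shape, so callers never
    // need to know which LUT the underlying cursor walked.
    static IndexEntry fromForward(Tuple<String, Long> t) {
        return new IndexEntry(t.value(), t.key());
    }

    static IndexEntry fromReverse(Tuple<Long, String> t) {
        return new IndexEntry(t.key(), t.value());
    }

    public static void main(String[] args) {
        IndexEntry a = fromForward(new Tuple<>("bar", 3L));
        IndexEntry b = fromReverse(new Tuple<>(3L, "bar"));
        System.out.println(a.equals(b)); // same logical entry either way
    }
}
```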

Example Directory Information Tree (DIT)

We'll use this example DIT for the purpose of our discussions around indices.

Index Normalization and Matching

An Index can be built on any attribute. For example, we can build indices on the ou, objectClass, and cn attributes in our example. Here's what the forward lookup tables of these indices would contain:

Here we have an example tree where the nodes are labeled with their user provided relative distinguished names at the point of creation. The numbers in the nodes represent the order of creation and the unique identifiers of these entries, representing the Tuple keys in the master table. We'll use this example and modified versions of it throughout this document. So let's take a look at how the MasterTable would look, but presume the distinguished names to the right represent the full serialized entry as an array of bytes.

Remember, an Index is bidirectional! This means we can look up all the entry identifiers representing the set of entries with value 'bar' for the attribute foo using the forward LUT in the Index. We can also use the reverse LUT to look up an id, to see all the values of attribute foo an entry may contain.

  OID        Alias
  2.5.6.0    top
  2.5.6.4    organization
  2.5.6.5    organizationalUnit
  2.5.6.6    person
  2.5.6.7    organizationalPerson

The index keys contain attribute values, and the values of the index contain entry identifiers which refer back into the master table. Notice the values of attributes in these indices have been transformed. This transformation is referred to as normalization or canonicalization. This means the value has been reduced to a form where variations, for example in white space, case, or aliases, have been removed. Other forms of normalization besides this textual normalization may exist. All variations of an attribute's value considered to be equal are reduced down to this common unambiguous form.

The transformation is governed by the equality matchingRule associated with the attributeType. You'll notice, for example, that the ou and cn attribute matchingRules result in a transformation that removes whitespace variance while preserving tokenization: meaning we take out whitespace by reducing multiple whitespace characters into a single space. Case variance is also eliminated: all values are reduced to lowercase. Note this is possible because both ou and cn are case insensitive, since they use a matchingRule which ignores case: their equality matchingRule is caseIgnoreMatch.
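A caseIgnoreMatch-style transformation can be sketched in a few lines. This is a simplified illustration, not the ApacheDS Normalizer (the real one also handles Unicode case folding and other subtleties): collapse whitespace runs, trim, and lowercase.

```java
public class CaseIgnoreNormalizerSketch {
    // Simplified sketch of caseIgnoreMatch normalization: trim the value,
    // collapse runs of whitespace into a single space, and lowercase it,
    // so every equal-under-the-matchingRule variant maps to one index key.
    static String normalize(String value) {
        return value.trim().replaceAll("\\s+", " ").toLowerCase();
    }

    public static void main(String[] args) {
        // Both variants reduce to the same canonical index key.
        System.out.println(normalize("  John \t DOE "));
        System.out.println(normalize("john doe"));
    }
}
```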

Attribute matchingRules may not all be this trivial. For example, the attribute may have a complex binary syntax, and there might be some intricate operation required to reduce the value into a canonical representation. At other times normalization may involve transforming ambiguous representations into a unique identifier that can never have variance. The objectClass Index is a good example of this. The objectClass attribute lists the objectClasses associated with an entry. Unfortunately LDAP allows alias names for objectClasses as well as for other entities like attributeTypes (i.e. cn versus commonName). These aliases are human readable representations, but they can have case variance as well, and any alias can be used. Most schema entities in LDAP have an Object IDentifier, or OID. OIDs are by default canonical. As you can see, this is the canonical representation used in the objectClass index. Incidentally, users are even allowed to plug in OIDs for the value of the objectClass attribute if they want to.

Notice that Index Tuple keys are all sorted, and so are the Tuple values of the same key. Indices use matchingRules to order both keys and values in both the forward and reverse LUTs.

If you look at the objectClass index you'll see that the OID (2.5.6.0) for the 'top' objectClass is not included in the index. This is because every entry will contain 'top', so there's no point in bloating the index by including it as a value. If we need to walk all the entries in the server, we can walk the MasterTable or one of the system indices like the ndn or updn indices described here.

MatchingRules, Comparators and Normalizers

LDAP defines 3 different kinds of matchingRules for attributeTypes: equality, ordering and substring. The equality matchingRule governs how attribute values are compared for equality. Ordering matchingRules control how attribute values are compared to see if one value is greater or less than another. The substring matchingRule manages how embedded fields in larger values can be matched for patterns. Most LDAP servers have different kinds of indices you can create for each of these kinds of matching techniques. Sometimes there are more: for example, some servers have an approximate match index, which uses the Soundex algorithm or a derivative to determine matches.

ApacheDS has only one type of index and makes some trade-offs in space, time and simplicity while applying some common sense. ApacheDS functionally models a matchingRule as a Java Comparator coupled with a Normalizer. The Normalizer applies the transformation required to produce a canonical form. The Comparator performs checks to see if values are equal to, greater than or less than one another. ApacheDS Index objects use the equality matchingRule's Normalizer to create index Tuples and the ordering matchingRule's Comparator (only if defined) to determine insertion order into Index LUT BTrees. If no ordering matchingRule exists for the attributeType, then the Comparator for the equality matchingRule is used instead.

If a search uses an equality assertion on the indexed attribute, lookups return the answers we need. If a search filter uses a substring assertion, then depending on the pattern the substring matchingRules are used (only if defined for the attributeType), with a bit more processing using regular expressions to match normalized values on index walks. When > or < filter operators are used, and an ordering matchingRule is defined for the attributeType, then that matchingRule's Comparator is used to compare values on Index walks with Cursors. Also, because of these choices, ApacheDS indices are bidirectional, and so ApacheDS pays a price on write operations to maintain both forward and reverse LUTs.
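The regular-expression approach to substring assertions can be sketched as follows. This is a hypothetical illustration, not the ApacheDS implementation: an LDAP substring pattern like 'ba*' is translated into a regex and tested against the already-normalized keys encountered while walking the forward LUT.

```java
import java.util.*;
import java.util.regex.*;

public class SubstringWalkSketch {
    // Translate an LDAP substring assertion (e.g. 'ba*', '*a*r') into a
    // regex: literal fragments are quoted, each '*' becomes '.*'.
    static Pattern toPattern(String ldapPattern) {
        StringBuilder sb = new StringBuilder();
        for (String part : ldapPattern.split("\\*", -1)) {
            if (sb.length() > 0) sb.append(".*");
            sb.append(Pattern.quote(part));
        }
        return Pattern.compile(sb.toString());
    }

    // Walk the sorted keys of a forward LUT and keep those matching the
    // pattern; keys are assumed to be normalized already, so no per-key
    // normalization cost is paid here.
    static SortedSet<String> matchingKeys(NavigableSet<String> keys, String ldapPattern) {
        Pattern p = toPattern(ldapPattern);
        SortedSet<String> out = new TreeSet<>();
        for (String key : keys) {
            if (p.matcher(key).matches()) out.add(key);
        }
        return out;
    }

    public static void main(String[] args) {
        NavigableSet<String> keys = new TreeSet<>(Set.of("bar", "baz", "bazaar", "foo"));
        System.out.println(matchingKeys(keys, "ba*")); // keys matching (foo=ba*)
    }
}
```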

ApacheDS does not support approximate match, since this kind of matching is language sensitive and just a pile of horse poop. For this reason approximate match does not work with internationalization. No server has yet satisfied my search when I've tried to phonetically find names using approximate match.

So ApacheDS made some choices to stay simple. We applied a pattern of thinking where trade-offs were made to keep the most common operations fastest while the least common operations may suffer slightly. Others, like approximate match, are essentially ignored: ApacheDS processes approximate match based filter terms as if they were equality assertions.

JDBM Base Index Implementation

The JdbmIndex uses two JdbmTables, each backed by its own JDBM BTree in the same file, managed by a JDBM RecordManager for the file. These files end with the .db file extension and use the first alias name of the attributeType for the file name. The JdbmIndex leverages these JdbmTables for performing forward and reverse lookup operations to assert values, as well as for generating Cursors over the Tuples of these Tables.

JdbmIndex instances require schema information to properly initialize their LUTs with the right Comparators. JdbmIndex operations that write Tuples into these two JdbmTables normalize values before inserting them. Attribute values inserted into these JdbmTables are normalized for speed and efficiency. If these values were not normalized, the Cursors walking these JdbmTables and lookup operations would need to normalize and then compare. Search operations are heavily dependent on Index Cursors and lookup operations, and normalization is an expensive process. Why repeat normalization and pay on each search operation when you can pay the price once on write operations? After all, this is LDAP, and LDAP is read optimized.

Possible Repeat Normalization Inefficiency

JdbmIndex tries to protect against lookups without normalization by normalizing lookup parameters before consulting the two JdbmTable based LUTs. It also maintains a cache of normalizations it has performed recently. The cache in particular may be a waste, since Normalizers also try to cache recent normalization operations, and these Normalizers are reused by the system. So we may be needlessly double caching and wasting memory.

Another possible problem is that JdbmIndex instances may not need to normalize attribute values at all. There were several discussions about eagerly normalizing attribute values in entries, especially with the move to the ServerEntry API. If this happens, then we'll potentially have three copies of the same normalization result in memory during an operation. Furthermore, this removes the need to cache at all. A normalization accounting review is required to make sure we're not wasting extra cycles and memory while defining the correct barriers to protect against non-normalized attribute values.

Eager Normalization Optimizations

When partitions begin to expose access to Index information, the server is now privy to information it can use to efficiently conduct eager attribute value normalization in ServerEntry objects. At the normalization barrier, which will most likely be in the NormalizationInterceptor, the server can determine the target partition the entry is associated with. Knowing the target partition, the server can check which attributes are indexed by the partition and pre-normalize them. Note that the user provided value is never lost; the ServerEntry stores both normalized and user provided forms. This way value normalization and caching requirements are lifted from partition indices.