From commits-return-32008-apmail-directory-commits-archive=directory.apache.org@directory.apache.org Fri Jun 24 13:12:22 2011 Return-Path: X-Original-To: apmail-directory-commits-archive@www.apache.org Delivered-To: apmail-directory-commits-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id E244C63D9 for ; Fri, 24 Jun 2011 13:12:22 +0000 (UTC) Received: (qmail 57959 invoked by uid 500); 24 Jun 2011 13:12:22 -0000 Delivered-To: apmail-directory-commits-archive@directory.apache.org Received: (qmail 57896 invoked by uid 500); 24 Jun 2011 13:12:22 -0000 Mailing-List: contact commits-help@directory.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@directory.apache.org Delivered-To: mailing list commits@directory.apache.org Received: (qmail 57889 invoked by uid 99); 24 Jun 2011 13:12:22 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 24 Jun 2011 13:12:22 +0000 X-ASF-Spam-Status: No, hits=-1994.3 required=5.0 tests=ALL_TRUSTED,HTML_FONT_LOW_CONTRAST,HTML_MESSAGE,MIME_HTML_ONLY X-Spam-Check-By: apache.org Received: from [140.211.11.22] (HELO thor.apache.org) (140.211.11.22) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 24 Jun 2011 13:12:20 +0000 Received: from thor (localhost [127.0.0.1]) by thor.apache.org (8.13.8+Sun/8.13.8) with ESMTP id p5ODC0lg021058 for ; Fri, 24 Jun 2011 13:12:00 GMT Date: Fri, 24 Jun 2011 09:12:00 -0400 (EDT) From: confluence@apache.org To: commits@directory.apache.org Message-ID: <7880400.4689.1308921120050.JavaMail.confluence@thor> Subject: [CONF] Apache Directory Server v1.5 > Index and IndexEntry MIME-Version: 1.0 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Auto-Submitted: auto-generated

Index and IndexEntry

Page edited by Emmanuel L=C3=A9charny


Changes (1)

=20 =20
=20 =20
...
The *system* indexes are thos= e we create at startup. An administrator can modify some of their character= istics, but basically, they will always exist. The list of existing System = index is given in the following table :

|| Index || Key || duplic= ates | || Description ||
| ObjectClass | String | Allo= wed | Associate an ObjectClass to a list of entries ID |
| Rdn | Parent= IdAndRdn | Forbidden | Stores the relation between a child entry and its pa= rent in the DIT. The key is a compound element built using the entry's = RDN and the entry ID. |
...

Full Content

Index Abstr= action

An Index is a bidirectional LUT (Look-Ups Table) which sorts keys (forwa= rd index) and values (reverse index). It can be built on any physical attr= ibute or a computed characteristic of an entry.

Index Tuples contain a value for the attribute and the entry identifier = which represents a pointer into the MasterTable. This way lookups into the= master table for entries with a specific attribute value are fast. For ex= ample, when conducting a search for all entries where the foo attrib= ute is equal to the String 'bar', the search engine looks first to see if a= n index is available for foo. If an Index on foo exists then= it is used to quickly list the identifiers of all entries with attribute <= b>foo equal to 'bar'. This lookup occurs in almost constant time and t= he Index can be walked to only return those identifiers over the Tuples wit= h key set to 'bar'. The following schema expose the relations between all t= hose elements :

Without an Index on foo, the following sequence of events must oc= cur on each entry in the master table:

    =09
  1. the entry must be read from the MasterTable as a byte[]
  2. =09
  3. the entry must then be deserialized
  4. =09
  5. the foo attribute must be extracted
  6. =09
  7. each value of the foo attribute must be compared to see if it= equals 'bar'

As you can see this is an expensive proposition. With very large directo= ries search may take a very long time or may return without the intended en= tries because time limits are reached. Indices are critical for increasing = search performance.

3D""Even if we have all the entries in the c= ache, this is still an O(n) operation. We just spare the disk acces= s and the deserialization phase.
LDAP amortizes the expense of building indice= s at add, modify and delete times in return for faster search performance. = LDAP servers for this reason are highly indexed and are better suited for = read intensive applications.

Index F= orward and Reverse tables

At the moment, an Index is implemented using 2 tables :

    =09
  • The Forward table, which maps the key to a list of entry's ID in the= master table
  • =09
  • The Reverse table, which maps the entry's ID to the keys

The forward table is used to retrieve the entry from the MasterTable, wh= en we only have a key. The key can be associate with more than one entry, s= o we will get back a cursor on the associated entry's ID when we do a looku= p.

The reverse index is used when we do a search on a combination of attrib= utes : (&(cn=3Dtest)(sn=3Dacme)). If we suppose that the selected prima= ry index will be the CN index, then we will pull a list of entry's I= D. Now, using the SN reverse index, we can avoid pulling each entry = from the MasterTable, simply because we can check in the SN reverse = table that the selected IDs are present or not.

3D""This might be superfluous : it's enough = to pull the SN entry's ID for the acme value, and do an inter= section with the first set of entry's ID. The biggest advantage would be th= at we will speed up the modification operations if we were to remove those = reverse tables.

= Index Cursor's Get IndexEntry Objects

Index Cursors, unlike Tables that return Tuples, return IndexEntry objec= ts instead on get() operations after positioning the Cursor. Indices always= use a Long identifier into the master table for forward LUT keys and rever= se LUT values. So instead of returning Cursors over two kinds of Tuples ove= r these LUTs, their Cursors return IndexEntry objects which swap Tuple key/= value pairs where appropriate.

For example, let's presume the foo index contains String values f= or the foo attributeType. The forward LUT will map Strings representing fo= o attribute values like 'bar' to Long identifiers pointing to entries in th= e MasterTable. The reverse LUT will map Long identifiers pointing to entri= es in the MasterTable to Strings representing values of the foo attribute i= n that entry. The Tuples of these LUT's look like so:

    =09
  • Forward LUT =3D> Tuple<String,Long>
  • =09
  • Reverse LUT =3D> Tuple<Long,String>

Index Cursors walk these LUT's using Table Cursors that return Tuples. R= egardless of whether the forward LUT or the reverse LUT is used, Tuples are= transduced into IndexEntry objects which always return the Long identifier= into the master table on calls to getId() and the attribute value String (= or whatever - really a generic type - String is used here for the foo attri= bute example) on calls to getValue(). This way Index Cursors do not care i= f the underlying Tuple Cursor walks the forward LUT or the reverse LUT.

Index used in t= he server

We used two distinct kind of index in the server :

    =09
  • system indexes
  • =09
  • user indexes

The system indexes are those we create at startup. An administrat= or can modify some of their characteristics, but basically, they will alway= s exist. The list of existing System index is given in the following table = :

Index Key duplicates Description
ObjectClass String Allowed Associate an ObjectClass to a list of entries I= D
Rdn ParentIdAndRdn Forbidden Stores the relation between a child entry and i= ts parent in the DIT. The key is a compound element built using the entry's= RDN and the entry ID.
oneLevel Long (ID) Allowed An index storing a relation between an entry an= d all its children
subLevel Long (ID) Allowed An index storing a relation between an entry an= d all its children, recursively
presence String Allowed An index storing a relation between an Attribut= eType name and the list of entry's ID having this AttributeType
alias String Forbidden Stores the relation between an Alias and the ta= rgetted entry ID
oneAlias Long (ID) Allowed Stores the list of aliases under an alias
subAlias Long (ID) Allowed Stores the list of aliases under an alias with = no limit in depth
entryUUID String (UUID) Forbidden Stores the relation between an UUID and the ass= ociated entry ID
entryCSN String (CSN) Forbidden Stores the relation between a CSN and the assoc= iated entry ID
3D""The Alias, oneAlias and subAlias indexes= are probably useless.

Example Directory Information Tree (DIT)

We'll use this example DIT for the purpose of our discussions around ind= ices.

Ind= ex Normalization, and Matching

An Index can be built on any attribute. For example we can build indices= on the ou, objectClass, and cn attributes in our example. Here's what the= forward lookup tables of these indices would contain:

Here we have an example tree where the nodes are labeled with their user= provided relative distinguished names at the point of creation. The numbe= rs in the nodes represent the order of creation and the unique identifiers = of these entries representing the Tuple keys in the master table. We'll use= this example and modified versions of it throughout this document. So let'= s take a look at how the MasterTable would look but presume the distinguish= ed names to the right represent the full serialized entry as an array of by= tes.

3D""Remember an Index is bidirectional! Thi= s means we can lookup all the entry identifiers representing the set of ent= ries with value 'bar' for the attribute foo using the forward LUT in= the Index. We can also use the reverse LUT to lookup an id, to see all th= e values of attribute foo an entry may contain.

OID Alias
2.5.6.0 top
2.5.6.4 organization
2.5.6.5 organizationalUnit
2.5.6.6 person
2.5.6.7 organizationalPerson

The index keys contain attribute values, and the values of the index con= tain entry identifiers which refer back into the master table. Notice the v= alues of attributes in these indices have been transformed. This transform= ation is referred to as normalization or canonicalization. This means the = value has been reduced to a from where variations like for example white sp= ace and case, or aliases have been removed. Other forms of normalization be= sides this textual normalization may exist. All variations of an attribute= 's value considered to be equal are reduce down to this common unambiguous = form.

The transformation is governed by the equality match matchingRules assoc= iated with the attributeType. You'll notice for example the ou and <= b>cn attribute matchingRules result in transformation that remove white= space variance while preserving tokenization: meaning we take out whitespac= e by reducing multiple whitespace characters into a single space. Also the= case variance is also eliminated: all values are reduced to lowercase. No= te this is possible because both ou and cn are case insensiti= ve since they use a matchingRule which ignores case: their equality matchin= gRule is caseIgnoreMatch.

Attribute matchingRules may not all be this trivial. For example the at= tribute may have a complex binary syntax and there might be some intricate = operation required to reduce the value into a canonical representation. At= other times normalization may involve transforming ambiguous representatio= ns into a unique identifier that can never have variance. The objectCla= ss Index is a good example of this. The objectClass attribute lists th= e objectClasses associated with an entry. Unfortunately LDAP allows alias = names for objectClasses as well as other entities like attributeTypes (i.e.= cn verses commonName). These aliases are human readable representations b= ut they can have case variance as well and any alias can be used. Most sch= ema entities in LDAP have an Object IDentifier or OID. OID's are be defaul= t canonical. As you can see this is the canonical representation used in t= he objectClass index. Incidentally, users are even allowed to plugi= n OID's for the value of the objectClass attribute if they wanted to.

Notice Index Tuple keys are all sorted and so are the Tuple values of th= e same key. Indices use matchingRules as well to order both keys and value= s based on matchingRules in both the forward and reverse LUTs.

If you look at the objectClass index you'll s= ee that the OID (2.5.6.0) for the 'top' objectClass is not included in the = index. This is because every entry will contain 'top' so there's no point = to bloating the index including it as a value. If we need to walk all the e= ntries in the server we can walk the MasterTable or one of the system indic= es like the ndn or updn indices described h= ere.

MatchingRules, Comparators and Normalizers

LDAP defines 3 different kinds of matchingRules for attributeTypes: equa= lity, ordering and substring. The equality matchingRule governs how attrib= ute values are compared for equality. Ordering matchingRules control how a= ttribute values are compared to see if one value is greater or less than an= other. The substring matching rule manages how embedded fields in larger v= alues can be matched for patterns. Most LDAP servers have different kinds = of indices you can create for each of these kinds of matching techniques. = Sometimes there's more for example some servers have an approximate match i= ndex, which uses the soundex algorithm or a derivative to determine matches= .

ApacheDS has only one type of index and makes some trade offs in space, = time and simplicity while applying some common sense. ApacheDS functionall= y models a matchingRule as a Java Comparator coupled with a Normalizer. Th= e Normalizer applies the transformation required to produce a canonical for= m. The Comparator performs checks to see if values are equal, greater or l= ess than one another. ApacheDS Index objects use the equality matchingRule= 's Normalizer to create index Tuples and the ordering matchingRule's Compar= ator (only if defined) to determine insertion order into Index LUT BTrees. = If there exists no ordering matchingRule for the attributeType then the Co= mparator for the equality matchingRule is used instead.

If a search uses an equality assertion with the index attribute lookups = return the answers we need. If a search filter uses a substring assertion t= hen depending on the pattern the substring matchingRules are used (only if = defined for the attributeType) with a bit more processing using regular exp= ressions to match normalized values on index walks. When > or < filte= r operators are used, and a ordering matchingRule is defined for the attrib= uteType then that matchingRule's comparator is used to compare values on In= dex walks with Cursors. Also because of these choices ApacheDS indices are = bi-directional and so ApacheDS pays a price on write operations to maintain= both forward and reverse LUTs.

ApacheDS does not support approximate match since this kind of matching = is language sensitive and just a pile of horse poop. For this reason appro= ximate match does not work with internationalization. No server has yet to = satisfy my search when I've tried to phonetically find names using approxim= ate match.

So ApacheDS made some choices to stay simple. We applied a pattern of t= hinking where trade offs were made to keep the most common operations faste= st while the least common operations may suffer slightly. Others like appr= oximate match are essentially ignored. ApacheDS processes approximate matc= h based filter terms as if they were equality assertions.

JDBM Bas= e Index Implementation

The JdbmIndex uses two JdbmTables each backed by it's own JDBM BTree in = the same file managed by a JDBM RecordManager for the file. These files en= d with the .db file extension and use the first alias name of the attribute= Type for the file name. The JdbmIndex leverages these JdbmTables for prefo= rming forward reverse lookup operations to assert values as well as for gen= erating Cursors over the Tuples of these Tables.

JdbmIndex instances require schema information to properly initialize th= eir LUTs with the right Comparators. JdbmIndex operations that write Tuple= s into these two JdbmTables normalized values before inserting them. Attri= bute values inserted into these JdbmTables are normalized for speed and eff= iciency. If these values were not normalized, the Cursors walking these Jd= bmTables and lookup operations would need to normalize then compare. Searc= h operations are heavily dependent on Index Cursors and lookup operations a= nd normalization is an expensive process. Why repeat normalization and pay= on each search operation when you can pay the price once on write operatio= ns. After all, this is LDAP, and LDAP is read optimized.

= Possible Repeat Normalization Inefficiency

JdbmIndex tries to protect against lookups without normalization by norm= alizing lookup parameters before consulting the two JdbmTable based LUTs. = It also maintains a cache of normalizations it has performed recently. The= cache in particular may be a waste since Normalizers also try to cache rec= ent normalization operations, and these normalizers are reused by the syste= m. So we may be needlessly double caching and wasting memory.

Another possible problem is that JdbmIndex instance may not need to norm= alize attribute values. There were several discussions about eagerly norma= lizing attribute values in entries especially with the move to the ServerEn= try API. If this happens then we'll potentially have 3 copies of the same n= ormalization result in memory during an operation. Furthermore this remove= s the need to cache at all. A normalization accounting review is required = to make sure we're not wasting extra cycles and memory while defining the c= orrect barriers to protect against non-normalized attribute values.

Eager Normalization Optimizations
When partitions begin to expose access to Index information, the server no= w is privy to information it can use to efficiently conduct eager attribute= value normalization in ServerEntry objects. At the normalization barrier,= which will most likely be in the NormalizationInterceptor, the server can = determine the target partition the entry is associated with. Knowing the t= arget partition, the server can check which attributes are indexed by the p= artition and pre-normalize them. Note that the user provided value is neve= r lost, the ServerEntry stores both normalized and user provided forms. Thi= s way value normalization and caching requirements are lifted from partitio= n indices.